[OpenSIPS-Devel] OpenSIPS Crash

Bogdan-Andrei Iancu bogdan at opensips.org
Thu Nov 15 11:08:10 EST 2018


Hi Ben,

Do you have the backtraces from more of the similar crashes? Maybe there
is a pattern there.

Regards,

Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
   http://www.opensips-solutions.com
OpenSIPS Bootcamp 2018
   http://opensips.org/training/OpenSIPS_Bootcamp_2018/

On 11/15/2018 05:01 PM, Ben Newlin wrote:
>
> Bogdan,
>
> It’s happening every few days, so it is pretty frequent. There was 
> another one yesterday but the DBG compile flags had been temporarily 
> removed for that one.
>
> We have not been able to determine a sequence to reproduce it yet.
>
> Ben Newlin
>
> *From: *Bogdan-Andrei Iancu <bogdan at opensips.org>
> *Date: *Thursday, November 15, 2018 at 7:06 AM
> *To: *Ben Newlin <Ben.Newlin at genesys.com>, OpenSIPS devel mailling 
> list <devel at lists.opensips.org>
> *Subject: *Re: [OpenSIPS-Devel] OpenSIPS Crash
>
> Hi Ben,
>
> How often does this crash happen? Are you able to reproduce it?
>
> The acc extra should work in the branch route, no problem. Out of 
> curiosity, I will try to reproduce your case (timeout -> failure route
> -> t_relay -> branch_route) to see if I can trigger the crash.
>
> Regards,
>
> Bogdan-Andrei Iancu
> OpenSIPS Founder and Developer
>    http://www.opensips-solutions.com
> OpenSIPS Bootcamp 2018
>    http://opensips.org/training/OpenSIPS_Bootcamp_2018/
>
> On 11/13/2018 07:41 PM, Ben Newlin wrote:
>
>     Bogdan,
>
>     Yes, we are setting acc_extra variables in our branch routes,
>     which are sometimes (but not always) called from failure route.
>     Are acc_extra variables not available for use in branch_routes?
>
>     We don’t currently use drop_accounting anywhere in our script. If
>     I call it before that branch_route then it will stop accounting
>     for that call, right? We need to have accounting records for the
>     call, so I’m not sure how that would resolve the issue?
>
>     Ben Newlin
>
>     *From: *Bogdan-Andrei Iancu <bogdan at opensips.org>
>     <mailto:bogdan at opensips.org>
>     *Date: *Tuesday, November 13, 2018 at 9:13 AM
>     *To: *Ben Newlin <Ben.Newlin at genesys.com>
>     <mailto:Ben.Newlin at genesys.com>, OpenSIPS devel mailling list
>     <devel at lists.opensips.org> <mailto:devel at lists.opensips.org>
>     *Subject: *Re: [OpenSIPS-Devel] OpenSIPS Crash
>
>     Hi Ben,
>
>     Thanks for the info. The crash happens when you try to set an acc
>     extra variable in a branch route (when creating a new branch via
>     failure route, on timeout).
>
>     Now, do you use the drop accounting in your script? And
>     considering the above scenario, is it possible to have the drop
>     acc before the branch route?
>
>     Regards,
>
>
>
>     Bogdan-Andrei Iancu
>
>       
>
>     OpenSIPS Founder and Developer
>
>        http://www.opensips-solutions.com
>
>     OpenSIPS Bootcamp 2018
>
>        http://opensips.org/training/OpenSIPS_Bootcamp_2018/
>
>     On 11/12/2018 08:55 PM, Ben Newlin wrote:
>
>         Bogdan,
>
>         We upgraded to 2.4.3 and the crash reproduced today. Backtrace
>         is available here: https://pastebin.com/CZxQnZdR
>
>         Ben Newlin
>
>         *From: *Bogdan-Andrei Iancu <bogdan at opensips.org>
>         <mailto:bogdan at opensips.org>
>         *Date: *Wednesday, November 7, 2018 at 6:18 AM
>         *To: *OpenSIPS devel mailling list <devel at lists.opensips.org>
>         <mailto:devel at lists.opensips.org>, Ben Newlin
>         <Ben.Newlin at genesys.com> <mailto:Ben.Newlin at genesys.com>
>         *Subject: *Re: [OpenSIPS-Devel] OpenSIPS Crash
>
>         Hi Ben,
>
>         The BT indicates a double free for the accounting context -
>         and I noticed you use 2.4.1 version. And yes, there was an
>         issue related to acc context, issue that was fixed starting
>         2.4.2. So, could you upgrade to the latest 2.4 and see if the
>         crash still happens ? As I think the fix is already there.
>
>         Regards,
>
>
>
>         Bogdan-Andrei Iancu
>
>           
>
>         OpenSIPS Founder and Developer
>
>            http://www.opensips-solutions.com
>
>         OpenSIPS Bootcamp 2018
>
>            http://opensips.org/training/OpenSIPS_Bootcamp_2018/
>
>         On 11/06/2018 11:13 PM, Bogdan-Andrei Iancu wrote:
>
>             Jackpot - you got it right!! I will start digging into
>             the trace, but please keep the corefile, I might need it
>             later.
>
>             Thanks and regards,
>
>
>
>             Bogdan-Andrei Iancu
>
>               
>
>             OpenSIPS Founder and Developer
>
>                http://www.opensips-solutions.com
>
>             OpenSIPS Bootcamp 2018
>
>                http://opensips.org/training/OpenSIPS_Bootcamp_2018/
>
>             On 11/06/2018 10:24 PM, Ben Newlin wrote:
>
>                 Bogdan,
>
>                 I have reproduced this crash and verified this time
>                 that the flags were set.
>
>                 $ opensips -V
>
>                 version: opensips 2.4.1 (x86_64/linux)
>
>                 flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP,
>                 PKG_MALLOC, QM_MALLOC, DBG_MALLOC,
>                 FAST_LOCK-ADAPTIVE_WAIT, DBG_LOCK
>
>                 ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144,
>                 MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
>
>                 poll method support: poll, epoll, sigio_rt, select.
>
>                 git revision: 5d042cffc
>
>                 main.c compiled on 23:38:55 Nov 5 2018 with gcc 7
>
>                 Backtrace is available here:
>                 https://pastebin.com/KTQjkCwq
>
>                 Ben Newlin
>
>                 *From: *Bogdan-Andrei Iancu <bogdan at opensips.org>
>                 <mailto:bogdan at opensips.org>
>                 *Date: *Thursday, November 1, 2018 at 1:19 PM
>                 *To: *Ben Newlin <Ben.Newlin at genesys.com>
>                 <mailto:Ben.Newlin at genesys.com>, OpenSIPS devel
>                 mailling list <devel at lists.opensips.org>
>                 <mailto:devel at lists.opensips.org>
>                 *Subject: *Re: [OpenSIPS-Devel] OpenSIPS Crash
>
>                 Hi Ben,
>
>                 According to the backtrace, the memory debugger was
>                 not activated. Do an "opensips -V" to check the
>                 resulting compile flags - do you see the DBG_MALLOC
>                 and QM_MALLOC ?
>
>                 Regards,
>
>
>
>
>                 Bogdan-Andrei Iancu
>
>                   
>
>                 OpenSIPS Founder and Developer
>
>                    http://www.opensips-solutions.com
>
>                 OpenSIPS Bootcamp 2018
>
>                    http://opensips.org/training/OpenSIPS_Bootcamp_2018/
>
>                 On 10/31/2018 05:04 PM, Ben Newlin wrote:
>
>                     Bogdan,
>
>                     I was able to compile with those options and the
>                     crash has occurred again. Backtrace is here:
>                     https://pastebin.com/dezi9xUU
>
>                     Even though I had `memdump=1` set in my script,
>                     there was no extra memory debugging information in
>                     the logs prior to or at the time of the crash. I’m
>                     not sure if that is expected or not.
>
>                     Ben Newlin
>
>                     *From: *Bogdan-Andrei Iancu <bogdan at opensips.org>
>                     <mailto:bogdan at opensips.org>
>                     *Date: *Monday, October 29, 2018 at 8:11 AM
>                     *To: *Ben Newlin <Ben.Newlin at genesys.com>
>                     <mailto:Ben.Newlin at genesys.com>, OpenSIPS devel
>                     mailling list <devel at lists.opensips.org>
>                     <mailto:devel at lists.opensips.org>
>                     *Subject: *Re: [OpenSIPS-Devel] OpenSIPS Crash
>
>                     Hi Ben,
>
>                     You can change the compile flags via the
>                     Makefile.conf file - the menuconfig is also
>                     updating that file. So during your build you can
>                     simply push a pre-modified Makefile.conf file with
>                     the options needed for memory debugging.
>
>                     Regards,
>
>
>
>
>
>                     Bogdan-Andrei Iancu
>
>                       
>
>                     OpenSIPS Founder and Developer
>
>                        http://www.opensips-solutions.com
>
>                     OpenSIPS Bootcamp 2018
>
>                        http://opensips.org/training/OpenSIPS_Bootcamp_2018/
>
>                     On 10/26/2018 05:14 PM, Ben Newlin wrote:
>
>                         Bogdan,
>
>                         Unfortunately, we have run into a similar
>                         issue before. Our build system is completely
>                         automated and there is no way to inject the
>                         `make menuconfig` interactive step into that
>                         process. If I were to be testing this locally
>                         I might be able to work something out, but I
>                         could never get such a build into our testing
>                         environment which is where the crashes are
>                         occurring.
>
>                         Do you have instructions for enabling memory
>                         debugging that do not require using the
>                         interactive TUI tool? What does the menuconfig
>                         program do when these options are selected?
>                         Are there some defines or other settings we
>                         can change ourselves and bypass menuconfig?
>
>                         Ben Newlin
>
>                         *From: *Bogdan-Andrei Iancu
>                         <bogdan at opensips.org> <mailto:bogdan at opensips.org>
>                         *Date: *Friday, October 26, 2018 at 4:59 AM
>                         *To: *OpenSIPS devel mailling list
>                         <devel at lists.opensips.org>
>                         <mailto:devel at lists.opensips.org>, Ben Newlin
>                         <Ben.Newlin at genesys.com>
>                         <mailto:Ben.Newlin at genesys.com>
>                         *Subject: *Re: [OpenSIPS-Devel] OpenSIPS Crash
>
>                         Hi Ben,
>
>                         All the BTs point to crashes while doing
>                         memory ops. I suspect a memory corruption that
>                         randomly triggers crashes in different parts
>                         of the code.
>
>                         Could you try to re-compile with memory
>                         debugging support ? See
>                         http://www.opensips.org/Documentation/TroubleShooting-OutOfMem,
>                         the "How to handle it" section.
>
>                         Regards,
>
>
>
>
>
>
>                         Bogdan-Andrei Iancu
>
>                           
>
>                         OpenSIPS Founder and Developer
>
>                            http://www.opensips-solutions.com
>
>                         OpenSIPS Bootcamp 2018
>
>                            http://opensips.org/training/OpenSIPS_Bootcamp_2018/
>
>                         On 10/24/2018 04:28 AM, Ben Newlin wrote:
>
>                             We have had 2 more crashes today.
>
>                             Crash 2: https://pastebin.com/rMruBQcZ
>
>                             This crash appears to have occurred while
>                             processing an initial INVITE request. I
>                             could not see anything unusual about the
>                             request. I cannot tell if this crash is
>                             related to the others.
>
>                             Crash 3: https://pastebin.com/Gmk1m4NT
>
>                             This crash follows the pattern of the
>                             original crash I reported.
>
>                             Ben Newlin
>
>                             *From: *Devel
>                             <devel-bounces at lists.opensips.org>
>                             <mailto:devel-bounces at lists.opensips.org>
>                             on behalf of Ben Newlin
>                             <Ben.Newlin at genesys.com>
>                             <mailto:Ben.Newlin at genesys.com>
>                             *Reply-To: *OpenSIPS devel mailling list
>                             <devel at lists.opensips.org>
>                             <mailto:devel at lists.opensips.org>
>                             *Date: *Monday, October 22, 2018 at 4:45 PM
>                             *To: *OpenSIPS devel mailling list
>                             <devel at lists.opensips.org>
>                             <mailto:devel at lists.opensips.org>
>                             *Subject: *Re: [OpenSIPS-Devel] OpenSIPS Crash
>
>                             Here is a better trace of the call:
>                             https://pastebin.com/gWpQR8E7
>
>                             Ben Newlin
>
>                             *From: *Ben Newlin
>                             <Ben.Newlin at genesys.com>
>                             <mailto:Ben.Newlin at genesys.com>
>                             *Date: *Monday, October 22, 2018 at 4:34 PM
>                             *To: *OpenSIPS devel mailling list
>                             <devel at lists.opensips.org>
>                             <mailto:devel at lists.opensips.org>
>                             *Subject: *OpenSIPS Crash
>
>                             Hello,
>
>                             We have been having sporadic crashes and I
>                             was recently able to recover a core dump
>                             for one. I have uploaded it here:
>                             https://pastebin.com/ABktcYcH
>
>                             I picked out a Call-ID from the crash data
>                             and took a look in our tracing. I have
>                             uploaded it here:
>                             https://pastebin.com/ZEzUUKZ5
>
>                             It appears that a downstream server was
>                             extremely lagged and failed to respond to
>                             an INVITE. We sent the INVITE to another
>                             server and the call was connected, but
>                             then eventually the original server
>                             “caught up” and sent a burst of 200 OK
>                             responses. The crash seems to have
>                             occurred processing the ACK to one of
>                             these responses.
>
>                             Ben Newlin
>
>
>                             _______________________________________________
>
>                             Devel mailing list
>
>                             Devel at lists.opensips.org
>                             <mailto:Devel at lists.opensips.org>
>
>                             http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
>
