[OpenSIPS-Users] OpenSIPS fix_route_dialog crashes

Bogdan-Andrei Iancu bogdan at opensips.org
Mon Aug 1 16:57:32 CEST 2016


Hi Ben,

According to the BT, the crash is in a pkg_malloc() call:
                     route = pkg_malloc(size);
Please double-check this against the gdb info.

If so, this indicates memory corruption, and we have 2 options here:
     - you compile with the memory debugger (see my previous emails)
     - you provide step-by-step instructions on how to reproduce this crash.
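
A crash inside an allocator call usually means an earlier write ran past the end of some other block, and the allocator only trips over the damage later. As a rough illustration of how a memory debugger narrows this down, here is a minimal Python sketch of the guard-byte idea. It is a toy model, not the OpenSIPS allocator: the ToyArena class, the GUARD pattern and all sizes are made up for the example.

```python
GUARD = b"\xde\xad\xbe\xef"  # hypothetical check pattern, not OpenSIPS's real one


class ToyArena:
    """Toy allocator that brackets every block with guard bytes."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.top = 0
        self.blocks = {}  # payload offset -> payload size

    def alloc(self, size):
        total = len(GUARD) + size + len(GUARD)
        if self.top + total > len(self.mem):
            raise MemoryError("toy arena exhausted")
        payload = self.top + len(GUARD)
        # Lay out: [head guard][payload][tail guard]
        self.mem[self.top:payload] = GUARD
        self.mem[payload + size:payload + size + len(GUARD)] = GUARD
        self.blocks[payload] = size
        self.top += total
        return payload

    def write(self, offset, data):
        # Deliberately unchecked, like a raw memcpy() in C.
        self.mem[offset:offset + len(data)] = data

    def check(self):
        """Return the payload offsets whose guards were overwritten."""
        damaged = []
        for payload, size in self.blocks.items():
            head = bytes(self.mem[payload - len(GUARD):payload])
            tail = bytes(self.mem[payload + size:payload + size + len(GUARD)])
            if head != GUARD or tail != GUARD:
                damaged.append(payload)
        return damaged


arena = ToyArena(64)
a = arena.alloc(8)
b = arena.alloc(8)
arena.write(a, b"12345678")    # fits exactly: guards stay intact
clean = arena.check()
arena.write(a, b"123456789")   # one byte too long: clobbers a's tail guard
damaged = arena.check()
```

In this sketch the check reports the block that was overrun rather than the later allocation that would otherwise crash, which is the kind of information a debug-enabled build is meant to provide.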

Thanks and Regards,

Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
http://www.opensips-solutions.com

On 29.07.2016 15:54, Newlin, Ben wrote:
>
> This is 1.11.6, running on CentOS 7.
>
> Ben Newlin
>
> *From: *<users-bounces at lists.opensips.org> on behalf of Bogdan-Andrei 
> Iancu <bogdan at opensips.org>
> *Reply-To: *OpenSIPS users mailling list <users at lists.opensips.org>
> *Date: *Friday, July 29, 2016 at 8:50 AM
> *To: *"Newlin, Ben" <Ben.Newlin at inin.com>, OpenSIPS users mailling 
> list <users at lists.opensips.org>
> *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes
>
> Ben,
>
> What OpenSIPS version is this (the crashing one)? 1.11 or 2.1?
>
> Regards,
>
> Bogdan-Andrei Iancu
> OpenSIPS Founder and Developer
> http://www.opensips-solutions.com
>
> On 27.07.2016 19:02, Newlin, Ben wrote:
>
>     I have identified that these crashes are occurring when the far
>     end system is not returning the Record-Route headers in the 200 OK
>     response. The headers are present in the 180 response, but not the
>     200 OK. I have reproduced the scenario using SIPp and captured a
>     SIP trace: http://pastebin.com/ckKk3EhY
>
>     The crash occurs on receipt of the ACK request and attempt to
>     match the dialog.
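
For anyone trying to spot the same condition in a capture: RFC 3261 expects the UAS to copy the Record-Route headers into the 2xx response, since that is the response the caller builds its route set from. A minimal Python sketch of the comparison follows. The two messages are made-up examples using documentation addresses, not the ones from the trace, and the helper names are hypothetical. The parsing is deliberately simplified (no folded headers, commas assumed to separate values only).

```python
def record_routes(message):
    """Return the Record-Route header values of a raw SIP message, in order."""
    head = message.split("\r\n\r\n", 1)[0]
    values = []
    for line in head.split("\r\n")[1:]:  # skip the start line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "record-route":
            # Simplification: treat every comma as a value separator.
            values.extend(v.strip() for v in value.split(","))
    return values


def lost_route_set(provisional, final):
    """True when the provisional reply had Record-Route but the final one has none."""
    return bool(record_routes(provisional)) and not record_routes(final)


# Hypothetical replies, for illustration only.
RINGING = (
    "SIP/2.0 180 Ringing\r\n"
    "Record-Route: <sip:192.0.2.10;lr;r2=on>\r\n"
    "Record-Route: <sip:198.51.100.10;lr;r2=on>\r\n"
    "Call-ID: example-call\r\n"
    "\r\n"
)
OK = (
    "SIP/2.0 200 OK\r\n"
    "Call-ID: example-call\r\n"
    "\r\n"
)

ringing_routes = record_routes(RINGING)
ok_routes = record_routes(OK)
mismatch = lost_route_set(RINGING, OK)
```

Here `mismatch` flags the situation described above, where the 180 carries the route set and the 200 OK does not.
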
>
>     I also captured a BT for this scenario, in case anything
>     specific in the trace makes the issue easier to find:
>     http://pastebin.com/cM3FhPiw
>
>     I am working with the other system to try to fix their behavior.
>
>     Ideally the Record-Route headers from previous replies could be
>     used in this case to allow the call to succeed, but I don’t know
>     if that is possible.
>
>     Thanks,
>
>     Ben Newlin
>
>     *From: *"Newlin, Ben" <Ben.Newlin at inin.com>
>     *Date: *Wednesday, July 27, 2016 at 9:44 AM
>     *To: *Bogdan-Andrei Iancu <bogdan at opensips.org>, OpenSIPS users
>     mailling list <users at lists.opensips.org>
>     *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes
>
>     Bogdan,
>
>     This is a different scenario than the other you responded to. As I
>     said, we have two types of servers that work together. One is a
>     load-balancer and runs as a proxy. It uses double Record-Route
>     because it sends messages between public and private networks.
>     Then we have our other servers using TH which receive those
>     requests. We are not using TH and RR on the same server (although
>     I would like to).
>
>     If validate_dialog() and fix_route_dialog() (and possibly
>     loose_route()) should not be called when using TH, I believe the
>     documentation should reference that. It states that match_dialog()
>     must be used with TH, but does not indicate that the other
>     functions should not be used or that the functionality won’t work.
>     There is also no documentation of the incompatibility between RR
>     and TH.
>
>     Either way, I ran a test where I removed all calls to
>     loose_route(), validate_dialog(), and fix_route_dialog() from my
>     script. The crash still occurred and the BT still pointed to
>     fix_route_dialog() function. So it must be getting called from
>     within Dialog module somewhere. That BT is here:
>     http://pastebin.com/wu2X2Hxh
>
>     I collected this BT with loose_route() being called from my
>     script, but not validate_dialog() or fix_route_dialog():
>     http://pastebin.com/6V7yPaHF
>
>     This BT was collected with all three functions being called from
>     my script: http://pastebin.com/fZYYdndn
>
>     Ben Newlin
>
>     *From: *Bogdan-Andrei Iancu <bogdan at opensips.org>
>     *Date: *Wednesday, July 27, 2016 at 3:57 AM
>     *To: *OpenSIPS users mailling list <users at lists.opensips.org>,
>     "Newlin, Ben" <Ben.Newlin at inin.com>
>     *Subject: *Re: [OpenSIPS-Users] OpenSIPS fix_route_dialog crashes
>
>     Hi Ben,
>
>     First, if you use TH, it makes no sense to do Record-Routing - these
>     are 2 SIP concepts that overlap. You act either as an end-point
>     (by doing TH) or as a proxy (by doing RR).
>
>     If doing TH, it makes no sense to use validate + fix, as these
>     functions check and repair the routing information in the request
>     (like Route and Contact headers). If you do TH, this routing info
>     is actually hidden and added by OpenSIPS, so there is nothing to
>     fix and repair.
>
>     Nevertheless, this should not crash or corrupt OpenSIPS. Have you
>     managed to get a corefile?
>
>     Also if you suspect memory corruption, you can compile-in the
>     memory debugger - see
>     http://www.opensips.org/Documentation/TroubleShooting-OutOfMem .
>
>     Regards,
>
>     Bogdan-Andrei Iancu
>     OpenSIPS Founder and Developer
>     http://www.opensips-solutions.com
>
>     On 26.07.2016 23:20, Newlin, Ben wrote:
>
>         I have had 3 OpenSIPS server crashes in the last week. All
>         were due to segmentation faults. I was not able to capture
>         core dumps; I am configuring that now to catch the next crash.
>
>         My logs leading up to the crash are full of errors from
>         fix_route_dialog() complaining about invalid URIs for
>         sequential requests:
>
>         Jul 26 19:34:02 [220] ERROR:dialog:fix_route_dialog: Failed to
>         parse SIP uri
>
>         Jul 26 19:34:02 [220] ERROR:core:parse_uri: bad uri, state 0
>         parsed: <ip:1> (4) /
>         <ip:10.18.8.18:5060;ftag=gK0448f137;lr;r2=on>> (44)
>
>         Jul 26 19:11:19 [218] ERROR:dialog:fix_route_dialog: Failed to
>         parse SIP uri
>
>         Jul 26 19:11:19 [218] ERROR:core:parse_uri: bad uri, state 0
>         parsed: <b0i2> (4) /
>         <b0i2yjor;transport=udp<sip:10.18.8.17:5060;ftag=7207ce89;lr;r2=on>
>         (65)
>
>         Jul 26 17:43:19 [220] ERROR:dialog:fix_route_dialog: Failed to
>         parse SIP uri
>
>         Jul 26 17:43:19 [220] ERROR:core:parse_uri: bad uri, state 0
>         parsed: <ervi> (4) /
>         <ervice_id6fdbc70f-2438-4726-807c-0d081df4d87> (44)
>
>         Many times the “URI” displayed in the error message is
>         actually internal OpenSIPS variables, as in the last error
>         above. When they are from the SIP message, I have verified
>         that the messages themselves are correctly formatted. This
>         leads me to believe there is memory corruption occurring.
>
>         This all started when I updated my load-balancer servers to
>         use Record-Routing, specifically the “double_rr” mechanism for
>         when multiple interfaces exist. The Record-Routing is
>         occurring on different servers which have not crashed. Only
>         the servers receiving the Record-Routed messages are
>         experiencing the errors.
>
>         Here is a piece of the code processing sequential requests. I
>         am using the topology_hiding() functionality of the Dialog
>         module. Are validate_dialog() and fix_route_dialog() still
>         valid in a topology_hiding scenario?
>
>         if (t_check_trans())
>           setflag(SEQ_REQUEST);
>
>         if (has_totag())
>         {
>           loose_route();
>
>           if (match_dialog())
>           {
>             if (!validate_dialog())
>               fix_route_dialog();
>
>             if (is_method("BYE"))
>               setflag(ACC_FLAG);
>
>             setflag(SEQ_REQUEST);
>           }
>           else if (!isflagset(SEQ_REQUEST))
>           {
>             if (!is_method("ACK")) {
>               route(rlog, LV_ERROR, "check_sequential", "Sequential request not matched");
>               route(reply_error, "481", "Call Does Not Exist");
>             }
>
>             return(EXIT);
>           }
>         }
>
>         I will attempt to get core dumps of future crashes.
>
>         Thanks,
>
>         Ben Newlin
>
>         _______________________________________________
>         Users mailing list
>         Users at lists.opensips.org
>         http://lists.opensips.org/cgi-bin/mailman/listinfo/users
