[OpenSIPS-Devel] Erroneous Retransmissions

Ben Newlin Ben.Newlin at genesys.com
Tue Jul 2 20:17:58 EDT 2019


Hello,

This issue has not gotten a lot of response, but after upgrading to the latest 2.4 branch, we are still seeing cases where OpenSIPS retransmits a message even though a response has been received and processed already. In this case the retransmission occurred across a second boundary, so the timestamps in OpenSIPS' logs clearly show the retransmission was after successful processing of the first reply.

Latest example here: https://pastebin.com/9R47gN4W

In this example, you can see process 333 receives a REFER message at 19:28:51 and forwards it on to the next server. Following that process 331 receives a 405 response, also at 19:28:51. The response is processed and relayed back to the originator. But then at 19:28:52, the retransmission timer kicks anyway and process 327 retransmits the REFER. This causes another 405 response which is received by process 330 and relayed again.

I have logs, packet captures, and anything else necessary to debug this issue. I can reproduce it very easily.

Ben Newlin 

On 5/20/19, 8:50 AM, "Devel on behalf of Ben Newlin" <devel-bounces at lists.opensips.org on behalf of Ben.Newlin at genesys.com> wrote:

    Razvan,
    
    I realize the microsecond stamp is not on each line, but I don't think that means this trace is not proving the issue. You can clearly see that the time between the last ms stamp before the timer is set (2019-05-17T13:48:10.440789Z) to the next stamp after the retransmission has occurred (2019-05-17T13:48:10.557105Z) is only 117ms, and this includes a lot of processing time after the retransmission was sent. I think that pretty clearly shows the issue.
    
    I would love to get more exact granularity for the logs, but OpenSIPS does not provide that. We are not using syslog; the timestamps on the OpenSIPS logs are coming from OpenSIPS directly, as are the prefixed stamps on my logs. I'm not aware of any way to instruct OpenSIPS to increase the granularity of that timestamp, so I'm not sure how to get you what you want.
    
    In addition, the numerous other traces I have provided clearly show retransmissions being sent after replies for the request have already been processed. No time stamps are necessary to see that; the retransmission logs are printing after the reply processing logs. It is very clear.
    
    Ben Newlin 
    
    On 5/20/19, 3:51 AM, "Devel on behalf of Răzvan Crainea" <devel-bounces at lists.opensips.org on behalf of razvan at opensips.org> wrote:
    
        Hi, Ben!
        
        You are right, we are pretty busy trying to complete the 3.0 release, 
        that's why we are pushing back some of the reports.
        I did check the logs that you have sent, and it doesn't seem to prove 
        the issue, as the microsec timestamp is not logged by each line. It 
        would be more helpful if you could also configure syslog to log the 
        microseconds.
        
        Best regards,
        Răzvan
        
        On 5/17/19 9:20 PM, Ben Newlin wrote:
        > Hello,
        > 
        > I know you guys are probably heads down trying to get 3.0 out the door. 
        > I’m hoping that’s why this issue hasn’t gotten any traction. However, 
        > I’m continuing to see poor retransmission behavior from OpenSIPS. Traces 
        > from the most recent example are here: https://pastebin.com/4Lam9FaZ.
        > 
        > In this trace I’ve left in some of our proprietary tracing that show a 
        > more detailed timestamp. OpenSIPS is receiving a BYE request and sets a 
        > retransmission timer for 500ms, the standard time. However, the timer 
        > popped after only ~124ms and a retransmission of the BYE was triggered. 
        > I have tcpdump traces confirming the retransmission of the BYE actually 
        > occurred at 124ms, this is not just an issue internal to OpenSIPS.
        > 
        > Any insight our troubleshooting assistance with these issues would be 
        > greatly appreciated. This issue reproduces very regularly.
        > 
        > Ben Newlin
        > 
        > *From: *Devel <devel-bounces at lists.opensips.org> on behalf of Ben Newlin 
        > <Ben.Newlin at genesys.com>
        > *Reply-To: *OpenSIPS devel mailling list <devel at lists.opensips.org>
        > *Date: *Wednesday, May 8, 2019 at 3:13 PM
        > *To: *OpenSIPS devel mailling list <devel at lists.opensips.org>
        > *Subject: *Re: [OpenSIPS-Devel] Erroneous Retransmissions
        > 
        > Has anyone had a chance to look into this issue? I was able to mitigate 
        > by inserting small delays in my test clients, but now I am running into 
        > an issue where responses from one OpenSIPS to another are being sent 
        > quickly enough to cause this problem. Making changes in my testbed is 
        > one thing, but I really don’t want to start injecting delays into my 
        > processing script to work around this issue.
        > 
        > The issue is that final responses to a transaction that are received 
        > within a few milliseconds of the transaction being sent are not stopping 
        > the retransmission timer. The response is handled properly by OpenSIPS, 
        > but approximately 500ms later OpenSIPS will retransmit the request 
        > anyway. I can reproduce it very regularly in our testbed and it should 
        > also be easily reproducible with a simple SIPp scenario.
        > 
        > Trace from the latest occurrence is here: https://pastebin.com/YAXFwNLb
        > 
        > I’d be happy to open an issue on Github as well if that is desired.
        > 
        > Ben Newlin
        > 
        > *From: *Devel <devel-bounces at lists.opensips.org> on behalf of Ben Newlin 
        > <Ben.Newlin at genesys.com>
        > *Reply-To: *OpenSIPS devel mailling list <devel at lists.opensips.org>
        > *Date: *Thursday, April 25, 2019 at 11:46 AM
        > *To: *OpenSIPS devel mailling list <devel at lists.opensips.org>
        > *Subject: *Re: [OpenSIPS-Devel] Erroneous Retransmissions
        > 
        > I have found that this issue seems to only occur when the responses are 
        > received immediately after the request has been sent; within a few 
        > milliseconds. Inserting a delay in the response seems to have made the 
        > erroneous retransmissions stop. This does not make it any less of a bug, 
        > but does seem to be a workaround.
        > 
        > Ben Newlin
        > 
        > *From: *Devel <devel-bounces at lists.opensips.org> on behalf of Ben Newlin 
        > <Ben.Newlin at genesys.com>
        > *Reply-To: *OpenSIPS devel mailling list <devel at lists.opensips.org>
        > *Date: *Friday, April 19, 2019 at 8:47 AM
        > *To: *OpenSIPS devel mailling list <devel at lists.opensips.org>
        > *Subject: *[OpenSIPS-Devel] Erroneous Retransmissions
        > 
        > Resending the issue described below to the devel mailing list as I 
        > believe this is a bug, not a configuration issue.
        > 
        > Ben Newlin
        > 
        > *From: *Users <users-bounces at lists.opensips.org> on behalf of Ben Newlin 
        > <Ben.Newlin at genesys.com>
        > *Reply-To: *OpenSIPS users mailling list <users at lists.opensips.org>
        > *Date: *Monday, April 15, 2019 at 7:20 PM
        > *To: *OpenSIPS users mailling list <users at lists.opensips.org>
        > *Subject: *Re: [OpenSIPS-Users] Erroneous Retransmissions
        > 
        > Just wanted to bump this to see if anyone has had a chance to take a 
        > look. We are continuing to see this failing our tests intermittently. I 
        > have a new example here: https://pastebin.com/kQJSGE56.
        > 
        > In this example you can see that the retransmission of the INVITE is 
        > occurring not just after the 100 Trying has been received, but also 
        > after a 500 final response has already been received and is being 
        > processed, but the INVITE is still retransmitted!
        > 
        > Any help on this would be greatly appreciated.
        > 
        > # opensips -V
        > 
        > version: opensips 2.4.5 (x86_64/linux)
        > 
        > flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC, 
        > F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
        > 
        > ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, 
        > MAX_URI_SIZE 1024, BUF_SIZE 65535
        > 
        > poll method support: poll, epoll, sigio_rt, select.
        > 
        > git revision: d025b4f61
        > 
        > main.c compiled on 21:18:18 Apr 12 2019 with gcc 7
        > 
        > Ben Newlin
        > 
        > *From: *Users <users-bounces at lists.opensips.org> on behalf of Ben Newlin 
        > <Ben.Newlin at genesys.com>
        > *Reply-To: *OpenSIPS users mailling list <users at lists.opensips.org>
        > *Date: *Thursday, April 4, 2019 at 3:10 PM
        > *To: *OpenSIPS users mailling list <users at lists.opensips.org>
        > *Subject: *[OpenSIPS-Users] Erroneous Retransmissions
        > 
        > Hello,
        > 
        > We are seeing an issue which is causing our tests to fail 
        > intermittently. It appears that sometimes OpenSIPS will retransmit a 
        > relayed INVITE request, even though a 100 Trying response has already 
        > been received and processed. This only happens intermittently.
        > 
        > I have captured debug logs for the scenario: 
        > https://pastebin.com/bHcBJmtv. In the log, you can see the 100 Trying 
        > response is received on line 707 and it appears to be matched to the 
        > correct transaction on line 729. However then on line 774 you can see 
        > the retransmission of the INVITE anyway.
        > 
        > I have verified using tcpdump that the INVITE is being retransmitted; it 
        > is not just a log error. And our tests are failing due to the unexpected 
        > extra 100 Trying response to the retransmission.
        > 
        > Can anyone explain why this would occur?
        > 
        > Ben Newlin
        > 
        > 
        > _______________________________________________
        > Devel mailing list
        > Devel at lists.opensips.org
        > http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
        > 
        
        -- 
        Răzvan Crainea
        OpenSIPS Core Developer
           http://www.opensips-solutions.com
        Meet the OpenSIPS team at the next OpenSIPS Summit:
           https://www.opensips.org/events
        
        _______________________________________________
        Devel mailing list
        Devel at lists.opensips.org
        http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
        
        
    
    _______________________________________________
    Devel mailing list
    Devel at lists.opensips.org
    http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
    



More information about the Devel mailing list