[OpenSIPS-Devel] [ opensips-Bugs-2940556 ] Timer bug in TM handler

SourceForge.net noreply at sourceforge.net
Tue Feb 9 18:30:13 CET 2010


Bugs item #2940556, was opened at 2010-01-27 00:34
Message generated for change (Comment added) made by apogrebennyk
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2940556&group_id=232389

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: modules
Group: 1.5.x
Status: Open
Resolution: Accepted
Priority: 5
Private: No
Submitted By: Andrew Pogrebennyk (apogrebennyk)
Assigned to: Bogdan-Andrei Iancu (bogdan_iancu)
Summary: Timer bug in TM handler

Initial Comment:
I have got stuck trying to implement a sort of serial forking (forwarding) scenario with OpenSIPS. The thing is that when fr_inv_timer hits, OpenSIPS 
(prematurely) sends INVITE on the next branch and only after that CANCELs the previous one. (And if the gateway receives different branch on transaction to which no final reply has been sent yet - it can merge the requests.) It can be clearly seen in the attached trace that contains messages from the OpenSIPS to the b2bua on the same machine port 5061. I've found this behavior to be consistent in the versions 1.3.2 - 1.5.3, but the logs that follow are from OpenSER 1.3.2. Similar bug has been reported for Kamailio: http://lists.kamailio.org/pipermail/devel/2009-May/018982.html

Another thing I've found is that OpenSIPS resets the fr_timer in retransmission_handler() if no provisional response to INVITE has been received. So if the fr_timer hits OpenSIPS silently fails over to the next forwarding destination without sending CANCEL to this branch (again 1.5.3 is affected). Here we see that it forwards the INVITE and sets FR_TIMER as per script:

Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: Request leaving server, D-URI='<null>' - M=INVITE RURI=sip:40084956288253 at 195.128.80.211:5061 F=sip:000100 at sip2.d-tel.ru T=sip:000101 at sip2.d-tel.ru IP=77.122.227.73 ID=YmM1NGE4NWQ1ZWZhZjVjNDBkNzljODZjYTk5MDVlN2Q.
[...skipped...]
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:_set_fr_retr: FR_TIMER = 10
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: relative timeout is 10
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=10, final value=46
...
Here it hits the retransmission_handler() - and this "relative timeout 
is 1000000" messages appears:

Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:retransmission_handler: retransmission_handler : request resending (t=0x28701ad0, INVITE si ... )
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: relative timeout is 1000000
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=1000000, final value=37500000
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:insert_timer_unsafe: [5]: 0x28701d30 (37500000)
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:retransmission_handler: retransmission_handler : done

I think this timeout somehow interferes with fr_timer because when fr_timer hits in 20 seconds, OpenSIPS _silently_ fails over to the next 
forwarding destination without sending CANCEL to _this_ branch. This occurs only when no provisional response has been got so I tend to think 
this is somehow related to the timeout thing.

I have added extra debug print to insert_timer_unsafe() like:
  LM_DBG("calling insert_timer_unsafe: timeout=%lld, final value=%lld\n",timeout,timeout + get_ticks());

so you could see the sequence of timer updates:
Jan 27 05:04:15 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=2000000, final value=39500000
Jan 27 05:04:17 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=4000000, final value=43500000
Jan 27 05:04:21 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=4000000, final value=47500000

Then OpenSIPS fails over to the next forwarding destination on locally generated 408 - but note there is no real "sending cancel..." message 
from cancel_branch() between these messages:

Jan 27 05:04:24 sip2 /usr/local/sbin/openser[48744]: DBG:tm:insert_timer_unsafe: [0]: 0x28701e60 (56)
Jan 27 05:04:24 sip2 /usr/local/sbin/openser[48744]: DBG:tm:relay_reply: branch=1, save=1, relay=-1
Jan 27 05:04:24 sip2 /usr/local/sbin/openser[48744]: DBG:tm:final_response_handler: done

Please check.

----------------------------------------------------------------------

>Comment By: Andrew Pogrebennyk (apogrebennyk)
Date: 2010-02-09 19:30

Message:
Bogdan,

Thank you. Sure I'm willing to test you the first issue.

Regarding the second issue - thanks for clarifying that for me.

Andrew

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2010-02-09 19:20

Message:
Hi Andrew,

Indeed, the CANCEL is fired after the new branches were sent out - I can
work out a patch to change the order - it is a bit delicate (as everything
related to TM, but I think is doable) - I hope you are willing to test it
for me ;)

Regarding the second issue, that is exactly what RFC says - if you get no
reply at all from a destination, do not attempt to send a CANCEL there.

Regards,
Bogdan

----------------------------------------------------------------------

Comment By: Andrew Pogrebennyk (apogrebennyk)
Date: 2010-01-27 00:39

Message:
Things to note in the ngrep trace: INVITE for branch 1 comes at
05:04:14.173313 - before the CANCEL for branch 0 which comes at
05:04:14.173572.
Then there's INVITE for branch 2 at 05:04:24.274084, but no CANCEL for the
previous branch (fr_timer in action - my b2bua kept retransmitting 183
Session Progress with branch 0, so there was no response for branch 1 at
all).

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2940556&group_id=232389



More information about the Devel mailing list