[OpenSIPS-Devel] [ opensips-Bugs-2940556 ] Timer bug in TM handler

SourceForge.net noreply at sourceforge.net
Tue Jan 26 23:34:36 CET 2010


Bugs item #2940556, was opened at 2010-01-27 00:34
Message generated for change (Tracker Item Submitted) made by apogrebennyk
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2940556&group_id=232389

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: modules
Group: 1.5.x
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Andrew Pogrebennyk (apogrebennyk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Timer bug in TM handler

Initial Comment:
I have got stuck trying to implement a sort of serial forking (forwarding) scenario with OpenSIPS. The thing is that when fr_inv_timer hits, OpenSIPS 
(prematurely) sends INVITE on the next branch and only after that CANCELs the previous one. (And if the gateway receives different branch on transaction to which no final reply has been sent yet - it can merge the requests.) It can be clearly seen in the attached trace that contains messages from the OpenSIPS to the b2bua on the same machine port 5061. I've found this behavior to be consistent in the versions 1.3.2 - 1.5.3, but the logs that follow are from OpenSER 1.3.2. Similar bug has been reported for Kamailio: http://lists.kamailio.org/pipermail/devel/2009-May/018982.html

Another thing I've found is that OpenSIPS resets the fr_timer in retransmission_handler() if no provisional response to INVITE has been received. So if the fr_timer hits OpenSIPS silently fails over to the next forwarding destination without sending CANCEL to this branch (again 1.5.3 is affected). Here we see that it forwards the INVITE and sets FR_TIMER as per script:

Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: Request leaving server, D-URI='<null>' - M=INVITE RURI=sip:40084956288253 at 195.128.80.211:5061 F=sip:000100 at sip2.d-tel.ru T=sip:000101 at sip2.d-tel.ru IP=77.122.227.73 ID=YmM1NGE4NWQ1ZWZhZjVjNDBkNzljODZjYTk5MDVlN2Q.
[...skipped...]
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:_set_fr_retr: FR_TIMER = 10
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: relative timeout is 10
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=10, final value=46
...
Here it hits the retransmission_handler() - and this "relative timeout 
is 1000000" messages appears:

Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:retransmission_handler: retransmission_handler : request resending (t=0x28701ad0, INVITE si ... )
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: relative timeout is 1000000
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=1000000, final value=37500000
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:insert_timer_unsafe: [5]: 0x28701d30 (37500000)
Jan 27 05:04:14 sip2 /usr/local/sbin/openser[48744]: DBG:tm:retransmission_handler: retransmission_handler : done

I think this timeout somehow interferes with fr_timer because when fr_timer hits in 20 seconds, OpenSIPS _silently_ fails over to the next 
forwarding destination without sending CANCEL to _this_ branch. This occurs only when no provisional response has been got so I tend to think 
this is somehow related to the timeout thing.

I have added extra debug print to insert_timer_unsafe() like:
  LM_DBG("calling insert_timer_unsafe: timeout=%lld, final value=%lld\n",timeout,timeout + get_ticks());

so you could see the sequence of timer updates:
Jan 27 05:04:15 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=2000000, final value=39500000
Jan 27 05:04:17 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=4000000, final value=43500000
Jan 27 05:04:21 sip2 /usr/local/sbin/openser[48744]: DBG:tm:set_timer: calling insert_timer_unsafe: timeout=4000000, final value=47500000

Then OpenSIPS fails over to the next forwarding destination on locally generated 408 - but note there is no real "sending cancel..." message 
from cancel_branch() between these messages:

Jan 27 05:04:24 sip2 /usr/local/sbin/openser[48744]: DBG:tm:insert_timer_unsafe: [0]: 0x28701e60 (56)
Jan 27 05:04:24 sip2 /usr/local/sbin/openser[48744]: DBG:tm:relay_reply: branch=1, save=1, relay=-1
Jan 27 05:04:24 sip2 /usr/local/sbin/openser[48744]: DBG:tm:final_response_handler: done

Please check.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2940556&group_id=232389



More information about the Devel mailing list