[OpenSIPS-Users] Serialforking failure, with lcr:parse_phostport: too many colons in udp:: 0
Bogdan-Andrei Iancu
bogdan at voice-system.ro
Wed Oct 13 23:16:32 CEST 2010
Hi Taisto,
Your problem is not timer related or how serial forking is done in
opensips (I will comment on these in a later reply).
Right now, the quick answer to fix your problem: failure route must be
re-armed after each branch -> this is why your failure route does not
catches the end of the second branch. Adding a t_on_failure("1") before
t_relay() in failure route will fix your problem.
Regards,
Bogdan
Taisto Qvist wrote:
> Hi Bogdan,
>
> I've now been trying with some tests, and I cant really get it to work,
> since the transactionlayer on the server transaction returns a 408
> back to the UAC before serial forking has ended.
> This seems a little bit related to what I once commented on a long time
> ago regarding handling of timer C and the fact that the timer c seems to
> be quite "tied" to Timer B
>
> When the fr_timer pops, (causing the CANCEL to be sent so that we can move
> on to the next serial-fork-target), the tm-layer seems store this
> timer-pop
> as a 408 response
>
> 20:41:44 osips[4686]: DBG:tm:utimer_routine: timer
> routine:4,tl=0xb5b6770c next=(nil), timeout=649300000
> 20:41:55 osips[4686]: DBG:tm:timer_routine: timer
> routine:1,tl=0xb5b67728 next=(nil), timeout=660
> 20:41:55 osips[4686]: DBG:tm:final_response_handler: stop retr. and
> send CANCEL (0xb5b675c0)
> 20:41:55 osips[4686]: DBG:tm:t_should_relay_response: T_code=180,
> new_code=408
> 20:41:55 osips[4686]: DBG:tm:t_pick_branch: picked branch 0, code 408
> (prio=800)
>
> As the capture and log I've attached indicates, I am not able to
> perform a three
> step serial fork. I have three Uas:es registered with 1.0, 0.9, and
> 0.8 in q-values.
>
> First timer pop causes a CANCEL, and a new INVITE towards UAS with
> q=0.9, but when
> it pops the second time, TM still cancels the second target, but
> instead of continuing
> with the third, it sends a 408 towards the UAC.
>
> It might be something with my script-handling in the failure_route, so
> here it is:
>
> failure_route[1]
> {
> if ( t_was_cancelled() )
> {
> log(2, "transaction was cancelled by UAC\n");
> }
> xlog("(lab1) - In FailureRoute: branches=$(branch(uri)[*])\n");
> if ( isflagset(1) )
> {
> log(2,"(lab1) - 3++ Received, attempting serial fork!\n");
> next_branches();
> switch ( $retcode )
> {
> case 1:
> log(2,"(lab1) - More branches left, rollOver timer
> set.");
> $avp(s:timerC) = 12;
> setflag(1); # Do I need this? Should I use
> branchflags instead?
> break;
> case 2:
> log(2,"(lab1) - Last branch, timerC set to 60 sec");
> $avp(s:timerC) = 60;
> break;
>
> default:
> log(2,"(lab1) - No more serial fork targets.");
> exit;
> }
> if ( !t_relay() )
> {
> log(2,"(lab1) - Error during relay for serial fork!\n");
> }
> }
> else
> {
> log(2,"(lab1) - 3++ result. Serialforking not available.\n");
> }
>
> }
>
> When I say that it seems related to another issue I commented on a
> long time
> ago, I am referring to the general handling of Timer C, which doesn't
> seem to
> be a separate timer, but is reusing the timerB.
>
> When the timer pops after the normal 180 seconds, the TM layer will
> *instantly*
> generate a 408 response on the server txn, while at the same time
> generating
> the CANCEL attempting to terminate the client txn.
> To me, this is wrong, but maybe I am suppose to handle this in the
> failure_route?
>
> What I would expect is that the CANCEL will cause a 487 response from
> the UAS,
> and this will be the final response sent to the UAC.
> Also by behaving this way, we may cause a protocol violation even
> though the risk
> is small.
>
> Once timer C pops we send the CANCEL hoping that it will cause a 487.
> BUT, it is
> quite possible that before the cancel is received by the UAS, it sends
> a 200 to
> the INVITE! Even IF the CANCEL receives a 2xx response, we may still
> get a 2xx
> response to the INVITE.
> But with the current behavior of opensips, this would cause opensips
> to proxy
> TWO final responses on the server txn, once being the initial 408 sent
> by the
> txn on timer C timeout, and then the real/actual 2xx sent by the uas.
>
> I've also seen a similar problem with 6xx responses received on a
> branch during
> forking.
> Opensips forwards the 6xx *before* the remaining client txns has
> completed, and
> there is no guarantee that these client txns will all terminate with
> 487 even
> if opensips tries to CANCEL all of them asap.
> They may still return 2xx to the invite, which would cause a
> forwarding of both
> a 6xx and a 2xx on the server txn. This scenario is even mentioned in
> rfc3261.
>
> So all these three problems have in common that the server txn seems
> to be
> terminating a bit early, before the client side has fully completed,
> but as
> I said, it might at least partially be something I should handle in my
> failure_routes...?
>
> Thanks for all your help.
> Regards
> Taisto Qvist
>
>
> Bogdan-Andrei Iancu skrev 2010-10-06 17:04:
>> Hi Taisto,
>>
>> could you test the rev 7248 on trunk for solution 2) ? if ok, I will
>> backport to 1.6
>>
>> Regards,
>> Bogdan
--
Bogdan-Andrei Iancu
OpenSIPS Bootcamp
15 - 19 November 2010, Edison, New Jersey, USA
www.voice-system.ro
More information about the Users
mailing list