[OpenSIPS-Users] Dialog replication problems

Dawid Mielnik dawid.mielnik at gmail.com
Thu Nov 3 12:02:20 CET 2016


Hi Liviu,

Yes, big difference with the patch:

active server:

dialog::  hash=3426:1456403041 dialog_id=14716014359137
state:: 4
user_flags:: 0
timestart:: 1478170024
datestart:: 2016-11-03 11:47:04
timeout:: 1478191625
dateout:: 2016-11-03 17:47:05
callid:: 81140NWQxYTg1ZWRkZDNiYzE2OGExYzI1NmE1N2Y4MjM2MTE
...

standby server:

dialog::  hash=3426:1456403041 dialog_id=14716014359137
state:: 4
user_flags:: 0
timestart:: 1478170024
datestart:: 2016-11-03 11:47:04
timeout:: 1478191625
dateout:: 2016-11-03 17:47:05
callid:: 81140NWQxYTg1ZWRkZDNiYzE2OGExYzI1NmE1N2Y4MjM2MTE
...
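
To run the same check across many dialogs, the "dialog::" header lines of the two dumps can be compared with a small script. This is only a minimal sketch: the helper names (parse_header, same_dialog, combined_id) are illustrative and not part of OpenSIPS, and the relation dialog_id = (entry << 32) | id is inferred from the numbers shown in this thread, not taken from the source code.

```python
import re

# Matches "hash=<entry>:<id>" with an optional "dialog_id=<n>" after it.
# The optional "*" allows for the emphasis markers seen in the quoted mails.
HEADER = re.compile(r"hash=\*?(\d+):(\d+)\*?(?:\s+dialog_id=(\d+))?")


def parse_header(line):
    """Return (entry, id, dialog_id or None) from a 'dialog::' header line."""
    match = HEADER.search(line)
    if match is None:
        raise ValueError("not a dialog header line: %r" % line)
    entry, h_id, dialog_id = match.groups()
    return int(entry), int(h_id), (int(dialog_id) if dialog_id else None)


def same_dialog(active_line, standby_line):
    """True when both nodes report the same hash entry and dialog id."""
    return parse_header(active_line)[:2] == parse_header(standby_line)[:2]


def combined_id(entry, h_id):
    """Assumed relation: entry in the upper 32 bits, id in the lower 32 bits."""
    return (entry << 32) | h_id


active = "dialog::  hash=3426:1456403041 dialog_id=14716014359137"
standby = "dialog::  hash=3426:1456403041 dialog_id=14716014359137"
print(same_dialog(active, standby))      # True
print(combined_id(3426, 1456403041))     # 14716014359137
```

With the pre-patch headers quoted further down (hash 2297:947327686 on the active node, 2297:1952115624 on the standby), same_dialog() returns False.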

BYE after switch-over:

Nov  3 11:48:55.964155 DEB 29249  DBG:core:parse_msg: SIP Request:
Nov  3 11:48:55.964185 DEB 29249  DBG:core:parse_msg:  method:  <BYE>
Nov  3 11:48:55.964191 DEB 29249  DBG:core:parse_msg:  uri:     <sip:XXX.XXX.XXX.250:5061;did=26d.162fec65>
Nov  3 11:48:55.964197 DEB 29249  DBG:core:parse_msg:  version: <SIP/2.0>
...
Nov  3 11:48:55.964395 DEB 29249  DBG:dialog:w_match_dialog: We found DID
param in R-URI with value of 26d.162fec65
Nov  3 11:48:55.964399 DEB 29249  DBG:dialog:dlg_onroute: route param is
'26d.162fec65' (len=12)
Nov  3 11:48:55.964437 DEB 29249  DBG:dialog:lookup_dlg: ref dlg
0x7f70c12a7b30 with 1 -> 3
Nov  3 11:48:55.964451 DEB 29249  DBG:dialog:lookup_dlg: dialog
id=1456403041 found on entry 3426
...
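
Judging only from the log lines in this thread, the DID value in the R-URI appears to be the hash entry and the dialog id, each written in hexadecimal with the digits reversed and joined by a dot. The sketch below decodes a DID under that assumption; decode_did and encode_did are illustrative helper names, not OpenSIPS functions, and the encoding is inferred from the logs rather than from the dialog module source.

```python
def decode_did(did):
    """Split 'xxx.yyyyyyyy' and read each part as reversed hexadecimal."""
    entry_part, id_part = did.split(".")
    return int(entry_part[::-1], 16), int(id_part[::-1], 16)


def encode_did(entry, h_id):
    """Inverse of decode_did under the same assumed encoding."""
    return format(entry, "x")[::-1] + "." + format(h_id, "x")[::-1]


# DID from the BYE received after the switch-over (patched build)
print(decode_did("26d.162fec65"))     # (3426, 1456403041)
# DID from the earlier, failing BYE (unpatched build)
print(decode_did("9f8.6c217783"))     # (2297, 947327686)
print(encode_did(3426, 1456403041))   # 26d.162fec65
```

The decoded pairs match the "id=... on entry ..." values printed by lookup_dlg, which is why the lookup can only succeed when the standby stores the dialog under the same id as the active node.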

Thank you.

Also, should a CDR be generated on the standby server after the switch-over?
I am still not getting those (I have to keep my own variables in the dialog
and insert them into the DB manually). I am using the topology hiding module,
if that makes any difference. Manual CDR generation is triggered when the
( $DLG_status!=NULL && !validate_dialog() ) condition fails, after which I
call fix_route_dialog().

BR,
Dawid

On Thu, Nov 3, 2016 at 11:27 AM, Liviu Chircu <liviu at opensips.org> wrote:

> Hi, Dawid!
>
> I have looked into the problem and also managed to come up with a fix!
> Could you please go to your OpenSIPS 2.2 source code directory, apply the
> below patch, recompile the dialog module and see if it fixes the problem?
>
> git apply <(base64 -d <<EOF
> ZGlmZiAtLWdpdCBhL21vZHVsZXMvZGlhbG9nL2RsZ19yZXBsaWNhdGlvbi5j
> IGIvbW9kdWxlcy9k
> aWFsb2cvZGxnX3JlcGxpY2F0aW9uLmMKaW5kZXggYTg0NTVhZi4uZGE2MmRi
> NiAxMDA2NDQKLS0t
> IGEvbW9kdWxlcy9kaWFsb2cvZGxnX3JlcGxpY2F0aW9uLmMKKysrIGIvbW9k
> dWxlcy9kaWFsb2cv
> ZGxnX3JlcGxpY2F0aW9uLmMKQEAgLTE4Miw3ICsxODIsNiBAQCBpbnQgZGxn
> X3JlcGxpY2F0ZWRf
> Y3JlYXRlKHN0cnVjdCBkbGdfY2VsbCAqY2VsbCwgc3RyICpmdGFnLCBzdHIg
> KnR0YWcsIGludCBz
> YWZlKQogCWRsZy0+bGVnc19ub1tETEdfTEVHXzIwME9LXS
> A9IERMR19GSVJTVF9DQUxMRUVfTEVH
> OwogCiAJLyogbGluayB0aGUgZGlhbG9nIGludG8gdGhlIGhhc2ggKi8KLQlk
> bGctPmhfaWQgPSBk
> X2VudHJ5LT5uZXh0X2lkKys7CiAJaWYgKCFkX2VudHJ5LT5maXJzdCkKIAkJ
> ZF9lbnRyeS0+Zmly
> c3QgPSBkX2VudHJ5LT5sYXN0ID0gZGxnOwogCWVsc2Ugewo=
> EOF
> )
>
> Liviu Chircu
> OpenSIPS Developer
> http://www.opensips-solutions.com
>
> On 03.11.2016 11:09, Dawid Mielnik wrote:
>
> Anyone ?
>
> I have just upgraded to the latest 2.2 version from Git and am still
> experiencing this.
>
> active server:
>
> dialog::  hash=*2297:947327686* dialog_id=9866487206598
> state:: 4
> user_flags:: 0
> timestart:: 1478162278
> datestart:: 2016-11-03 09:37:58
> timeout:: 1478183878
> dateout:: 2016-11-03 15:37:58
> callid:: 81140ODk1MmU1MGM3MzZkZjMyYjIzY2I1ZDExZTI4ZDFiNjY
> from_uri:: sip:+48226522655 at XXX.XXX.XXX.250
> to_uri:: sip:+48501657887 at XXX.XXX.XXX.250
> caller_tag:: 86343576
> caller_contact:: sip:+48226522655 at ZZZ.ZZZ.ZZZ.34:60603;rinstance=
> eab8d8723e64bbad
> callee_cseq:: 0
> caller_route_set::
> caller_bind_addr:: udp:XXX.XXX.XXX.250:5061
> caller_sdp::
> CALLEES::
> callee::
> callee_tag:: 192571
> callee_contact:: sip:YYY.YYY.YYY.20:5060;transport=UDP
> caller_cseq:: 1
> callee_route_set::
> callee_bind_addr:: udp:XXX.XXX.XXX.250:5060
> callee_sdp::
> ...
>
> standby server:
>
> dialog::  hash=*2297:1952115624* dialog_id=9867491994536
> state:: 4
> user_flags:: 0
> timestart:: 1478162278
> datestart:: 2016-11-03 09:37:58
> timeout:: 1478183877
> dateout:: 2016-11-03 15:37:57
> callid:: 81140ODk1MmU1MGM3MzZkZjMyYjIzY2I1ZDExZTI4ZDFiNjY
> from_uri:: sip:+48226522655 at XXX.XXX.XXX.250
> to_uri:: sip:+48501657887 at XXX.XXX.XXX.250
> caller_tag:: 86343576
> caller_contact:: sip:+48226522655 at ZZZ.ZZZ.ZZZ.34:60603;rinstance=
> eab8d8723e64bbad
> callee_cseq:: 0
> caller_route_set::
> caller_bind_addr:: udp:XXX.XXX.XXX.250:5061
> caller_sdp::
> CALLEES::
> callee::
> callee_tag:: 192571
> callee_contact:: sip:YYY.YYY.YYY.20:5060;transport=UDP
> caller_cseq:: 1
> callee_route_set::
> callee_bind_addr:: udp:XXX.XXX.XXX.250:5060
> callee_sdp::
> ...
>
> After switch-over BYE is received on the standby server:
>
> Nov  3 09:44:38.571442 DEB 10180  DBG:core:parse_msg: SIP Request:
> Nov  3 09:44:38.571471 DEB 10180  DBG:core:parse_msg:  method:  <BYE>
> Nov  3 09:44:38.571475 DEB 10180  DBG:core:parse_msg:  uri:
> <sip:XXX.XXX.XXX.250:5061;did=9f8.6c217783>
> Nov  3 09:44:38.571479 DEB 10180  DBG:core:parse_msg:  version: <SIP/2.0>
> ...
> Nov  3 09:44:38.571809 DEB 10180  DBG:dialog:w_match_dialog: We found DID
> param in R-URI with value of 9f8.6c217783
> Nov  3 09:44:38.571812 DEB 10180  DBG:dialog:dlg_onroute: route param is
> '9f8.6c217783' (len=12)
> Nov  3 09:44:38.571814 DEB 10180  DBG:dialog:lookup_dlg: *no dialog
> id=947327686 found on entry 2297*
> Nov  3 09:44:38.571816 DEB 10180  DBG:dialog:dlg_onroute: unable to find
> dialog for BYE with route param '9f8.6c217783'
> ...
>
>
> Is it normal that the dialog hash (and the recently added dialog id)
> differ on the replicated server?
>
> BR,
> Dawid
>
> On Wed, Oct 26, 2016 at 4:43 PM, Dawid Mielnik <dawid.mielnik at gmail.com>
> wrote:
>
>> Hi All,
>>
>> I have a redundant OpenSIPS 2.2.1 setup with clusterer, binary interface
>> replication and a floating IP. I am encountering a few nuances and am
>> wondering if I am doing something wrong or if there is a bug.
>>
>> 1) Replicated dialog hash id is different on the standby server from the
>> active server
>>
>> active:
>>
>> dialog::  hash=637:902131071
>> state:: 4
>> user_flags:: 0
>> timestart:: 1477413837
>> datestart:: 2016-10-25 18:43:57
>> timeout:: 1477435437
>> dateout:: 2016-10-26 00:43:57
>> callid:: 81140Mzk5ZjViNjY5YzI3MDI5NDMxMDUwZTdlNmQ1MDBhNzg
>> ...
>>
>> standby:
>>
>> dialog::  hash=637:902131072
>> state:: 4
>> user_flags:: 0
>> timestart:: 1477413837
>> datestart:: 2016-10-25 18:43:57
>> timeout:: 1477435438
>> dateout:: 2016-10-26 00:43:58
>> callid:: 81140Mzk5ZjViNjY5YzI3MDI5NDMxMDUwZTdlNmQ1MDBhNzg
>> ...
>>
>> When a switch-over occurs during a dialog and a request is received on
>> the second server, the dialog cannot be matched by the DID param and has to
>> fall back to looking for other SIP elements.
>>
>> DBG:dialog:lookup_dlg: no dialog id=902131071 found on entry 637
>> DBG:dialog:dlg_onroute: unable to find dialog for BYE with route param
>> 'd72.f7d65c53'
>>
>> 2) No CDR on the standby server after switch over
>>
>> When a switch-over occurs during a dialog, a CDR is not generated at the end
>> of the call (I have to do it manually). I do not see any run_dlg_callbacks
>> info in the debug logs, although the replicated dialog seems to have all the
>> acc flags.
>>
>> active:
>>
>> dialog::  hash=637:902131071
>> ...
>> values::
>> accX_table:: acc
>> accX_flags:: \x00\x00\x07\x00\x00\x00\x02\x00
>> accX_db:: \x07\x00\r\x0031.179.202.34\f\x00+48226522655\f\x00+48226522655
>> \x01\x001\f\x00+48501657778\f\x00+48501657778\x02\x0024
>> accX_leg:: \x00\x00\x00\x00
>> accX_core:: \x06\x00INVITE\b\x0004027a21\x01\x0030\x0081140Mzk5ZjViNjY5Y
>> zI3MDI5NDMxMDUwZTdlNmQ1MDBhNzg\x03\x00200\x02\x00OK\x10\x00\
>> xcd\x8b\x0fX\x00\x00\x00\x00]\xb2\x07\x00\x00\x00\x00\x00
>> ...
>> accX_created:: \xcb\x8b\x0fX\x00\x00\x00\x00
>> ...
>> standby:
>>
>> dialog::  hash=637:902131072
>> ...
>> values::
>> accX_created:: \xcb\x8b\x0fX\x00\x00\x00\x00
>> ...
>> accX_core:: \x06\x00INVITE\b\x0004027a21\x01\x0030\x0081140Mzk5ZjViNjY5Y
>> zI3MDI5NDMxMDUwZTdlNmQ1MDBhNzg\x03\x00200\x02\x00OK\x10\x00\
>> xcd\x8b\x0fX\x00\x00\x00\x00]\xb2\x07\x00\x00\x00\x00\x00
>> accX_leg:: \x00\x00\x00\x00
>> accX_db:: \x07\x00\r\x0031.179.202.34\f\x00+48226522655\f\x00+48226522655
>> \x01\x001\f\x00+48501657778\f\x00+48501657778\x02\x0024
>> accX_flags:: \x00\x00\x07\x00\x00\x00\x02\x00
>> accX_table:: acc
>> ...
>>
>> My relevant config:
>> #### DIALOG module
>> loadmodule "dialog.so"
>>
>> modparam("dialog", "dlg_match_mode", 1)
>> modparam("dialog", "default_timeout", 21600)  # 6 hours timeout
>> modparam("dialog", "db_mode", 0)
>> modparam("dialog", "accept_replicated_dialogs", 1)
>> modparam("dialog", "replicate_dialogs_to", 1)
>> #modparam("dialog", "accept_replicated_profiles", 1)
>> #modparam("dialog", "replicate_profiles_to", 1)
>> modparam("dialog", "profiles_with_value", "trunkCalls")
>> modparam("dialog", "options_ping_interval", 60)
>>
>> #### CLUSTERER module
>> loadmodule "clusterer.so"
>> modparam("clusterer", "db_url", "text:///usr/local/etc/opensips")
>> modparam("clusterer", "server_id", 2) #2 or 1 depending on node
>>
>>
>> # for initial requests
>> do_accounting("db", "cdr|missed", "acc");
>>
>>
>> Has anyone experienced similar problems? Is there something that I am
>> missing?
>>
>> Best regards,
>> Dawid
>>
>>
>
>
> _______________________________________________
> Users mailing list
> Users at lists.opensips.org
> http://lists.opensips.org/cgi-bin/mailman/listinfo/users