[OpenSIPS-Devel] presence non-unique totag and more

Kennard_White at logitech.com Kennard_White at logitech.com
Tue Sep 28 20:47:45 CEST 2010


Hi Anca,

There are some "big" issues that I don't think I can solve. I'll try to
describe:

1. Need to decide if indexing is based upon totag (if globally unique) or
the 5-tuple of from_uri,from_tag,to_uri,to_tag,callid. Whichever is
decided, should be used informally for hash & db, for all ops (insert,
search and delete). Current code has mixture. My own "vote" is to make
totag unique (within the sever), much the way PUBLISH's etag is unique.
Then a subscription can be identified by a single key (the totag).

2. The unique totag needs to be generated *before* sending the response.
E.g., all the hash&db operations need to complete successfully before
sending reply message. With current code where the totag is generated as
side-effect of sending 2XX, the client thinks it has a good subscription
when in fact it has failed due to DB issue.

3. Consequence of above, my suggestion is generate totag within presence
module, complete DB ops, then send 2XX. Need new API so that the totag can
be passed into tm layer, instead of tm layer generating. This is bigger
change than I want to do (and even if I did, don't know if patch would be
accepted).

4. The totag generated by tm has a very long prefix that is GUID for the
opensips instance, and there is comment in code about this being used to
detect spirals. Is this used? If not, perhaps consider dropping this prefix
portion and making the dialog-specific portion longer (and unique).

5. I don't understand why the totag is generated the way it is. The method
used makes sense for the sl module: the to tag (because hash of prior hop
state) will be the same for replies generated for retransmits. But why do
this in tm module? The whole "point" of tm is that it keeps state to handle
retransmits. And the whole "point" of the totag is to provide a convenient
handle to look up dialog state. Clearly, I'm missing something here.

6. I only use the "presence_xml" module. The other presence modules (e.g.,
pua) use the low-level APIs of the presence module. I'm just not familiar
with how these other modules work, and cannot assess impact of changes to
core presence module.

7. The startup order is messy. Currently the db-restore is done from
mod_init(). This has two problems: (a) the other modules may or may not
have been initialized, which leads to non-determinism in how the
get_auth_status() is handled, and (b) the statistics variables are not
ready. Suggest moving db-restore into the "first" child_init, assuming
stats are ready then. But really need a thread-join for all the child-inits
prior to handling traffic. E.g., need to block all  PUB/SUB traffic until
restore is complete. (Ideally generating 5XX with retry-after header).
Don't really know how to implement this join/blocking.

As a consequence of all the above, I don't think I can ever give you a
final patch, or even any much better than what you have now. I just didn't
want you to think my patch was good or ready for production. Need to
resolve above big issues. If you decide to go with a global totag, then
some of my changes will need to be backed out. I'll happily deal with any
version control conflicts if it gets us closer to final solution.

Kennard



From:	"SourceForge.net" <noreply at sourceforge.net>
To:	noreply at sourceforge.net
Date:	09/28/2010 02:30 AM
Subject:	[OpenSIPS-Devel] [ opensips-Patches-3076805 ] presence
            non-unique	totag and more
Sent by:	devel-bounces at lists.opensips.org



Patches item #3076805, was opened at 2010-09-28 00:00
Message generated for change (Comment added) made by anca_vamanu
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1086412&aid=3076805&group_id=232389


Please note that this message will contain a full copy of the comment
thread,
including the initial issue submission, for this request,
not just the latest update.
Category: modules
Group: trunk
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Kennard White (kennardwhite)
>Assigned to: Anca Vamanu (anca_vamanu)
Summary: presence non-unique totag and more

Initial Comment:
Attached is a bundle of changes to presence module. We deployed opensips1.6
and saw a number of different failures under production load. This patch
isn't finished quality: it is just snapshot of where I stand now. I thought
it worth sharing the issues I've seen so far. A clean solutions really
require larger change outside the scope of just the presence module. I'll
send email to the list about these ideas.

This includes:
* Partial bug fix for non-unique totags. The hash-table and DB code for
deleting subscriptions assumes that the totag is unique. But the totag has
only 16 bits of possible uniqueness per server (it is CRC of branch info).
IN practice, easy to get totag collisions after few thousand subscribers.
Patch includes fix to the hash-table side but the DB side is still broken.
For now, code just does nothing to DB when deleteting a subscription, and
just lets it expire out.
* Bug fix for expiring subscriptions from DB prior to updating.
Subscriptions that we renewd (via reSUB) are deleted from DB prior to
writing the new (unexpired) time into the DB. Change to do DB delete after
sync.
* Bug fix for "if ( foo & NO_UPDATE_DB) when NO_UPDATE_DB is zero value.
Note that at least one other module has same bug and is not patched here.
* Addition of statistics for presence module. The isn't cleanest of
implementations because the DB restore function runs before statistics are
available. Would be much cleaner if DB restore could run after stats are
available.
* Addition of new 'activeSubs' mi command to help trace down these
problems.
* More info in many of the log messages to help trace down these errors.
* Addition of 'override_481_code' as work-around for bad resiprocate
clients. We use this option to force a 5xx code instead of standard 481
since some versions of resip get stuck with 481.
* Change to NOTIFY code to node register a response callback function if
subscription is being terminated. No point is looking for 481 response if
sub is terminated.
* Fix for expires_offset as discussed previously to prevent subscriptions
walking to zero length.

----------------------------------------------------------------------

>Comment By: Anca Vamanu (anca_vamanu)
Date: 2010-09-28 12:30

Message:
Hi Kennard,

Thank you for this complex patch - some issues are really bugs that we
were not aware of and the additions seem useful. I will review this patch
and send comments if I have any. I will not apply it until you say it's the
final version.

Regards,
Anca

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1086412&aid=3076805&group_id=232389


_______________________________________________
Devel mailing list
Devel at lists.opensips.org
http://lists.opensips.org/cgi-bin/mailman/listinfo/devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.opensips.org/pipermail/devel/attachments/20100928/4a61b3bc/attachment.htm 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
Url : http://lists.opensips.org/pipermail/devel/attachments/20100928/4a61b3bc/attachment.gif 


More information about the Devel mailing list