[OpenSIPS-Users] Critical:core:anchor_lump: offset exceeds message size (1033 > 1000), aborting -- exited by signal 6

Mon Mar 23 21:50:07 CET 2009

Hello all,

We're running OpenSIPS in a production environment with about 10,000 users. We currently run a Registrar/Stateless proxy configuration, a very basic setup.

To the point: at random times throughout the day, OpenSIPS stops running and the following shows up in the log:

Mar 23 15:04:35 serbox2 /sbin/opensips[19181]: ERROR:registrar:update_contacts: invalid cseq for aor <70154>
Mar 23 15:04:35 serbox2 /sbin/opensips[19178]: ERROR:registrar:update_contacts: invalid cseq for aor <vh27126>
Mar 23 15:04:35 serbox2 /sbin/opensips[19180]: ERROR:registrar:update_contacts: invalid cseq for aor <70154>
Mar 23 15:04:36 serbox2 /sbin/opensips[19181]: CRITICAL:core:anchor_lump: offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:36 serbox2 /sbin/opensips[19178]: CRITICAL:core:anchor_lump: offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:40 serbox2 /sbin/opensips[19180]: CRITICAL:core:anchor_lump: offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:48 serbox2 /sbin/opensips[19177]: INFO:core:handle_sigs: child process 19181 exited by a signal 6
Mar 23 15:04:48 serbox2 /sbin/opensips[19177]: INFO:core:handle_sigs: core was generated
Mar 23 15:04:48 serbox2 /sbin/opensips[19177]: INFO:core:handle_sigs: terminating due to SIGCHLD
Mar 23 15:04:48 serbox2 /sbin/opensips[19179]: INFO:core:sig_usr: signal 15 received
Mar 23 15:04:48 serbox2 /sbin/opensips[19182]: INFO:core:sig_usr: signal 15 received
Mar 23 15:04:48 serbox2 /sbin/opensips[19183]: INFO:core:sig_usr: signal 15 received
Mar 23 15:05:24 serbox2 opensips: WARNING:core:fix_socket_list: could not rev. resolve XXX.XXX.XXX.XXX
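
To make the pattern in this excerpt easier to see, here is a small Python sketch that pulls the offset/size pair and the process IDs out of the log lines. It is only a log-reading aid, not part of OpenSIPS; the regular expressions and the `summarize` helper are assumptions written for this example, and the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Sample lines copied from the log excerpt in this post.
LOG = """\
Mar 23 15:04:36 serbox2 /sbin/opensips[19181]: CRITICAL:core:anchor_lump: offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:36 serbox2 /sbin/opensips[19178]: CRITICAL:core:anchor_lump: offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:40 serbox2 /sbin/opensips[19180]: CRITICAL:core:anchor_lump: offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:48 serbox2 /sbin/opensips[19177]: INFO:core:handle_sigs: child process 19181 exited by a signal 6
"""

# Hypothetical patterns, written only to match the lines shown above.
ANCHOR_RE = re.compile(
    r"opensips\[(?P<pid>\d+)\]: CRITICAL:core:anchor_lump: "
    r"offset exceeds message size \((?P<offset>\d+) > (?P<size>\d+)\)"
)
EXIT_RE = re.compile(r"child process (?P<pid>\d+) exited by a signal (?P<sig>\d+)")


def summarize(log_text):
    """Return (count per (offset, size) pair, aborting pids, {pid: signal})."""
    pairs = Counter()
    aborting = []
    exits = {}
    for line in log_text.splitlines():
        m = ANCHOR_RE.search(line)
        if m:
            pairs[(int(m.group("offset")), int(m.group("size")))] += 1
            aborting.append(int(m.group("pid")))
            continue
        m = EXIT_RE.search(line)
        if m:
            exits[int(m.group("pid"))] = int(m.group("sig"))
    return pairs, aborting, exits


pairs, aborting, exits = summarize(LOG)
print(dict(pairs))   # {(1033, 1000): 3}
print(aborting)      # [19181, 19178, 19180]
print(exits)         # {19181: 6}
```

Read this way, three different worker processes report the identical pair (1033 > 1000) within a few seconds, and signal 6 is SIGABRT, which matches the "aborting..." in the CRITICAL line. Only one exit is logged because the main process terminates on the first SIGCHLD.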

I'm not certain that this is a route script issue. It seems as though something is trying to parse a message header that isn't present, or something along those lines. I'm trying to track this issue down, and would like a few pointers on where to look.

First, we've currently set the number of forks to 6 and the private memory per process to 32 MB. We're also using 512 MB of "shared" memory.

A) Are there any recommendations on memory requirements for a subscriber base of this size? Again, we process somewhere around 2 million registrations alone per hour, and simply forward requests statelessly unless they're destined for a UAC, in which case we do a location lookup and then forward to the handset.

I haven't really found any core statistics or guidelines for setting the memory parameters. If there is better documentation on this, I'd greatly appreciate a link.
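
For scale, the figures quoted above work out as in the short Python calculation below. It is plain arithmetic on the numbers in this post and makes two simplifying assumptions: registrations are spread evenly across the 6 forked children, and only those 6 children are counted toward private memory (any additional helper processes are ignored).

```python
# Figures quoted in the post.
USERS = 10_000
REGISTRATIONS_PER_HOUR = 2_000_000
CHILDREN = 6                   # number of forks
PRIVATE_MB_PER_PROCESS = 32    # private memory per process
SHARED_MB = 512                # shared memory

regs_per_second = REGISTRATIONS_PER_HOUR / 3600
# Assumption: load is spread evenly across the children.
regs_per_child_per_second = regs_per_second / CHILDREN

regs_per_user_per_hour = REGISTRATIONS_PER_HOUR / USERS
seconds_between_registrations = 3600 / regs_per_user_per_hour

# Assumption: only the 6 children are counted for private memory.
total_mb = CHILDREN * PRIVATE_MB_PER_PROCESS + SHARED_MB

print(f"{regs_per_second:.1f} REGISTER/s overall")                   # 555.6
print(f"{regs_per_child_per_second:.1f} REGISTER/s per child")       # 92.6
print(f"{regs_per_user_per_hour:.0f} REGISTER/hour per user")        # 200
print(f"one REGISTER every {seconds_between_registrations:.0f} s per user")  # 18
print(f"{total_mb} MB configured in total")                          # 704
```

In other words, each user re-registers roughly every 18 seconds on average, which is consistent with short registration intervals being used for NAT keepalive.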

B) Here are the basic contents of our route script (the IP addresses have been changed to *.domain for security purposes):

route
{
        if (!mf_process_maxfwd_header("10"))
        {
                sl_send_reply("483","Too Many Hops");
                exit;
        }

        if (!is_method("REGISTER|MESSAGE"))
        {
                if(nat_uac_test("19"))
                {
                        record_route(";nat=yes");
                }
                else
                {
                        record_route();
                }
        }

        if (is_method("REGISTER"))
        {
                if (!search("^Contact:\ +\*"))
                {
                        # xlog("Contact fix $ru\n");
                        fix_nated_register();
                        force_rport();
                }

                if (!save("location"))
                        sl_reply_error();
                exit;
        }

        if(src_ip == "sbc.domain")
        {
                rewritehost("ser.domain");

                if(!lookup("location"))
                {
                        sl_send_reply("404", "Not Found");
                        exit;
                }

                route(1);
                exit;
        }
        else
        {
                if(is_method("NOTIFY"))
                {
                        sl_send_reply("404", "Not Found");
                        exit;
                }

                rewritehost("sbc.domain");
                route(1);
                exit;
        }
}

# ---------  route[1] block  -------------
# Handle NAT-related issues on INVITES
#

route[1]
{
        # xlog("Beginning route $ru\n");
        if (is_method("INVITE"))
        {
                force_rport();
                fix_nated_contact();
                fix_nated_sdp("10");
                # xlog("Routing 1 inside $ru\n");
                forward();
                exit;
        }

        force_rport();
        fix_nated_contact();

        # xlog("Forward route $ru\n");

        if (!forward())
        {
                sl_reply_error();
        };
}

Is there a particular point where, for example, we're trying to fix a header that might not exist, and that could cause this issue?

As an FYI, we're currently running OpenSIPS version 1.4.2, and are considering a production upgrade to 1.4.5 tonight to see whether any of these issues are fixed. I also see that 1.5 has been released. Is it a production-worthy release?

Lastly, we see this a lot in the logs:

Mar 23 16:47:35 serbox2 /sbin/opensips[24586]: ERROR:registrar:update_contacts: invalid cseq for aor <vh13669>
Mar 23 16:47:36 serbox2 /sbin/opensips[24586]: ERROR:registrar:update_contacts: invalid cseq for aor <vh13669>

I've read something about Polycom user agents attempting to re-register with the same Call-ID but incrementing the CSeq, but I also think I've read that a version after 1.4.2 included a fix for the order in which this is looked up.
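
For context on what an "invalid cseq" rejection usually means, below is a minimal Python sketch of the Call-ID/CSeq comparison that RFC 3261 (section 10.3) describes for registrars. It models the rule in the RFC, not the OpenSIPS registrar code; whether 1.4.2 applies exactly this logic is an assumption, and the `Binding` class, the `accept_register` helper, and the Call-ID strings are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Binding:
    """Stored contact binding for one AOR (hypothetical, simplified)."""
    call_id: str
    cseq: int


def accept_register(stored: Optional[Binding], call_id: str, cseq: int) -> bool:
    """Apply the RFC 3261 section 10.3 Call-ID/CSeq rule to a REGISTER.

    - No existing binding: accept.
    - Different Call-ID from the stored binding: accept.
    - Same Call-ID: accept only if the CSeq is higher than the stored one.
    """
    if stored is None:
        return True
    if stored.call_id != call_id:
        return True
    return cseq > stored.cseq


# Made-up Call-ID values, for illustration only.
stored = Binding(call_id="abc123@phone", cseq=41)

print(accept_register(stored, "abc123@phone", 42))   # True: same Call-ID, higher CSeq
print(accept_register(stored, "abc123@phone", 41))   # False: retransmitted or stale CSeq
print(accept_register(stored, "xyz789@phone", 1))    # True: new Call-ID
```

Under this rule, a phone that keeps its Call-ID and increments the CSeq should be accepted, so repeated rejections would point at retransmissions, out-of-order processing across workers, or a CSeq that does not actually increase.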

Thanks very much for any help.