[OpenSIPS-Users] Critical:core:anchor_lump: offset exceeds message size (1033 > 1000), aborting -- exited by signal 6

Mon Mar 23 21:55:51 CET 2009

Yeah, unfortunately Nabble mangled the formatting by leaving all the HTML
tags in ... Here are the contents:

-----------------------------------------------------------------

Hello all,

We're running OpenSIPS in a production environment with about 10,000 users.
We're currently running in a Registrar/Stateless proxy configuration --
very basic setup.

Anyways, to the point.  Currently, at random points throughout the day,
opensips will stop running and the following will show up in the log:

Mar 23 15:04:35 serbox2 /sbin/opensips[19181]:
ERROR:registrar:update_contacts: invalid cseq for aor <70154>
Mar 23 15:04:35 serbox2 /sbin/opensips[19178]:
ERROR:registrar:update_contacts: invalid cseq for aor <vh27126>
Mar 23 15:04:35 serbox2 /sbin/opensips[19180]:
ERROR:registrar:update_contacts: invalid cseq for aor <70154>
Mar 23 15:04:36 serbox2 /sbin/opensips[19181]: CRITICAL:core:anchor_lump:
offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:36 serbox2 /sbin/opensips[19178]: CRITICAL:core:anchor_lump:
offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:40 serbox2 /sbin/opensips[19180]: CRITICAL:core:anchor_lump:
offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:48 serbox2 /sbin/opensips[19177]: INFO:core:handle_sigs: child
process 19181 exited by a signal 6
Mar 23 15:04:48 serbox2 /sbin/opensips[19177]: INFO:core:handle_sigs: core
was generated
Mar 23 15:04:48 serbox2 /sbin/opensips[19177]: INFO:core:handle_sigs:
terminating due to SIGCHLD
Mar 23 15:04:48 serbox2 /sbin/opensips[19179]: INFO:core:sig_usr: signal 15
received
Mar 23 15:04:48 serbox2 /sbin/opensips[19182]: INFO:core:sig_usr: signal 15
received
Mar 23 15:04:48 serbox2 /sbin/opensips[19183]: INFO:core:sig_usr: signal 15
received
Mar 23 15:05:24 serbox2 opensips: WARNING:core:fix_socket_list: could not
rev. resolve XXX.XXX.XXX.XXX
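
For anyone who wants to pull these aborts out of a larger syslog file, here
is a minimal Python sketch.  It assumes the log lines look exactly like the
ones above (including the wrap after "anchor_lump:" that my mail client
introduced); the function name anchor_lump_events is just something I made
up for illustration.

```python
import re

# Matches the CRITICAL line even when it has been wrapped across two lines,
# because \s+ also matches the newline.
ANCHOR_RE = re.compile(
    r"opensips\[(?P<pid>\d+)\]:\s+CRITICAL:core:anchor_lump:\s+"
    r"offset exceeds message size \((?P<offset>\d+) > (?P<size>\d+)\)"
)

def anchor_lump_events(log_text):
    """Return a (pid, offset, message_size) tuple for each anchor_lump abort."""
    return [
        (int(m["pid"]), int(m["offset"]), int(m["size"]))
        for m in ANCHOR_RE.finditer(log_text)
    ]

# Two of the entries from the log above, used as sample input.
sample = """Mar 23 15:04:36 serbox2 /sbin/opensips[19181]: CRITICAL:core:anchor_lump:
offset exceeds message size (1033 > 1000) aborting...
Mar 23 15:04:36 serbox2 /sbin/opensips[19178]: CRITICAL:core:anchor_lump:
offset exceeds message size (1033 > 1000) aborting...
"""
events = anchor_lump_events(sample)
```

Grouping the result by pid shows which child processes hit the abort and
whether the offset and message size are always the same pair.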

I'm not certain that this is a route script issue.  It seems like
something is trying to parse a message header that isn't present, or
something along those lines.  I'm trying to track this issue down, but would
like a few pointers as to where to look.

First, we've currently set the number of forks at 6 and the private memory
per process to 32 MB.  We're also using 512 MB of "shared" memory.

A) Are there any recommendations as to memory requirements for a
subscriber base of this size?  Again, we process somewhere around 2 million
registrations alone in an hour, and just statelessly forward requests unless
they're destined for a UAC, in which case we do a location lookup and then
forward to the handset.
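
To put those numbers in perspective, here is the back-of-the-envelope
arithmetic as a small Python sketch.  The only inputs are the figures quoted
in this mail (10,000 users, 2 million registrations an hour, 6 forks); it
assumes the load is spread evenly over users and child processes, which is
certainly not exactly true.

```python
# Figures quoted in this post; everything else is derived from them.
users = 10_000
registrations_per_hour = 2_000_000
children = 6

per_second = registrations_per_hour / 3600           # average REGISTERs per second
per_user_per_hour = registrations_per_hour / users   # REGISTERs per user per hour
seconds_between = 3600 / per_user_per_hour           # average interval per user
per_child = per_second / children                    # average load per child process

print(f"{per_second:.0f} REGISTER/s overall, {per_child:.1f} per child")
print(f"each user registers about every {seconds_between:.0f} s")
```

So on average that is roughly 556 REGISTERs a second, with each user
registering about every 18 seconds.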

I haven't really found any core statistics or guidelines for setting the
memory parameters.  If there is any better documentation on this, I'd greatly
appreciate a link.

B) Here's the basic contents of our route script (where the IP addresses
have been changed to *.domain for security purposes):

route
{
        if (!mf_process_maxfwd_header("10"))
        {
                sl_send_reply("483","Too Many Hops");
                exit;
        }

        if (!is_method("REGISTER|MESSAGE"))
        {
                if(nat_uac_test("19"))
                {
                        record_route(";nat=yes");
                }
                else
                {
                        record_route();
                }
        }

        if (is_method("REGISTER"))
        {
                if (!search("^Contact:\ +\*"))
                {
        #               xlog("Contact fix $ru\n");
                        fix_nated_register();
                        force_rport();
                }

                if (!save("location"))
                        sl_reply_error();
                exit;
        }

        if(src_ip == "sbc.domain")
        {
                rewritehost("ser.domain");

                if(!lookup("location"))
                {
                        sl_send_reply("404", "Not Found");
                        exit;
                }

                route(1);
                exit;
        }

        else
        {
                if(is_method("NOTIFY"))
                {
                        sl_send_reply("404", "Not Found");
                        exit;
                }

                rewritehost("sbc.domain");
                route(1);
                exit;
        }
}

# ---------  route[1] block  -------------
# Handle NAT-related issues on INVITES
#

route[1]
{
        #xlog("Beginning route $ru\n");
        if (is_method("INVITE"))
        {
                force_rport();
                fix_nated_contact();
                fix_nated_sdp("10");
        #       xlog("Routing 1 inside $ru\n");
                forward();
                exit;
        }
        force_rport();
        fix_nated_contact();

        #       xlog("Forward route $ru\n");

        if (!forward())
        {
               sl_reply_error();
        };
}

Is there a particular point where, for example, we're trying to fix a header
that might not exist, which could cause this issue?

As an FYI, we're currently running OpenSIPS version 1.4.2, and are
considering a production release of 1.4.5 tonight to see if any of these
particular issues are fixed.  I also see that 1.5 has been released.  Is
this a production-worthy release?

Lastly, we see this a lot in the logs:

Mar 23 16:47:35 serbox2 /sbin/opensips[24586]:
ERROR:registrar:update_contacts: invalid cseq for aor <vh13669>
Mar 23 16:47:36 serbox2 /sbin/opensips[24586]:
ERROR:registrar:update_contacts: invalid cseq for aor <vh13669>

I've read something about Polycom user agents attempting to re-register with
the same Call-ID but incrementing the CSeq, and I think I've also read
that a version after 1.4.2 fixed the order in which this is looked up.
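
For reference, my understanding of the CSeq rule for REGISTER in RFC 3261
(section 10.3) is sketched below in Python.  This is only an illustration of
the rule as I read it, not the registrar module's actual code; the function
name accept_register and the (call_id, cseq) tuple are hypothetical.

```python
def accept_register(stored, call_id, cseq):
    """Decide whether a REGISTER may update an existing binding.

    stored is a hypothetical (call_id, cseq) tuple describing the binding
    currently held for the contact, or None if there is no binding yet.
    """
    if stored is None:
        return True               # no binding yet, nothing to compare against
    stored_call_id, stored_cseq = stored
    if call_id != stored_call_id:
        return True               # different Call-ID: CSeq is not compared
    return cseq > stored_cseq     # same Call-ID: CSeq must be strictly higher

# Same Call-ID with a CSeq that is not higher is refused.
in_order = accept_register(("abc", 5), "abc", 6)
stale = accept_register(("abc", 5), "abc", 5)
```

If that reading is right, an "invalid cseq" error would mean a REGISTER
arrived with the stored Call-ID and a CSeq that was not higher than the one
already saved, for example when two requests are processed out of order.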

Thanks very much for any help.

On Mon, Mar 23, 2009 at 4:53 PM, Iñaki Baz Castillo <ibc at aliax.net> wrote:

> Please note how your mail looks:
>
> On Monday, 23 March 2009, fabio4prez wrote:
> > Hello all,
> > <br><br>We're running Opensips in a production environment with about
> > 10,000 users. &nbsp;We're currently running in a Registrar/Stateless
> proxy
> > configuration -- very basic setup. <br><br>Anyways, to the point.
> > &nbsp;Currently, at random points throughout the day, opensips will stop
> > running and the following will show up in the log: <br><br>Mar 23
> 15:04:35
> > serbox2 /sbin/opensips[19181]: ERROR:registrar:update_contacts: invalid
> > cseq for aor &lt;70154&gt; <br>Mar 23 15:04:35 serbox2
> > /sbin/opensips[19178]: ERROR:registrar:update_contacts: invalid cseq for
> > aor &lt;vh27126&gt; <br>Mar 23 15:04:35 serbox2 /sbin/opensips[19180]:
> > ERROR:registrar:update_contacts: invalid cseq for aor &lt;70154&gt;
> <br>Mar
> > 23 15:04:36 serbox2 /sbin/opensips[19181]: CRITICAL:core:anchor_lump:
> > offset exceeds message size (1033 &gt; 1000) aborting... <br>Mar 23
> > 15:04:36 serbox2 /sbin/opensips[19178]: CRITICAL:core:anchor_lump: offset
> > exceeds message size (1033 &gt; 1000) aborting... <br>Mar 23 15:04:40
>
> [...]
>
> Your mail has "Content-Type: text/plain;" but the text obviously contains
> HTML tags.
> Please don't use HTML, just plain text.
>
> Regards.
>
>
>
> --
> Iñaki Baz Castillo
>