<div dir="ltr">Sorry, last email truncated by accident:<div><br></div><div>shmem:</div><div><br></div><div><div>[root@registrar2 : mnt]$ opensipsctl fifo get_statistics shmem:</div><div>shmem:total_size = 1073741824</div><div>
shmem:used_size = 102557512</div><div>shmem:real_used_size = 287344592</div><div>shmem:max_used_size = 374231688</div><div>shmem:free_size = 786397232</div><div>shmem:fragments = 266959</div></div><div><br></div><div>Memory dump counting fragments:</div>
<div><br></div><div><div>[root@registrar2 : mnt]$ sort -n opensips-alloc.log | uniq -c | sort -r</div><div> 11610 cfg.lex: addstr(954)</div><div> 6302 route_struct.c: mk_action(106)</div><div> 5232 route_struct.c: mk_elem(72)</div>
<div> 3650 cfg.y: yyparse(1710)</div><div> 2302 route_struct.c: mk_exp(56)</div><div> 783 sr_module.c: register_module(140)</div><div> 522 pvar.c: pv_add_extra(4239)</div><div> 441 map.c: map_get(141)</div><div>
440 map.c: map_get(152)</div><div> 420 cfg.y: yyparse(1286)</div><div> 387 pvar.c: pv_parse_format(3687)</div><div> 361 ip_addr.c: mk_net(56)</div><div> 336 route.c: fix_expr(194)</div><div> 234 transformations.c: parse_transformation(1715)</div>
<div> 174 transformations.c: tr_parse_string(2051)</div><div> 174 transformations.c: tr_parse_string(2040)</div><div> 145 name_alias.c: add_alias(61)</div><div> 145 name_alias.c: add_alias(59)</div><div> 145 cfg.y: mk_listen_id(2886)</div>
<div> 116 timer.c: new_sr_timer(157)</div><div> 116 script_var.c: add_var(61)</div><div> 116 script_var.c: add_var(54)</div><div> 116 mod_fix.c: fixup_regexp(303)</div><div> 116 cfg.y: yyparse(527)</div><div>
116 cfg.y: yyparse(1883)</div><div> 87 script_cb.c: add_callback(60)</div><div> 58 transformations.c: tr_parse_string(2086)</div><div> 58 transformations.c: tr_parse_string(2076)</div><div> 58 pvar.c: new_pv_context(4314)</div>
<div> 58 flags.c: get_flag_id_by_name(144)</div><div> 58 cfg.y: yyparse(509)</div><div> 37 cfg.y: yyparse(2403)</div><div> 29 pvar.c: pv_init_extra_list(4175)</div><div> 29 mod_fix.c: fixup_spve(832)</div>
<div> 29 map.c: map_create(81)</div><div> 29 cfg.lex: oss_push_yy_state(1149)</div><div> 29 cfg.lex: oss_push_yy_state(1090)</div><div> 29 avpops_parse.c: parse_avp_db_scheme(482)</div><div> 29 avpops_parse.c: parse_avp_db_scheme(476)</div>
<div> 29 avpops_parse.c: parse_avp_db_scheme(464)</div><div> 29 avpops_parse.c: parse_avp_db_scheme(404)</div><div> 29 avpops_db.c: avp_add_db_scheme(191)</div><div> 29 avpops_db.c: add_db_url(101)</div><div>
29 avpops.c: fixup_db_avp(286)</div><div> 29 auth_mod.c: generate_random_secret(192)</div><div> 16 udp_server.c: udp_rcv_loop(344)</div><div> 5 mod_fix.c: fixup_pvar(437)</div><div> 2 mi/mi_core.c: init_mi_uptime(64)</div>
<div> 2 cachedb/cachedb.c: cachedb_store_url(50)</div></div><div><br></div><div>So it literally looks like we aren't in an out of memory situation, but these log statements consistently pop up after running around 30000 concurrent calls for a period of about 2+ hours. Also, a graph of the shmem real_used_size during the test (server configured for 1024MB shmem) shows that we never record anything other than about ~275MB of actual memory used:</div>
<div><br></div><div><img src="http://graphite.qa.vocal-dev.com/render/?width=586&height=308&_salt=1380422257.822&target=opensips.statistics.shmem.real_used_size.registrar2"><br></div><div><br></div><div>Fortunately we figured out something else was causing our problems (and opened a ticket about it, around event_rabbitmq), but seeing these memory error messages is alarming because I don't understand what would be causing them or see anything wrong with the server.</div>
<div><br>Can you advise the next steps to try and troubleshoot this?<br><br>Thanks,</div><div><br></div><div>Bobby</div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sat, Sep 28, 2013 at 10:31 PM, Bobby Smith <span dir="ltr"><<a href="mailto:bobby.smith@gmail.com" target="_blank">bobby.smith@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi Vlad, so I followed the advice for doing the memory dump on restart, and got back into a situation where I see regular memory errors in the log:<div>
<br></div><div><div>2013-09-29T02:09:09.822210+00:00 registrar2 /usr/local/opensips/sbin/opensips[25307]: ERROR:tm:sip_msg_cloner: no more share memory</div>
<div>2013-09-29T02:09:09.822233+00:00 registrar2 /usr/local/opensips/sbin/opensips[25307]: ERROR:tm:new_t: out of mem</div><div>2013-09-29T02:09:09.822273+00:00 registrar2 /usr/local/opensips/sbin/opensips[25307]: ERROR:tm:t_newtran: new_t failed</div>
<div>2013-09-29T02:09:09.822286+00:00 registrar2 /usr/local/opensips/sbin/opensips[25303]: ERROR:tm:sip_msg_cloner: no more share memory</div><div>2013-09-29T02:09:09.822325+00:00 registrar2 /usr/local/opensips/sbin/opensips[25303]: ERROR:tm:new_t: out of mem</div>
<div>2013-09-29T02:09:09.822335+00:00 registrar2 /usr/local/opensips/sbin/opensips[25303]: ERROR:tm:t_newtran: new_t failed</div></div><div><br></div><div>Looking at the statistics for shmem:</div><div><br></div><div><br>
</div><div><br></div><div>And the memory dump counting fragments:</div><div><br></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Sep 20, 2013 at 7:43 AM, Vlad Paiu <span dir="ltr"><<a href="mailto:vladpaiu@opensips.org" target="_blank">vladpaiu@opensips.org</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>Hello,<br>
<br>
The log_next_state_dlg warning message means that you have
received a 1xx reply message after the 200OK reply has came in -
it's nothing critical if the communication was via UDP, since the
order of the UDP datagrams is not guaranteed.<br>
<br>
About the memory issues, it seems you've just ran out of shared
memory - the first build_req_buf_from_sip_req error message is
miss-leading, it's actually trying to allocate shared memory. <br>
<br>
In order to further debug this, can you please follow the steps at
<a href="http://www.opensips.org/Documentation/TroubleShooting-OutOfMem" target="_blank">http://www.opensips.org/Documentation/TroubleShooting-OutOfMem</a> and
send us the output of the memory dump ?<br>
<br>
Best Regards,<br>
<pre cols="72">Vlad Paiu
OpenSIPS Developer
<a href="http://www.opensips-solutions.com" target="_blank">http://www.opensips-solutions.com</a>
</pre><div><div>
On 19.09.2013 23:58, Bobby Smith wrote:<br>
</div></div></div>
<blockquote type="cite"><div><div>
<div dir="ltr">Greetings list,
<div><br>
</div>
<div>We're trying to track down some memory issues that we
originally thought were related to rabbitmq, but after
updating to the latest 1.9 I'm seeing a lot of these errors in
the log file:</div>
<div><br>
</div>
<div>
<div>2013-09-19T20:54:40.115582+00:00 registrar2
/usr/local/opensips/sbin/opensips[3916]:
CRITICAL:dialog:log_next_state_dlg: bogus event 2 in state 3
for dlg 0x2acc8b7780b0 [3248:527118168] with clid '<a href="mailto:3388204-13671@10.215.190.98" target="_blank">3388204-13671@10.215.190.98</a>'
and tags '3388204' '13665SIPpTag01563795'</div>
</div>
<div><br>
</div>
<div>I understand what this means (I think, it's around the
order that a 200 OK and ACK are processed), but repeating the
same test on a previous revision doesn't show these messages.</div>
<div>
<br>
</div>
<div>Also, after a short amount of time running the test:</div>
<div><br>
</div>
<div>
<div>2013-09-19T18:02:23.809205+00:00 registrar2
/usr/local/opensips/sbin/opensips[3918]:
ERROR:core:build_req_buf_from_sip_req: out of pkg memory</div>
<div>2013-09-19T18:02:23.809231+00:00 registrar2
/usr/local/opensips/sbin/opensips[3918]:
ERROR:tm:print_uac_request: no more shm_mem</div>
<div>2013-09-19T18:02:23.809242+00:00 registrar2
/usr/local/opensips/sbin/opensips[3917]:
ERROR:core:build_req_buf_from_sip_req: out of pkg memory</div>
<div>2013-09-19T18:02:23.809252+00:00 registrar2
/usr/local/opensips/sbin/opensips[3918]:
ERROR:tm:t_forward_nonack: failure to add branches</div>
<div>2013-09-19T18:02:23.809261+00:00 registrar2
/usr/local/opensips/sbin/opensips[3917]:
ERROR:tm:print_uac_request: no more shm_mem</div>
<div>2013-09-19T18:02:23.809271+00:00 registrar2
/usr/local/opensips/sbin/opensips[3917]:
ERROR:tm:t_forward_nonack: failure to add branches</div>
<div>2013-09-19T18:02:23.809279+00:00 registrar2
/usr/local/opensips/sbin/opensips[3918]:
ERROR:tm:_reply_light: failed to allocate shmem buffer</div>
<div>2013-09-19T18:02:23.809288+00:00 registrar2
/usr/local/opensips/sbin/opensips[3917]:
ERROR:tm:_reply_light: failed to allocate shmem buffer</div>
<div>2013-09-19T18:02:23.809297+00:00 registrar2
/usr/local/opensips/sbin/opensips[3916]: ERROR:tm:new_t: out
of mem</div>
<div>2013-09-19T18:02:23.809306+00:00 registrar2
/usr/local/opensips/sbin/opensips[3916]: ERROR:tm:t_newtran:
new_t failed</div>
<div>2013-09-19T18:02:23.809911+00:00 registrar2
/usr/local/opensips/sbin/opensips[3921]: ERROR:tm:new_t: out
of mem</div>
<div>2013-09-19T18:02:23.809942+00:00 registrar2
/usr/local/opensips/sbin/opensips[3917]: ERROR:tm:new_t: out
of mem</div>
<div>2013-09-19T18:02:23.809970+00:00 registrar2
/usr/local/opensips/sbin/opensips[3917]: ERROR:tm:t_newtran:
new_t failed</div>
<div>2013-09-19T18:02:23.809999+00:00 registrar2
/usr/local/opensips/sbin/opensips[3916]: ERROR:tm:new_t: out
of mem</div>
<div>2013-09-19T18:02:23.810037+00:00 registrar2
/usr/local/opensips/sbin/opensips[3916]: ERROR:tm:t_newtran:
new_t failed</div>
<div>2013-09-19T18:02:23.810068+00:00 registrar2
/usr/local/opensips/sbin/opensips[3921]: ERROR:tm:t_newtran:
new_t failed</div>
<div>2013-09-19T18:02:23.810880+00:00 registrar2
/usr/local/opensips/sbin/opensips[3919]:
ERROR:core:build_req_buf_from_sip_req: out of pkg memory</div>
<div>2013-09-19T18:02:23.810921+00:00 registrar2
/usr/local/opensips/sbin/opensips[3921]:
ERROR:dialog:dlg_add_leg_info: Failed to resize legs array</div>
</div>
<div><br>
</div>
<div>It seems very strange we'd run out of both package and
shared memory at the same time. When I dump statistics when
these messages are propagating in the log, I see:</div>
<div><br>
</div>
<div>According to statistics:</div>
<div><br>
</div>
<div>
<div>shmem:total_size = 1073741824</div>
<div>shmem:used_size = 168525088</div>
<div>shmem:real_used_size = 390522728</div>
<div>shmem:max_used_size = 1060997488</div>
<div>shmem:free_size = 683219096</div>
<div>shmem:fragments = 1106426</div>
</div>
<div><br>
</div>
<div>
<div>pkmem:0-real_used_size = 601136</div>
<div>pkmem:1-real_used_size = 610592</div>
<div>pkmem:2-real_used_size = 50858056</div>
<div>pkmem:3-real_used_size = 610416</div>
<div>pkmem:4-real_used_size = 610416</div>
<div>pkmem:5-real_used_size = 610416</div>
<div>pkmem:6-real_used_size = 610416</div>
<div>pkmem:7-real_used_size = 610416</div>
<div>pkmem:8-real_used_size = 610416</div>
<div>pkmem:9-real_used_size = 610416</div>
<div>pkmem:10-real_used_size = 610416</div>
<div>pkmem:11-real_used_size = 650864</div>
<div>pkmem:12-real_used_size = 654800</div>
<div>pkmem:13-real_used_size = 650944</div>
<div>pkmem:14-real_used_size = 651136</div>
<div>pkmem:15-real_used_size = 650704</div>
<div>pkmem:16-real_used_size = 650888</div>
<div>pkmem:17-real_used_size = 651712</div>
<div>pkmem:18-real_used_size = 651040</div>
<div>pkmem:19-real_used_size = 601136</div>
<div>pkmem:20-real_used_size = 618512</div>
<div>pkmem:21-real_used_size = 669680</div>
<div>pkmem:22-real_used_size = 669680</div>
<div>pkmem:23-real_used_size = 669680</div>
<div>pkmem:24-real_used_size = 669680</div>
<div>pkmem:25-real_used_size = 669680</div>
<div>pkmem:26-real_used_size = 669680</div>
<div>pkmem:27-real_used_size = 669680</div>
<div>pkmem:28-real_used_size = 669680</div>
<div>pkmem:29-real_used_size = 660464</div>
</div>
<div><br>
</div>
<div>And pkmem is configured for 64MB per process.</div>
<div><br>
</div>
<div>Any thoughts? It doesn't seem like transactions are
dropping or anything, we just see these strange issues in the
logs.</div>
<div><br>
</div>
<div>Thanks,</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<br>
<fieldset></fieldset>
<br>
</div></div><pre>_______________________________________________
Users mailing list
<a href="mailto:Users@lists.opensips.org" target="_blank">Users@lists.opensips.org</a>
<a href="http://lists.opensips.org/cgi-bin/mailman/listinfo/users" target="_blank">http://lists.opensips.org/cgi-bin/mailman/listinfo/users</a>
</pre>
</blockquote>
<br>
</div>
<br>_______________________________________________<br>
Users mailing list<br>
<a href="mailto:Users@lists.opensips.org" target="_blank">Users@lists.opensips.org</a><br>
<a href="http://lists.opensips.org/cgi-bin/mailman/listinfo/users" target="_blank">http://lists.opensips.org/cgi-bin/mailman/listinfo/users</a><br>
<br></blockquote></div><br></div>
</div></div></blockquote></div><br></div>