<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Exchange Server">
<!-- converted from rtf -->
<style><!-- .EmailQuote { margin-left: 1pt; padding-left: 4pt; border-left: #800000 2px solid; } --></style>
</head>
<body>
<font face="Calibri" size="2"><span style="font-size:11pt;">
<div>Hi Liviu, </div>
<div> </div>
<div>I took the latest 2.2 version after you pushed the timeout fix and ran the same load test. I did not observe any crash.</div>
<div> </div>
<div>- 10,000 calls is a nice number, but what was your CPS rate? Also, how many OpenSIPS processes did you run with?</div>
<div><font color="red">My CPS rate was 55 cps for 100K calls. There are 13 processes in total: 8 children processes and 5 helper processes.</font></div>
<div> </div>
<div>- I noticed you enabled QM_MALLOC, which is good, but did you also enable DBG_MALLOC for it? This would help catch any memory corruption as early as possible.</div>
<div><font color="red">I will enable this for future testing. </font></div>
<div> </div>
<div>If I get the green light from my management, I will contribute the code for REST_PUT. Can you share the process for contributing code?</div>
<div> </div>
<div><font color="red">Out of the 2 times I tested, I observed the issue below once. Previously, I saw it in every test.</font></div>
<div> </div>
<ol style="margin:0;padding-left:36pt;">
<li>Tried to load 100,000 calls, but route[resume_http] was called for only 99,986 calls.</li></ol>
<div style="text-indent:18pt;">Each time, route[resume_http] is not called for approximately 10-20 calls. However, the tcpdump shows 100,000 HTTP requests and 100,000 HTTP 200 OK responses.</div>
<div style="text-indent:18pt;">When I print the response in resume_http for every Call-ID, the response is not printed for 10-20 calls, which means the resume route is not called for those calls.</div>
<div style="text-indent:18pt;">I am not filtering on any response code.</div>
<div style="text-indent:18pt;"> </div>
<div>Any clue on this one?</div>
<div> </div>
<div>Regards,<br>
Agalya</div>
<div> </div>
<div>-----Original Message-----<br>
From: Liviu Chircu [<a href="mailto:liviu@opensips.org">mailto:liviu@opensips.org</a>]
<br>
Sent: Wednesday, October 12, 2016 5:22 AM<br>
To: Ramachandran, Agalya (Contractor) <Agalya_Ramachandran@comcast.com>; OpenSIPS users mailling list <users@lists.opensips.org><br>
Subject: Re: [OpenSIPS-Users] Pending OpenSIPS minor releases: Last minute bug fixes!</div>
<div> </div>
<div>Hi, Agalya!</div>
<div> </div>
<div>As you have already noticed, the timeout fix has made its way to 2.2, so from my perspective, all should be good now regarding the previous issues.</div>
<div> </div>
<div>Your tests are very interesting, but I need more info in order to isolate the cause of the crash. Here are some things to consider:</div>
<div> </div>
<div>- first and foremost: keep in mind that it's difficult to help debug code that I cannot see (REST PUT). If you were to make it crash while doing 10,000 REST POSTs, I would be able to jump in and attempt to replicate the crash.</div>
<div> </div>
<div>- do your workers have enough PKG memory? (use "opensipsctl fifo get_statistics pkmem:" during runtime to find out)</div>
<div> </div>
<div>- 10,000 calls is a nice number, but what was your CPS rate? Also, how many OpenSIPS processes did you run with?</div>
<div> </div>
<div>- I noticed you enabled QM_MALLOC, which is good, but did you also enable DBG_MALLOC for it? This would help catch any memory corruption as early as possible.</div>
<div> </div>
<div>Best regards,</div>
<div> </div>
<div>Liviu Chircu</div>
<div>OpenSIPS Developer</div>
<div><a href="http://www.opensips-solutions.com">http://www.opensips-solutions.com</a></div>
<div> </div>
<div>On 10.10.2016 20:09, Ramachandran, Agalya (Contractor) wrote:</div>
<div>> Hi Liviu/team,</div>
<div>></div>
<div>> There is one item pending from my mail thread: the OpenSIPS crash during load testing.</div>
<div>> Please find the details of that message below.</div>
<div>></div>
<div>> Hi Bogdan,</div>
<div>></div>
<div>> When I tried to apply the patch, I saw that all of the code is already available in 2.2, so I did not apply any patch.</div>
<div>> However, I added code for the 'REST_PUT' request in rest_client.c, </div>
<div>> rest_methods.c and rest_methods.h, then ran a load test. I observed 2 issues.</div>
<div>></div>
<div>> 1) Tried to load 10,000 calls, but route[resume_http] was called for only 9,985 calls.</div>
<div>> Each time, route[resume_http] is not called for approximately 10-20 calls. However, the tcpdump shows 10,000 HTTP requests and 10,000 HTTP 200 OK responses.</div>
<div>> When I print the response in resume_http for every Call-ID, the response is not printed for 10-20 calls, which means the resume route is not called for those calls.</div>
<div>> I am not filtering on any response code.</div>
<div>> route[resume_http] {</div>
<div>> xlog("L_INFO","Resp $rc in HTTP PUT!\n");</div>
<div>> xlog("L_INFO","route[relay] The content received from SM for $rm: [callId=$ci] : $var(body) in HTTP PUT\n");</div>
<div>> .....................</div>
<div>> }</div>
<div>> For the first few calls, $rc is 1 (ASYNC_DONE); for the others, $rc is </div>
<div>> -5 (ASYNC_SYNC)</div>
<div>></div>
<div>> 2) Tried to load more than 10,000 calls. At some point after it reaches 25,000 calls, OpenSIPS crashes.</div>
<div>> I have two core dumps generated so far.</div>
<div>> Coredump1:</div>
<div>> --------------</div>
<div>> (gdb) bt</div>
<div>> #0 0x00007f8defacc840 in Curl_num_addresses () from </div>
<div>> /lib64/libcurl.so.4</div>
<div>> #1 0x00007f8defaf09ee in Curl_connecthost () from /lib64/libcurl.so.4</div>
<div>> #2 0x00007f8defae34ff in Curl_setup_conn () from /lib64/libcurl.so.4</div>
<div>> #3 0x00007f8defae370c in Curl_connect () from /lib64/libcurl.so.4</div>
<div>> #4 0x00007f8defaf32c0 in multi_runsingle () from /lib64/libcurl.so.4</div>
<div>> #5 0x00007f8defaf4121 in curl_multi_perform () from </div>
<div>> /lib64/libcurl.so.4</div>
<div>> #6 0x00007f8defd2b65b in start_async_http_req (msg=msg@entry=0x7f8df3740fc0, method=method@entry=REST_CLIENT_PUT,</div>
<div>> url=0x7f8df36ceb88 "<a href="http://test.comcast.net/RTCGSessionManager/rest/tel/session/createroom?">http://test.comcast.net/RTCGSessionManager/rest/tel/session/createroom?</a>", req_body=<optimized out>,</div>
<div>> req_ctype=<optimized out>, out_handle=out_handle@entry=0x7f8df3743300, body=body@entry=0x7f8df3743308, ctype=0x7f8df3743318)</div>
<div>> at rest_methods.c:265</div>
<div>> #7 0x00007f8defd31b56 in w_async_rest_put (msg=0x7f8df3740fc0, resume_f=0x7ffd4afd8c90, resume_param=0x7ffd4afd8c88,</div>
<div>> gp_url=<optimized out>, gp_body=<optimized out>, gp_ctype=<optimized out>, body_pv=0x7f8df36f85a8 "N",</div>
<div>> ctype_pv=0x7f8df36f8700 "N", code_pv=0x7f8df36f88c0 "N") at </div>
<div>> rest_client.c:544</div>
<div>> #8 0x00007f8df1031841 in t_handle_async (msg=0x7f8df3740fc0, </div>
<div>> a=0x7f8df36cf0c8, resume_route=2) at async.c:240</div>
<div>> #9 0x0000000000424056 in do_action (a=0x7f8df36cf2d0, </div>
<div>> msg=0x7f8df3740fc0) at action.c:1863</div>
<div>> #10 0x000000000041c330 in run_action_list (a=0x7f8df36cf2d0, </div>
<div>> msg=0x7f8df3740fc0) at action.c:172</div>
<div>> #11 0x0000000000420637 in do_action (a=0x7f8df36cf3f8, </div>
<div>> msg=0x7f8df3740fc0) at action.c:1108</div>
<div>> #12 0x000000000041c330 in run_action_list (a=0x7f8df36ce568, </div>
<div>> msg=0x7f8df3740fc0) at action.c:172</div>
<div>> #13 0x000000000041c1fd in run_actions (a=0x7f8df36ce568, </div>
<div>> msg=0x7f8df3740fc0) at action.c:137</div>
<div>> #14 0x000000000041ed55 in do_action (a=0x7f8df36ca080, </div>
<div>> msg=0x7f8df3740fc0) at action.c:745</div>
<div>> #15 0x000000000041c330 in run_action_list (a=0x7f8df36c9e78, </div>
<div>> msg=0x7f8df3740fc0) at action.c:172</div>
<div>> #16 0x0000000000420637 in do_action (a=0x7f8df36ca1a8, </div>
<div>> msg=0x7f8df3740fc0) at action.c:1108</div>
<div>> #17 0x000000000041c330 in run_action_list (a=0x7f8df36c2560, </div>
<div>> msg=0x7f8df3740fc0) at action.c:172</div>
<div>> #18 0x000000000041c1fd in run_actions (a=0x7f8df36c2560, </div>
<div>> msg=0x7f8df3740fc0) at action.c:137</div>
<div>> #19 0x000000000041c3fd in run_top_route (a=0x7f8df36c2560, </div>
<div>> msg=0x7f8df3740fc0) at action.c:204</div>
<div>> #20 0x000000000042bb64 in receive_msg (</div>
<div>> buf=0x7eb340 <buf.8031> "INVITE <a href="sip:++12001000004@10.0.0.1:5060">sip:++12001000004@10.0.0.1:5060</a> SIP/2.0\r\nTo: sut <<a href="sip:+19086774567@10.0.0.0.1:5060;user=phone">sip:+19086774567@10.0.0.0.1:5060;user=phone</a>>\r\nFrom: sipp
<<a href="sip:sipp@10.0.0.2:5060;user=phone">sip:sipp@10.0.0.2:5060;user=phone</a>>;tag=26939SIPpTag0014404\r\nCall-ID: 14"..., len=837,</div>
<div>> rcv_info=0x7ffd4afdae00, existing_context=0x0) at receive.c:208</div>
<div>> #21 0x0000000000521838 in udp_read_req (si=0x7f8df36bda50, </div>
<div>> bytes_read=0x7ffd4afdaec8) at net/proto_udp/proto_udp.c:192</div>
<div>> #22 0x000000000050d05b in handle_io (fm=0x7f8df371e128, idx=0, </div>
<div>> event_type=1) at net/net_udp.c:259</div>
<div>> #23 0x000000000050ba2b in io_wait_loop_epoll (h=0x80be40 <_worker_io>, </div>
<div>> t=1, repeat=0) at net/../io_wait_loop.h:221</div>
<div>> #24 0x000000000050d3db in udp_rcv_loop (si=0x7f8df36bda50) at </div>
<div>> net/net_udp.c:311</div>
<div>> #25 0x000000000050d973 in udp_start_processes (chd_rank=0x7d8128 </div>
<div>> <chd_rank.10725>, startup_done=0x0) at net/net_udp.c:375</div>
<div>> #26 0x0000000000495241 in main_loop () at main.c:671</div>
<div>> #27 0x0000000000497cec in main (argc=3, argv=0x7ffd4afdb1c8) at </div>
<div>> main.c:1271</div>
<div>></div>
<div>> Coredump2:</div>
<div>> --------------</div>
<div>> (gdb) bt</div>
<div>> #0 0x00000000004affeb in qm_detach_free (qm=0x7f5aa3ff4010, </div>
<div>> frag=0x7f5aa40c50f0) at mem/q_malloc.c:281</div>
<div>> #1 0x00000000004b1233 in qm_free (qm=0x7f5aa3ff4010, p=0x7f5aa40c50b0, file=0x7f5aa069fbf5 "rest_client.c",</div>
<div>> func=0x7f5aa06a01fd <__FUNCTION__.8581> "osips_free", line=205) </div>
<div>> at mem/q_malloc.c:494</div>
<div>> #2 0x00007f5aa04774f6 in curl_thread_create_thunk () from </div>
<div>> /lib64/libcurl.so.4</div>
<div>> #3 0x00007f5a9edb9dc5 in start_thread () from /lib64/libpthread.so.0</div>
<div>> #4 0x00007f5aa42eb28d in clone () from /lib64/libc.so.6</div>
<div>></div>
<div>> Regards,</div>
<div>> Agalya</div>
<div>></div>
<div>></div>
<div>></div>
<div>> -----Original Message-----</div>
<div>> From: <a href="mailto:users-bounces@lists.opensips.org">users-bounces@lists.opensips.org</a> </div>
<div>> [<a href="mailto:users-bounces@lists.opensips.org">mailto:users-bounces@lists.opensips.org</a>] On Behalf Of Liviu Chircu</div>
<div>> Sent: Monday, October 10, 2016 4:53 AM</div>
<div>> To: OpenSIPS devel mailling list <<a href="mailto:devel@lists.opensips.org">devel@lists.opensips.org</a>>; OpenSIPS </div>
<div>> users mailling list <<a href="mailto:users@lists.opensips.org">users@lists.opensips.org</a>></div>
<div>> Subject: [OpenSIPS-Users] Pending OpenSIPS minor releases: Last minute bug fixes!</div>
<div>></div>
<div>> Hi, everyone!</div>
<div>></div>
<div>> A new series of OpenSIPS minor releases is planned for next week, October 19th.</div>
<div>></div>
<div>> If you have any pending GitHub issues / mailing list bug-threads in OpenSIPS 1.11, 2.1 or 2.2 which are still not resolved, this would be a good time to bump them!</div>
<div>></div>
<div>> Best regards,</div>
<div>></div>
<div>> --</div>
<div>> Liviu Chircu</div>
<div>> OpenSIPS Developer</div>
<div>> <a href="http://www.opensips-solutions.com">http://www.opensips-solutions.com</a></div>
<div>></div>
<div>></div>
<div>> _______________________________________________</div>
<div>> Users mailing list</div>
<div>> <a href="mailto:Users@lists.opensips.org">Users@lists.opensips.org</a></div>
<div>> <a href="http://lists.opensips.org/cgi-bin/mailman/listinfo/users">http://lists.opensips.org/cgi-bin/mailman/listinfo/users</a></div>
<div>></div>
</span></font>
</body>
</html>