<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <tt>Hi all,<br>
      <br>
      For the sake of completeness, here is the commit that fixes the issue:<br>
         
<a class="moz-txt-link-freetext" href="https://github.com/OpenSIPS/opensips/commit/058cc22cb55dce9b890308b9f83a42a88691f2c8">https://github.com/OpenSIPS/opensips/commit/058cc22cb55dce9b890308b9f83a42a88691f2c8</a><br>
      <br>
      Thank you Yuval for the report and for investigating this!<br>
      <br>
      Best regards,<br>
    </tt>
    <pre class="moz-signature" cols="72">Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
  <a class="moz-txt-link-freetext" href="http://www.opensips-solutions.com">http://www.opensips-solutions.com</a>
OpenSIPS Bootcamp 2018
  <a class="moz-txt-link-freetext" href="http://opensips.org/training/OpenSIPS_Bootcamp_2018/">http://opensips.org/training/OpenSIPS_Bootcamp_2018/</a>
</pre>
    <div class="moz-cite-prefix">On 07/12/2018 04:07 PM, Yuval Dinari
      via Users wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CA+=GTAHYT_U+KdDs_zBRWrg=XkMMLJPx_S5msLmg5egQA6EFOA@mail.gmail.com">
      <div dir="ltr">Hi,
        <div>OpenSIPS gets into an unrecoverable bad state in which
          some of the TCP child processes are stuck waiting to acquire
          a lock that they never get.</div>
        <div>The issue occurs in the following load test scenario:</div>
        <div>
          <ol
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">
            <li style="margin-left:15px">About 25K clients register
              over TCP (the issue also happens with fewer)<br>
            </li>
            <li style="margin-left:15px">All the TCP connections become
              unresponsive (by blocking outgoing traffic on the test
              clients' machine)</li>
            <li style="margin-left:15px">INVITEs are sent for each of
              those clients, putting their connections in retransmit mode</li>
            <li style="margin-left:15px">After a few minutes OpenSIPS
              gets into a bad state: some TCP children run at 90-100%
              CPU, and no traffic is sent from the machine (including
              OPTIONS pings)</li>
            <li style="margin-left:15px">After all the TCP connections
              die due to timeouts, OpenSIPS does not recover; the
              symptoms above persist</li>
            <li style="margin-left:15px">After all the registered users
              are removed from the internal table, there is still no change</li>
          </ol>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">
            <div>When we attach a debugger to the problematic processes
              (those with high CPU usage), we see that they are all stuck
              trying to get a lock that they never seem to get. Stack traces:</div>
          </div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><br>
          </div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">
            <div><font color="#0000ff">#0  0x00007fd6b72d1bb7 in
                sched_yield () at ../sysdeps/unix/syscall-template.S:81</font></div>
            <div><font color="#0000ff">#1  0x0000000000549e65 in
                get_lock (lock=&lt;optimized out&gt;) at
                net/proto_tcp/../../net/../fastlock.h:221</font></div>
            <div><font color="#0000ff">#2  _tcp_write_on_socket
                (len=&lt;optimized out&gt;, buf=&lt;optimized out&gt;,
                fd=&lt;optimized out&gt;, c=&lt;optimized out&gt;) at
                net/proto_tcp/proto_tcp.c:724</font></div>
            <div><font color="#0000ff">#3  proto_tcp_send
                (send_sock=0x7ffd8e12c140, buf=0x0, len=399,
                to=0x7fd5c7ccdcc0, id=1) at
                net/proto_tcp/proto_tcp.c:922</font></div>
            <div><font color="#0000ff">#4  0x00007fd5a5cb7b30 in
                msg_send (msg=&lt;optimized out&gt;, len=&lt;optimized
                out&gt;, buf=&lt;optimized out&gt;, id=&lt;optimized
                out&gt;, to=&lt;optimized out&gt;, proto=&lt;optimized
                out&gt;, </font></div>
            <div><font color="#0000ff">    send_sock=0x7fd6a7208168) at
                ../../forward.h:123</font></div>
            <div><font color="#0000ff">#5  send_pr_buffer
                (rb=0x7fd5c7ccdca0, buf=0x7fd6a76b4a50, len=0,
                ctx=0xffffffffffffffff) at t_funcs.c:66</font></div>
          </div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><br>
          </div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">And:</div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><br>
          </div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">
            <div><font color="#0000ff">#0  0x00007fd6b72d1bb7 in
                sched_yield () at ../sysdeps/unix/syscall-template.S:81</font></div>
            <div><font color="#0000ff">#1  0x00000000005349b8 in
                get_lock (lock=&lt;optimized out&gt;) at
                net/../fastlock.h:221</font></div>
            <div><font color="#0000ff">#2  handle_io
                (event_type=&lt;optimized out&gt;, idx=&lt;optimized
                out&gt;, fm=&lt;optimized out&gt;) at
                net/net_tcp_proc.c:210</font></div>
            <div><font color="#0000ff">#3  io_wait_loop_epoll
                (repeat=287, t=&lt;optimized out&gt;, h=&lt;optimized
                out&gt;) at net/../io_wait_loop.h:280</font></div>
          </div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><font
              color="#0000ff"><br>
            </font></div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><font
              color="#000000">These traces look the same every time we
              attach.</font></div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><font
              color="#000000">The machine OpenSIPS runs on has 4 CPUs.</font></div>
          <div
style="font-size:small;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><font
              color="#000000">Thanks</font></div>
          <br>
        </div>
      </div>
      <br>
    </blockquote>
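    <div>Both quoted stack traces end in sched_yield () called from
      get_lock (). That is consistent with a lock that retries and yields
      the CPU between attempts, which would also explain the 90-100% CPU
      usage if the lock is never released. The sketch below only
      illustrates that general pattern in Python; it is not the OpenSIPS
      code, and spin_acquire and max_spins are hypothetical names (the cap
      exists only so the demo terminates).</div>
<pre>
```python
import os
import threading


def spin_acquire(lock, max_spins):
    """Retry a non-blocking acquire, yielding the CPU between attempts.

    Returns the number of failed attempts before success, or None if all
    max_spins attempts failed. A real spin lock has no such cap and would
    keep the process busy for as long as the lock stays held.
    """
    for failed in range(max_spins):
        if lock.acquire(blocking=False):
            return failed
        # Give other runnable tasks a turn, then retry (Unix only).
        os.sched_yield()
    return None


if __name__ == "__main__":
    lock = threading.Lock()
    lock.acquire()                    # simulate a holder that never releases
    print(spin_acquire(lock, 10000))  # None: every attempt failed
    lock.release()
    print(spin_acquire(lock, 10000))  # 0: acquired on the first attempt
```
</pre>
    <div>A waiter in such a loop never sleeps, so a debugger attached at
      any moment would most likely find it inside the yield call, as in
      the traces.</div>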
    <br>
  </body>
</html>