<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <font face="monospace">Hi,<br>
      <br>
      So even with auto-scaling disabled, you still run into the
      TCP-related issues after a while? Do you use TLS in async mode? If
      so, try turning that off.<br>
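      <br>
      A minimal sketch of that change, assuming TLS is handled by the
      proto_tls module and that its tls_async parameter is what enables
      async mode in your setup (please check it against your own config):<br>
      <br>
      # assumption: proto_tls is loaded; 0 switches TLS connect/write to blocking mode<br>
      modparam("proto_tls", "tls_async", 0)<br>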
      <br>
      Regards,<br>
    </font>
    <pre class="moz-signature" cols="72">Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
  <a class="moz-txt-link-freetext" href="https://www.opensips-solutions.com">https://www.opensips-solutions.com</a>
OpenSIPS Summit 27-30 Sept 2022, Athens
  <a class="moz-txt-link-freetext" href="https://www.opensips.org/events/Summit-2022Athens/">https://www.opensips.org/events/Summit-2022Athens/</a></pre>
    <div class="moz-cite-prefix">On 10/12/22 1:36 AM, Yury Kirsanov
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAD1_sevbTqEU4KnCxMou2y3EXm7eW4xY4GeZhqJhLhwYhXYHTw@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">Hi Bogdan,
        <div>Yes, if I enable the autoscaler I immediately run into all
          sorts of issues with TCP. When it's off I'm just getting this
          issue from time to time and I have to restart OpenSIPS in that
          case, even though it's still working - part of the processes
          lock up and consume 100% CPU, but overall the system continues
          to service requests.</div>
        <div><br>
        </div>
        <div><a href="https://github.com/OpenSIPS/opensips/issues/2921"
            moz-do-not-send="true">https://github.com/OpenSIPS/opensips/issues/2921</a><br>
        </div>
        <div><br>
        </div>
        <div>Best regards,</div>
        <div>Yury.</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Tue, Oct 11, 2022 at 10:59
          PM Bogdan-Andrei Iancu &lt;<a
            href="mailto:bogdan@opensips.org" moz-do-not-send="true">bogdan@opensips.org</a>&gt;
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div> <font face="monospace">Hi Yury,<br>
              <br>
              Is this still an issue?<br>
              <br>
              Regards,<br>
            </font>
            <pre cols="72">Bogdan-Andrei Iancu</pre>
            <div>On 9/15/22 5:26 PM, Yury Kirsanov wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">Hi Bogdan,
                <div>Looks like I'm running into issues with TCP
                  and autoscaling again. After a clean start, within
                  about 5-10 minutes of an OpenSIPS restart, and even
                  with the rate limiter enabled in iptables, I'm getting
                  a lot of these errors:</div>
                <div><br>
                </div>
                <div>Sep 16 00:20:56 ERROR:core:send_fd: sendmsg would
                  block on 683: Resource temporarily unavailable<br>
                  Sep 16 00:20:56 ERROR:core:send2worker: send_fd failed<br>
                  Sep 16 00:20:56 ERROR:core:handle_new_connect: no TCP
                  workers available<br>
                </div>
                <div><br>
                </div>
                <div>And the number of registered users starts to drop.</div>
                <div><br>
                </div>
                <div>I've tried to change my autoscaler profile to be a
                  bit more aggressive:</div>
                <div><br>
                </div>
                <div>auto_scaling_profile = PROFILE_TCP<br>
                       scale up to 32 on 20% for 4 cycles within 5<br>
                       scale down to 4 on 10% for 10 cycles<br>
                </div>
                <div><br>
                </div>
                <div>But that didn't help. Current TCP settings:</div>
                <div><br>
                </div>
                <div>tcp_accept_aliases=0<br>
                  tcp_keepalive=1<br>
                  tcp_connect_timeout=1500<br>
                  tcp_keepinterval = 10<br>
                  tcp_keepidle = 10<br>
                  tcp_max_msg_time = 10<br>
                  tcp_workers = 4 use_auto_scaling_profile PROFILE_TCP<br>
                  tcp_max_connections = 4096<br>
                </div>
                <div><br>
                </div>
                <div># Proto TCP<br>
                  loadmodule "proto_tcp.so"<br>
                  modparam("proto_tcp", "tcp_async", 1)<br>
                  modparam("proto_tcp", "tcp_send_timeout", 1000)<br>
                  modparam("proto_tcp",
                  "tcp_async_local_connect_timeout", 500)<br>
                  modparam("proto_tcp", "tcp_async_local_write_timeout",
                  500)<br>
                  modparam("proto_tcp", "tcp_max_msg_chunks", 16)<br>
                  modparam("proto_tcp", "tcp_parallel_handling", 1)<br>
                </div>
                <div><br>
                </div>
                <div>I'm also setting TCP persistent flag before
                  mid_register_save (not sure which one to use - setflag
                  or setbflag so doing both):</div>
                <div><br>
                </div>
                <div>modparam("mid_registrar", "tcp_persistent_flag",
                  "TCP_PERSIST_REGISTRATIONS")<br>
                </div>
                <div><br>
                </div>
                <div>        if (is_method("REGISTER"))<br>
                              if ($socket_in(proto)!="udp")<br>
                              {<br>
                                  setflag("TCP_PERSIST_REGISTRATIONS");<br>
                                  setbflag("TCP_PERSIST_REGISTRATIONS");<br>
                              }<br>
                  <br>
                </div>
                <div>That didn't help. So I had to manually set
                  tcp_workers=32 and now it works fine. Not sure what's
                  going on here...</div>
                <div><br>
                </div>
                <div>Thanks and best regards,</div>
                <div>Yury.</div>
                <div><br>
                </div>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Thu, Sep 15, 2022
                  at 4:02 PM Bogdan-Andrei Iancu &lt;<a
                    href="mailto:bogdan@opensips.org" target="_blank"
                    moz-do-not-send="true">bogdan@opensips.org</a>&gt;
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px
                  0px 0.8ex;border-left:1px solid
                  rgb(204,204,204);padding-left:1ex">
                  <div> <font face="monospace">I'm glad it helped. Please
                      keep me posted on whether the auto-scaling fix holds.<br>
                      <br>
                      Best regards,<br>
                    </font>
                    <pre cols="72">Bogdan-Andrei Iancu</pre>
                    <div>On 9/14/22 10:10 PM, Yury Kirsanov wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div dir="ltr">Hi Bogdan,
                        <div>Sorry to email directly to you again, but
                          just wanted to say a huge thank you for all
                          your great work in supporting OpenSIPS and its
                          users!</div>
                        <div><br>
                        </div>
                        <div>After adjusting TCP parameters my OpenSIPS
                          server can handle restarts easily without any
                          issues, even though I'm currently dropping all
                          the caches and dialogs and everything and not
                          using any rate-limit iptables rules.</div>
                        <div><br>
                        </div>
                        <div>Also, I've enabled the autoscaler and it
                          seems to work great so far. Please see this
                          screenshot: there were 79 processes before
                          the restart, and after the restart the number of
                          processes immediately dropped to a very low
                          number, even though it now keeps some load on
                          the active processes:</div>
                        <div><br>
                        </div>
                        <div><img
                            src="cid:part8.F8877FA8.772ED159@opensips.org"
                            alt="image.png" style="margin-right: 25px;"
                            class=""><br>
                        </div>
                        <div><br>
                        </div>
                        <div>All the SIP devices were able to reconnect
                          successfully and seem to be stable at this
                          stage! No more memory leaks! Thanks again!</div>
                        <div><br>
                        </div>
                        <div>Best regards,</div>
                        <div>Yury.</div>
                      </div>
                      <br>
                      <div class="gmail_quote">
                        <div dir="ltr" class="gmail_attr">On Wed, Sep
                          14, 2022 at 10:58 PM Bogdan-Andrei Iancu &lt;<a
                            href="mailto:bogdan@opensips.org"
                            target="_blank" moz-do-not-send="true">bogdan@opensips.org</a>&gt;
                          wrote:<br>
                        </div>
                        <blockquote class="gmail_quote"
                          style="margin:0px 0px 0px
                          0.8ex;border-left:1px solid
                          rgb(204,204,204);padding-left:1ex">
                          <div> <font face="monospace">Hi Yury,<br>
                              <br>
                              You need to check your TCP settings and
                              make sure OpenSIPS will not (1) try to
                              open TCP connections to destinations
                              known to be unable to accept them (like
                              TCP/WS endpoints behind NAT) - see
                              tcp_no_new_conn_bflag [1] - or (2)
                              block for a long time while attempting a
                              connect - see tcp_connect_timeout [2],
                              or consider enabling async mode [3].<br>
                              <br>
                              [1] <a
href="https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_no_new_conn_bflag"
                                target="_blank" moz-do-not-send="true">https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_no_new_conn_bflag</a><br>
                              [2] <a
href="https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_connect_timeout"
                                target="_blank" moz-do-not-send="true">https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_connect_timeout</a><br>
                              [3] <a
href="https://opensips.org/html/docs/modules/3.2.x/proto_tcp.html#idp168992"
                                target="_blank" moz-do-not-send="true">https://opensips.org/html/docs/modules/3.2.x/proto_tcp.html#idp168992</a><br>
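                              <br>
                              As a rough sketch of how those three
                              settings might fit together (the flag name
                              TCP_NO_CONNECT and the timeout value are
                              only illustrative, and treating every
                              non-UDP registrant as being behind NAT is
                              an assumption; where you set the flag
                              depends on your script):<br>
                              <br>
                              # [1] branch flag telling OpenSIPS not to open new TCP connections (flag name is an assumption)<br>
                              tcp_no_new_conn_bflag = TCP_NO_CONNECT<br>
                              # [2] abort a blocking connect attempt after this many milliseconds (illustrative value)<br>
                              tcp_connect_timeout = 100<br>
                              # [3] non-blocking TCP connect and write<br>
                              modparam("proto_tcp", "tcp_async", 1)<br>
                              <br>
                              # in the routing script, e.g. for REGISTERs arriving over a non-UDP socket<br>
                              if (is_method("REGISTER"))<br>
                              &nbsp;&nbsp;&nbsp;&nbsp;if ($socket_in(proto) != "udp")<br>
                              &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;setbflag("TCP_NO_CONNECT");<br>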
                              <br>
                              Regards,<br>
                            </font>
                            <pre cols="72">Bogdan-Andrei Iancu</pre>
                            <div>On 9/13/22 12:01 PM, Yury Kirsanov
                              wrote:<br>
                            </div>
                            <blockquote type="cite">
                              <div dir="auto">Hi Bogdan,
                                <div dir="auto">Thanks for this update,
                                  but it looks like I can't check the
                                  autoscaler because of this first issue
                                  with the blocking TCP connect. Is there a
                                  way to resolve it? Am I doing
                                  something wrong, or is it something
                                  in the OpenSIPS code? And yes,
                                  you're right: as soon as I restart
                                  OpenSIPS with a lot of SIP devices
                                  trying to connect to it, it goes
                                  crazy, starts consuming memory and
                                  stops forwarding packets, sitting
                                  at 100% load until it runs out of
                                  memory and segfaults. Sometimes I
                                  can't even restart it to get it back to
                                  a normal working state; it just
                                  loops into the same crash whatever I
                                  try.</div>
                                <div dir="auto"><br>
                                </div>
                                <div dir="auto">I've compiled OpenSIPS
                                  3.3.1 with your patch and was able to
                                  start it but not sure, maybe I was
                                  just lucky this time.</div>
                                <div dir="auto"><br>
                                </div>
                                <div dir="auto">What should I do?
                                  Thanks!</div>
                                <div dir="auto"><br>
                                </div>
                                <div dir="auto">Best regards,</div>
                                <div dir="auto">Yury.</div>
                              </div>
                              <br>
                              <div class="gmail_quote">
                                <div dir="ltr" class="gmail_attr">On
                                  Tue, 13 Sept 2022, 18:56 Bogdan-Andrei
                                  Iancu, &lt;<a
                                    href="mailto:bogdan@opensips.org"
                                    target="_blank"
                                    moz-do-not-send="true">bogdan@opensips.org</a>&gt;
                                  wrote:<br>
                                </div>
                                <blockquote class="gmail_quote"
                                  style="margin:0px 0px 0px
                                  0.8ex;border-left:1px solid
                                  rgb(204,204,204);padding-left:1ex">
                                  <div> Hi Yury,<br>
                                    <br>
                                    It looks like you have multiple
                                    overlapping issues here. The traps
                                    you sent have nothing to do
                                    with auto-scaling, but with a
                                    blocking TCP connect for SIP - most
                                    of the procs are blocked in a sync
                                    TCP connect.<br>
                                    <br>
                                    Regards,<br>
                                    <pre cols="72">Bogdan-Andrei Iancu</pre>
                                    <div>On 9/12/22 4:39 PM, Yury
                                      Kirsanov wrote:<br>
                                    </div>
                                    <blockquote type="cite">
                                      <div dir="ltr">Hi Bogdan,
                                        <div>I've applied the patch (I had
                                          to work out where to apply it
                                          manually for 3.2.8 as downloaded
                                          from the web page: line 1568
                                          instead of 1652) and restarted
                                          the server with only about
                                          300-350 SIP devices, and
                                          immediately ran into the same
                                          issue. I'm attaching two GDB
                                          dumps taken several
                                          minutes apart.
                                          Autoscaling was OFF this time;
                                          please see my previous message,
                                          as for some reason I'm currently
                                          experiencing lockups even when
                                          it's off :(</div>
                                      </div>
                                    </blockquote>
                                    <br>
                                    <blockquote type="cite">
                                      <div dir="ltr">
                                        <div>Best regards,</div>
                                        <div>Yury.</div>
                                      </div>
                                      <br>
                                      <div class="gmail_quote">
                                        <div dir="ltr"
                                          class="gmail_attr">On Mon, Sep
                                          12, 2022 at 7:48 PM
                                          Bogdan-Andrei Iancu &lt;<a
                                            href="mailto:bogdan@opensips.org"
                                            rel="noreferrer"
                                            target="_blank"
                                            moz-do-not-send="true">bogdan@opensips.org</a>&gt;
                                          wrote:<br>
                                        </div>
                                        <blockquote class="gmail_quote"
                                          style="margin:0px 0px 0px
                                          0.8ex;border-left:1px solid
                                          rgb(204,204,204);padding-left:1ex">
                                          <div> <font face="monospace">Hi
                                              Yury,<br>
                                              <br>
                                              Could you give this patch
                                              a try? It should fix the
                                              blocking you are experiencing
                                              (it should apply to 3.2
                                              too).<br>
                                              <br>
                                              Best regards,<br>
                                            </font>
                                            <pre cols="72">Bogdan-Andrei Iancu</pre>
                                            <div>On 9/7/22 2:54 PM,
                                              Bogdan-Andrei Iancu wrote:<br>
                                            </div>
                                            <blockquote type="cite"> <font
                                                face="monospace">Hi
                                                Yury,<br>
                                                <br>
                                                Thanks for the detailed
                                                info here - let me
                                                review some code and
                                                run some tests, as at
                                                this point I have a good
                                                idea of which direction to
                                                dig in.<br>
                                                <br>
                                                I will update here.<br>
                                                <br>
                                                Best regards,<br>
                                              </font>
                                              <pre cols="72">Bogdan-Andrei Iancu</pre>
                                              <div>On 9/6/22 11:24 AM,
                                                Yury Kirsanov wrote:<br>
                                              </div>
                                              <blockquote type="cite">
                                                <div dir="auto">Hi
                                                  Bogdan,
                                                  <div dir="auto">Yes,
                                                    I'm listening on all
                                                    types of sockets
                                                    including UDP, TCP
                                                    and TLS on the
                                                    outside public
                                                    interface and then
                                                    forward traffic into
                                                    internal LAN via UDP
                                                    only.</div>
                                                  <div dir="auto"><br>
                                                  </div>
                                                  <div dir="auto">Previously
                                                    it was getting stuck
                                                    quite easily, now I
                                                    had to wait for a
                                                    while before this
                                                    actually happened.
                                                    I've routed part of
                                                    my customers to this
                                                    server to obtain
                                                    this result so I
                                                    will have to do that
                                                    again.</div>
                                                  <div dir="auto"><br>
                                                  </div>
                                                  <div dir="auto">As
                                                    soon as I see one of
                                                    the processes stuck
                                                    I'll run the trap
                                                    command and send you
                                                    all the details
                                                    including process
                                                    load, ps output and
                                                    so on.</div>
                                                  <div dir="auto"><br>
                                                  </div>
                                                  <div dir="auto">For
                                                    now I had to switch
                                                    autoscaling off and
                                                    just create many
                                                    listeners. Do I
                                                    understand correctly
                                                    that I need to
                                                    restart OpenSIPS in
                                                    order to apply
                                                    autoscaling profiles
                                                    and reload-routes is
                                                    not sufficient?</div>
                                                  <div dir="auto"><br>
                                                  </div>
                                                  <div dir="auto">Also,
                                                    do I need separate
                                                    UDP profiles for the
                                                    public and private
                                                    interfaces? And do I
                                                    need to apply the
                                                    autoscaling profile
                                                    just to a socket, or
                                                    do I also need to specify
                                                    udp_workers or tcp_workers
                                                    with the autoscaler?</div>
                                                  <div dir="auto"><br>
                                                  </div>
                                                  <div dir="auto">Thanks
                                                    and best regards,</div>
                                                  <div dir="auto">Yury.</div>
                                                </div>
                                                <br>
                                                <div class="gmail_quote">
                                                  <div dir="ltr"
                                                    class="gmail_attr">On
                                                    Tue, 6 Sept 2022,
                                                    18:18 Bogdan-Andrei
                                                    Iancu, &lt;<a
                                                      href="mailto:bogdan@opensips.org"
                                                      rel="noreferrer"
                                                      target="_blank"
                                                      moz-do-not-send="true">bogdan@opensips.org</a>&gt;
                                                    wrote:<br>
                                                  </div>
                                                  <blockquote
                                                    class="gmail_quote"
                                                    style="margin:0px
                                                    0px 0px
                                                    0.8ex;border-left:1px
                                                    solid
                                                    rgb(204,204,204);padding-left:1ex">
                                                    <div> <font
                                                        face="monospace">Hi
                                                        Yury,<br>
                                                        <br>
                                                        Thanks for the
                                                        info. I see that
                                                        the stuck
                                                        process (24) is
                                                        an auto-scaled
                                                        one (based on
                                                        its id). Do you
                                                        have SIP traffic
                                                        going from UDP to
                                                        TCP, or are you
                                                        doing some HEP
                                                        capturing for SIP?
                                                        I saw a similar
                                                        report recently
                                                        where an
                                                        auto-scaled UDP
                                                        worker got stuck
                                                        while trying to
                                                        communicate
                                                        with the TCP
                                                        main/manager
                                                        process (in
                                                        order to handle
                                                        a TCP
                                                        operation).<br>
                                                        <br>
                                                        BTW, any chance
                                                        you could run
                                                        "opensips-cli -x
                                                        trap" when you
                                                        have that stuck
                                                        process, just to
                                                        see where it is
                                                        stuck? And is it
                                                        hard to
                                                        reproduce? I
                                                        may ask you to
                                                        extract some
                                                        information from
                                                        the running
                                                        process....<br>
                                                        <br>
                                                        Regards,<br>
                                                      </font>
                                                      <pre cols="72">Bogdan-Andrei Iancu</pre>
                                                      <div>On 9/3/22
                                                        6:54 PM, Yury
                                                        Kirsanov wrote:<br>
                                                      </div>
                                                    </div>
                                                  </blockquote>
                                                </div>
                                              </blockquote>
                                              <br>
                                              <br>
                                              <fieldset></fieldset>
                                              <pre>_______________________________________________
Users mailing list
<a href="mailto:Users@lists.opensips.org" rel="noreferrer" target="_blank" moz-do-not-send="true">Users@lists.opensips.org</a>
<a href="http://lists.opensips.org/cgi-bin/mailman/listinfo/users" rel="noreferrer" target="_blank" moz-do-not-send="true">http://lists.opensips.org/cgi-bin/mailman/listinfo/users</a>
</pre>
                                            </blockquote>
                                            <br>
                                          </div>
                                        </blockquote>
                                      </div>
                                    </blockquote>
                                    <br>
                                  </div>
                                </blockquote>
                              </div>
                            </blockquote>
                            <br>
                          </div>
                        </blockquote>
                      </div>
                    </blockquote>
                    <br>
                  </div>
                </blockquote>
              </div>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </body>
</html>