<div dir="ltr">Hi Bogdan,<div>Thanks a lot for your help and support! The only question I now have is why OpenSIPS crashed when all TCP processes were blocked waiting for a connection. It kept consuming more and more memory and then crashed with a segfault upon reaching the limit set by the -m memory parameter. I understand that the TCP listeners were in blocking mode and could not do any work, including forwarding SIP packets, until the session was fully established, but isn't it a bug that OpenSIPS starts eating memory and then crashes? Do I need to open a bug report on this? Thanks!</div><div><br></div><div>Best regards,</div><div>Yury.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Sep 14, 2022 at 10:58 PM Bogdan-Andrei Iancu <<a href="mailto:bogdan@opensips.org">bogdan@opensips.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div>
    <font face="monospace">Hi Yury,<br>
      <br>
      You need to check your TCP settings and make sure your OpenSIPS
      will (1) not try to open a TCP connection to a destination known
      to be unable to accept one (like TCP/WS endpoints behind NAT) - see
      the tcp_no_new_conn_bflag [1] - or (2) not block for a long time
      while attempting a connect - see the tcp_connect_timeout [2] or
      consider enabling async [3].<br>
      <br>
      [1]
<a href="https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_no_new_conn_bflag" target="_blank">https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_no_new_conn_bflag</a><br>
      [2]
<a href="https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_connect_timeout" target="_blank">https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_connect_timeout</a><br>
      [3]
      <a href="https://opensips.org/html/docs/modules/3.2.x/proto_tcp.html#idp168992" target="_blank">https://opensips.org/html/docs/modules/3.2.x/proto_tcp.html#idp168992</a><br>
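      <br>
      A minimal sketch of how these three settings might look in
      opensips.cfg. The flag names, the route name and the timeout value
      are illustrative assumptions, not taken from Yury's configuration,
      so check them against the linked documentation for your version:<br>
      <pre>
# Core parameters (global section of opensips.cfg)
tcp_connect_timeout = 200                # assumed value, in milliseconds
tcp_no_new_conn_bflag = TCP_NO_CONNECT   # the flag name is an arbitrary choice

# proto_tcp: asynchronous (non-blocking) connect and write
loadmodule "proto_tcp.so"
modparam("proto_tcp", "tcp_async", 1)

route[to_tcp_client] {   # hypothetical route name
    # "DST_BEHIND_NAT" is a hypothetical branch flag, assumed to be set
    # earlier in the script for destinations that cannot accept a connect
    if (isbflagset("DST_BEHIND_NAT"))
        setbflag("TCP_NO_CONNECT");   # only reuse an existing TCP connection
}
</pre>
      With the flag set, a request for such a destination should fail with
      a send error instead of blocking a worker in a connect attempt.<br>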
      <br>
      Regards,<br>
    </font>
    <pre cols="72">Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
  <a href="https://www.opensips-solutions.com" target="_blank">https://www.opensips-solutions.com</a>
OpenSIPS Summit 27-30 Sept 2022, Athens
  <a href="https://www.opensips.org/events/Summit-2022Athens/" target="_blank">https://www.opensips.org/events/Summit-2022Athens/</a></pre>
    <div>On 9/13/22 12:01 PM, Yury Kirsanov
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="auto">Hi Bogdan,
        <div dir="auto">Thanks for this update, but it looks like I
          can't check the autoscaler because of this first issue with
          the blocking TCP connect. Is there a way to resolve it? Am I
          doing something wrong? Or is it something to do with the
          OpenSIPS code? And yes, you're right: as soon as I restart
          OpenSIPS while a lot of SIP devices are trying to connect to
          it, it goes crazy, starts consuming memory and stops
          forwarding packets, sitting there at 100% load until it runs
          out of memory and segfaults. Sometimes I can't even restart it
          to bring it back to a normal state; it just loops into the
          same crash whatever I try to do.</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">I've compiled OpenSIPS 3.3.1 with your patch and
          was able to start it, but I'm not sure whether the patch
          helped or I was just lucky this time.</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">What should I do? Thanks!</div>
        <div dir="auto"><br>
        </div>
        <div dir="auto">Best regards,</div>
        <div dir="auto">Yury.</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Tue, 13 Sept 2022, 18:56
          Bogdan-Andrei Iancu, <<a href="mailto:bogdan@opensips.org" target="_blank">bogdan@opensips.org</a>> wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div> Hi Yury,<br>
            <br>
            It looks like you have multiple issues overlapping here.
            The traps you sent have nothing to do with the
            auto-scaling, but with a blocking TCP connect for SIP - most
            of the procs get blocked in a sync TCP connect.<br>
            <br>
            Regards,<br>
            <pre cols="72">Bogdan-Andrei Iancu</pre>
            <div>On 9/12/22 4:39 PM, Yury Kirsanov wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">Hi Bogdan,
                <div>I've applied the patch (I had to find where to
                  apply it manually for 3.2.8 downloaded from the web
                  page, line 1568 instead of 1652) and restarted the
                  server with only about 300-350 SIP devices, and
                  immediately ran into the same issue. I'm attaching two
                  GDB dumps taken several minutes apart. Autoscaling was
                  OFF this time; please see my previous message, as for
                  some reason I'm currently experiencing lockups even
                  when it's off :(</div>
              </div>
            </blockquote>
            <br>
            <blockquote type="cite">
              <div dir="ltr">
                <div>Best regards,</div>
                <div>Yury.</div>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Mon, Sep 12, 2022
                  at 7:48 PM Bogdan-Andrei Iancu <<a href="mailto:bogdan@opensips.org" rel="noreferrer" target="_blank">bogdan@opensips.org</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                  <div> <font face="monospace">Hi Yury,<br>
                      <br>
                      Could you give this patch a try? It should fix the
                      blocking you are experiencing (it should apply on
                      3.2 too).<br>
                      <br>
                      Best regards,<br>
                    </font>
                    <pre cols="72">Bogdan-Andrei Iancu</pre>
                    <div>On 9/7/22 2:54 PM, Bogdan-Andrei Iancu wrote:<br>
                    </div>
                    <blockquote type="cite"> <font face="monospace">Hi
                        Yury,<br>
                        <br>
                        Thanks for the detailed info here - let me
                        review some code and run some tests, as at
                        this point I have a good idea of the direction
                        to dig into.<br>
                        <br>
                        I will update here.<br>
                        <br>
                        Best regards,<br>
                      </font>
                      <pre cols="72">Bogdan-Andrei Iancu</pre>
                      <div>On 9/6/22 11:24 AM, Yury Kirsanov wrote:<br>
                      </div>
                      <blockquote type="cite">
                        <div dir="auto">Hi Bogdan,
                          <div dir="auto">Yes, I'm listening on all
                            types of sockets, including UDP, TCP and
                            TLS, on the outside public interface and
                            then forwarding traffic into the internal
                            LAN via UDP only.</div>
                          <div dir="auto"><br>
                          </div>
                          <div dir="auto">Previously it was getting
                            stuck quite easily, now I had to wait for a
                            while before this actually happened. I've
                            routed part of my customers to this server
                            to obtain this result so I will have to do
                            that again.</div>
                          <div dir="auto"><br>
                          </div>
                          <div dir="auto">As soon as I see one of the
                            processes stuck, I'll run the trap command
                            and send you all the details, including
                            process load, ps output and so on.</div>
                          <div dir="auto"><br>
                          </div>
                          <div dir="auto">For now I had to switch
                            autoscaling off and just create many
                            listeners. Do I understand correctly that I
                            need to restart OpenSIPS in order to apply
                            autoscaling profiles and reload-routes is
                            not sufficient?</div>
                          <div dir="auto"><br>
                          </div>
                          <div dir="auto">Also, do I need separate UDP
                            profiles for the public and private
                            interfaces? And do I need to apply the
                            autoscaling profile just to a socket, or do
                            I need to specify udp_workers or tcp_workers
                            with the autoscaler too?</div>
                          <div dir="auto"><br>
                          </div>
                          <div dir="auto">Thanks and best regards,</div>
                          <div dir="auto">Yury.</div>
                        </div>
                        <br>
                        <div class="gmail_quote">
                          <div dir="ltr" class="gmail_attr">On Tue, 6
                            Sept 2022, 18:18 Bogdan-Andrei Iancu, <<a href="mailto:bogdan@opensips.org" rel="noreferrer" target="_blank">bogdan@opensips.org</a>>
                            wrote:<br>
                          </div>
                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
                            <div> <font face="monospace">Hi Yury,<br>
                                <br>
                                Thanks for the info. I see that the
                                stuck process (24) is an auto-scaled
                                one (based on its id). Do you have SIP
                                traffic going from UDP to TCP, or are
                                you doing some HEP capturing for SIP? I
                                saw a recent similar report where a UDP
                                auto-scaled worker got stuck when trying
                                to do some communication with the TCP
                                main/manager process (in order to handle
                                a TCP operation).<br>
                                <br>
                                BTW, any chance you could do an
                                "opensips-cli -x trap" when you have
                                that stuck process, just to see where it
                                is stuck? And is it hard to reproduce? I
                                may ask you to extract some information
                                from the running process...<br>
                                <br>
                                Regards,<br>
                              </font>
                              <pre cols="72">Bogdan-Andrei Iancu</pre>
                              <div>On 9/3/22 6:54 PM, Yury Kirsanov
                                wrote:<br>
                              </div>
                            </div>
                          </blockquote>
                        </div>
                      </blockquote>
                      <br>
                      <br>
                    </blockquote>
                    <br>
                  </div>
                </blockquote>
              </div>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </div>

</blockquote></div>