<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <font face="monospace">Hi Yury,<br>
      <br>
      Thanks for the info. I see that the stuck process (24) is an
      auto-scaled one (based on its ID). Do you have SIP traffic going
      from UDP to TCP, or are you doing any HEP capturing for SIP? I saw
      a similar report recently where an auto-scaled UDP worker got stuck
      while trying to communicate with the TCP main/manager process
      (in order to handle a TCP operation).<br>
      <br>
      BTW, any chance you could run "opensips-cli -x trap" while you have
      that stuck process, just to see where it is stuck? And is it hard
      to reproduce? I may ask you to extract some information from the
      running process.<br>
      <br>
      Regards,<br>
    </font>
    <pre class="moz-signature" cols="72">Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
  <a class="moz-txt-link-freetext" href="https://www.opensips-solutions.com">https://www.opensips-solutions.com</a>
OpenSIPS Summit 27-30 Sept 2022, Athens
  <a class="moz-txt-link-freetext" href="https://www.opensips.org/events/Summit-2022Athens/">https://www.opensips.org/events/Summit-2022Athens/</a></pre>
    <div class="moz-cite-prefix">On 9/3/22 6:54 PM, Yury Kirsanov wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAD1_seuWHjaiwJPQkHdHU2ELX46V6T2crxNpwoPQDg=SygXD9g@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">Hi Bogdan,
        <div>This has finally happened: OpenSIPS is stuck again at 100%
          load for one of its processes. Here's the output of the load:
          command:</div>
        <div><br>
        </div>
        <div><span class="gmail-im" style="color:rgb(80,0,80)">opensips-cli
            -x mi get_statistics load:<br>
            {<br>
          </span>    "load:load-proc-1": 0,<br>
              "load:load1m-proc-1": 0,<br>
              "load:load10m-proc-1": 0,<br>
              "load:load-proc-2": 0,<br>
              "load:load1m-proc-2": 0,<br>
              "load:load10m-proc-2": 0,<br>
              "load:load-proc-3": 0,<br>
              "load:load1m-proc-3": 0,<br>
              "load:load10m-proc-3": 0,<br>
              "load:load-proc-4": 0,<br>
              "load:load1m-proc-4": 0,<br>
              "load:load10m-proc-4": 0,<br>
              "load:load-proc-5": 0,<br>
              "load:load1m-proc-5": 0,<br>
              "load:load10m-proc-5": 8,<br>
              "load:load-proc-6": 0,<br>
              "load:load1m-proc-6": 0,<br>
              "load:load10m-proc-6": 6,<br>
              "load:load-proc-13": 0,<br>
              "load:load1m-proc-13": 0,<br>
              "load:load10m-proc-13": 0,<br>
              "load:load-proc-14": 0,<br>
              "load:load1m-proc-14": 0,<br>
              "load:load10m-proc-14": 0,<br>
              "load:load-proc-21": 0,<br>
              "load:load1m-proc-21": 0,<br>
              "load:load10m-proc-21": 0,<br>
              "load:load-proc-22": 0,<br>
              "load:load1m-proc-22": 0,<br>
              "load:load10m-proc-22": 0,<br>
              "load:load-proc-23": 0,<br>
              "load:load1m-proc-23": 0,<br>
              "load:load10m-proc-23": 0,<br>
              "load:load-proc-24": 100,<br>
              "load:load1m-proc-24": 100,<br>
              "load:load10m-proc-24": 100,<br>
              "load:load": 12,<br>
              "load:load1m": 12,<br>
              "load:load10m": 14,<br>
              "load:load-all": 10,<br>
              "load:load1m-all": 10,<br>
              "load:load10m-all": 11,<br>
              "load:processes_number": 13<br>
          }<br>
        </div>
        <div><br>
        </div>
        <div>As you can see, process 24 has been consuming 100% of its
          time for more than a minute already.</div>
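        <div><br>
        </div>
        <div>A check like this could also be scripted. Below is a minimal
          Python sketch, not part of OpenSIPS: the function name
          find_busy_processes and the 90% threshold are assumptions made
          for illustration only. It scans the JSON printed by the command
          above and lists the processes whose load, load1m and load10m
          values are all at or above the threshold:</div>
<pre>
```python
import json


# Hypothetical helper: the name and the 90% threshold are illustrative
# assumptions, not anything provided by OpenSIPS or opensips-cli.
def find_busy_processes(stats_json, threshold=90):
    """Return IDs of processes whose load, load1m and load10m all reach the threshold."""
    stats = json.loads(stats_json)
    prefix = "load:load-proc-"
    busy = []
    for key, value in stats.items():
        # Per-process instantaneous keys look like "load:load-proc-24".
        if not key.startswith(prefix):
            continue
        proc_id = key[len(prefix):]
        windows = [
            value,
            stats.get("load:load1m-proc-" + proc_id, 0),
            stats.get("load:load10m-proc-" + proc_id, 0),
        ]
        # Flag the process only if every averaging window is saturated.
        if min(windows) >= threshold:
            busy.append(int(proc_id))
    return sorted(busy)


# Shortened sample taken from the statistics output above.
sample = """{
    "load:load-proc-5": 0,
    "load:load1m-proc-5": 0,
    "load:load10m-proc-5": 8,
    "load:load-proc-24": 100,
    "load:load1m-proc-24": 100,
    "load:load10m-proc-24": 100,
    "load:load": 12
}"""

print(find_busy_processes(sample))  # prints [24]
```
</pre>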
        <div><br>
        </div>
        <div>Here's the output of the process list; it's a UDP socket
          listener on the internal interface that's stuck at 100% load:</div>
        <div><br>
        </div>
        <div>opensips-cli -x mi ps<br>
          {<br>
              "Processes": [<br>
                  {<br>
                      "ID": 0,<br>
                      "PID": 5457,<br>
                      "Type": "attendant"<br>
                  },<br>
                  {<br>
                      "ID": 1,<br>
                      "PID": 5463,<br>
                      "Type": "HTTPD 10.x.x.x:8888"<br>
                  },<br>
                  {<br>
                      "ID": 2,<br>
                      "PID": 5464,<br>
                      "Type": "MI FIFO"<br>
                  },<br>
                  {<br>
                      "ID": 3,<br>
                      "PID": 5465,<br>
                      "Type": "time_keeper"<br>
                  },<br>
                  {<br>
                      "ID": 4,<br>
                      "PID": 5466,<br>
                      "Type": "timer"<br>
                  },<br>
                  {<br>
                      "ID": 5,<br>
                      "PID": 5467,<br>
                      "Type": "SIP receiver udp:10.x.x.x:5060"<br>
                  },<br>
                  {<br>
                      "ID": 6,<br>
                      "PID": 5470,<br>
                      "Type": "SIP receiver udp:10.x.x.x:5060"<br>
                  },<br>
                  {<br>
                      "ID": 13,<br>
                      "PID": 5477,<br>
                      "Type": "SIP receiver udp:103.x.x.x:7060"<br>
                  },<br>
                  {<br>
                      "ID": 14,<br>
                      "PID": 5478,<br>
                      "Type": "SIP receiver udp:103.x.x.x:7060"<br>
                  },<br>
                  {<br>
                      "ID": 21,<br>
                      "PID": 5485,<br>
                      "Type": "TCP receiver"<br>
                  },<br>
                  {<br>
                      "ID": 22,<br>
                      "PID": 5486,<br>
                      "Type": "Timer handler"<br>
                  },<br>
                  {<br>
                      "ID": 23,<br>
                      "PID": 5487,<br>
                      "Type": "TCP main"<br>
                  },<br>
                  {<br>
                      "ID": 24,<br>
                      "PID": 5759,<br>
                      "Type": "SIP receiver udp:10.x.x.x:5060"<br>
                  }<br>
              ]<br>
          }<br>
        </div>
        <div><br>
        </div>
        <div>opensips -V<br>
          version: opensips 3.2.8 (x86_64/linux)<br>
          flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP,
          PKG_MALLOC, Q_MALLOC, F_MALLOC, HP_MALLOC, DBG_MALLOC,
          FAST_LOCK-ADAPTIVE_WAIT<br>
          ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144,
          MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535<br>
          poll method support: poll, epoll, sigio_rt, select.<br>
          git revision: d2496fed5<br>
          main.c compiled on 16:17:53 Aug 24 2022 with gcc 9<br>
        </div>
        <div><br>
        </div>
        <div>This time the server has some load, but it's still not heavy
          at all, plus I'm using async requests for REST queries.</div>
        <div><br>
        </div>
        <div>This is my autoscaling section:</div>
        <div><br>
        </div>
        <div># Scaling section<br>
          auto_scaling_profile = PROFILE_UDP_PUB<br>
               scale up to 16 on 70% for 4 cycles within 5<br>
               scale down to 2 on 20% for 5 cycles<br>
          <br>
          auto_scaling_profile = PROFILE_UDP_PRIV<br>
               scale up to 16 on 70% for 4 cycles within 5<br>
               scale down to 2 on 20% for 5 cycles<br>
          <br>
          auto_scaling_profile = PROFILE_TCP<br>
               scale up to 16 on 70% for 4 cycles within 5<br>
               scale down to 2 on 20% for 10 cycles<br>
        </div>
        <div><br>
        </div>
        <div>And this is how I apply it to the sockets; I'm not applying
          it to UDP workers at all:<span class="gmail-im"
            style="color:rgb(80,0,80)"><br>
            <br>
            socket=udp:10.x.x.x:5060 use_auto_scaling_profile
            PROFILE_UDP_PRIV<br>
          </span></div>
        <div>socket=udp:103.x.x.x:7060 use_auto_scaling_profile
          PROFILE_UDP_PUB<br>
        </div>
        <div><br>
        </div>
        <div>tcp_workers = 1 use_auto_scaling_profile PROFILE_TCP<br>
        </div>
        <div><br>
        </div>
        <div>I can't get this process unstuck until I restart OpenSIPS.</div>
        <div><br>
        </div>
        <div>Just to add: if I turn off auto-scaling, enable 16 UDP and
          16 TCP workers, and specify the sockets without any
          parameters, the load goes to 0. See the attached graph: the
          load was at 25% all the time until I restarted OpenSIPS in
          normal mode, and then it immediately dropped to 0:</div>
        <div><br>
        </div>
        <div><img src="cid:part1.544C7F6D.96D57147@opensips.org"
            alt="image.png" class="" width="562" height="177"><br>
        </div>
        <div><br>
        </div>
        <div>Here's the output of load:</div>
        <div><br>
        </div>
        <div>opensips-cli -x mi get_statistics load:<br>
          {<br>
              "load:load-proc-1": 0,<br>
              "load:load1m-proc-1": 0,<br>
              "load:load10m-proc-1": 0,<br>
              "load:load-proc-2": 0,<br>
              "load:load1m-proc-2": 0,<br>
              "load:load10m-proc-2": 0,<br>
              "load:load-proc-3": 0,<br>
              "load:load1m-proc-3": 0,<br>
              "load:load10m-proc-3": 0,<br>
              "load:load-proc-4": 0,<br>
              "load:load1m-proc-4": 0,<br>
              "load:load10m-proc-4": 0,<br>
              "load:load-proc-5": 0,<br>
              "load:load1m-proc-5": 0,<br>
              "load:load10m-proc-5": 2,<br>
              "load:load-proc-6": 0,<br>
              "load:load1m-proc-6": 0,<br>
              "load:load10m-proc-6": 0,<br>
              "load:load-proc-7": 0,<br>
              "load:load1m-proc-7": 0,<br>
              "load:load10m-proc-7": 1,<br>
              "load:load-proc-8": 0,<br>
              "load:load1m-proc-8": 0,<br>
              "load:load10m-proc-8": 1,<br>
              "load:load-proc-9": 0,<br>
              "load:load1m-proc-9": 0,<br>
              "load:load10m-proc-9": 1,<br>
              "load:load-proc-10": 0,<br>
              "load:load1m-proc-10": 0,<br>
              "load:load10m-proc-10": 0,<br>
              "load:load-proc-11": 0,<br>
              "load:load1m-proc-11": 0,<br>
              "load:load10m-proc-11": 3,<br>
              "load:load-proc-12": 0,<br>
              "load:load1m-proc-12": 0,<br>
              "load:load10m-proc-12": 2,<br>
              "load:load-proc-13": 0,<br>
              "load:load1m-proc-13": 0,<br>
              "load:load10m-proc-13": 1,<br>
              "load:load-proc-14": 0,<br>
              "load:load1m-proc-14": 0,<br>
              "load:load10m-proc-14": 3,<br>
              "load:load-proc-15": 0,<br>
              "load:load1m-proc-15": 0,<br>
              "load:load10m-proc-15": 2,<br>
              "load:load-proc-16": 0,<br>
              "load:load1m-proc-16": 0,<br>
              "load:load10m-proc-16": 1,<br>
              "load:load-proc-17": 0,<br>
              "load:load1m-proc-17": 0,<br>
              "load:load10m-proc-17": 4,<br>
              "load:load-proc-18": 0,<br>
              "load:load1m-proc-18": 0,<br>
              "load:load10m-proc-18": 2,<br>
              "load:load-proc-19": 0,<br>
              "load:load1m-proc-19": 0,<br>
              "load:load10m-proc-19": 3,<br>
              "load:load-proc-20": 0,<br>
              "load:load1m-proc-20": 0,<br>
              "load:load10m-proc-20": 2,<br>
              "load:load-proc-21": 0,<br>
              "load:load1m-proc-21": 0,<br>
              "load:load10m-proc-21": 0,<br>
              "load:load-proc-22": 0,<br>
              "load:load1m-proc-22": 0,<br>
              "load:load10m-proc-22": 0,<br>
              "load:load-proc-23": 0,<br>
              "load:load1m-proc-23": 0,<br>
              "load:load10m-proc-23": 0,<br>
              "load:load-proc-24": 0,<br>
              "load:load1m-proc-24": 0,<br>
              "load:load10m-proc-24": 0,<br>
              "load:load-proc-25": 0,<br>
              "load:load1m-proc-25": 0,<br>
              "load:load10m-proc-25": 0,<br>
              "load:load-proc-26": 0,<br>
              "load:load1m-proc-26": 0,<br>
              "load:load10m-proc-26": 0,<br>
              "load:load-proc-27": 0,<br>
              "load:load1m-proc-27": 0,<br>
              "load:load10m-proc-27": 0,<br>
              "load:load-proc-28": 0,<br>
              "load:load1m-proc-28": 0,<br>
              "load:load10m-proc-28": 0,<br>
              "load:load-proc-29": 0,<br>
              "load:load1m-proc-29": 0,<br>
              "load:load10m-proc-29": 0,<br>
              "load:load-proc-30": 0,<br>
              "load:load1m-proc-30": 0,<br>
              "load:load10m-proc-30": 0,<br>
              "load:load-proc-31": 0,<br>
              "load:load1m-proc-31": 0,<br>
              "load:load10m-proc-31": 0,<br>
              "load:load-proc-32": 0,<br>
              "load:load1m-proc-32": 0,<br>
              "load:load10m-proc-32": 0,<br>
              "load:load-proc-33": 0,<br>
              "load:load1m-proc-33": 0,<br>
              "load:load10m-proc-33": 0,<br>
              "load:load-proc-34": 0,<br>
              "load:load1m-proc-34": 0,<br>
              "load:load10m-proc-34": 0,<br>
              "load:load-proc-35": 3,<br>
              "load:load1m-proc-35": 0,<br>
              "load:load10m-proc-35": 0,<br>
              "load:load-proc-36": 0,<br>
              "load:load1m-proc-36": 0,<br>
              "load:load10m-proc-36": 0,<br>
              "load:load-proc-37": 0,<br>
              "load:load1m-proc-37": 0,<br>
              "load:load10m-proc-37": 0,<br>
              "load:load-proc-38": 0,<br>
              "load:load1m-proc-38": 0,<br>
              "load:load10m-proc-38": 0,<br>
              "load:load-proc-39": 0,<br>
              "load:load1m-proc-39": 0,<br>
              "load:load10m-proc-39": 0,<br>
              "load:load-proc-40": 0,<br>
              "load:load1m-proc-40": 0,<br>
              "load:load10m-proc-40": 0,<br>
              "load:load-proc-41": 0,<br>
              "load:load1m-proc-41": 0,<br>
              "load:load10m-proc-41": 0,<br>
              "load:load-proc-42": 0,<br>
              "load:load1m-proc-42": 0,<br>
              "load:load10m-proc-42": 0,<br>
              "load:load-proc-43": 0,<br>
              "load:load1m-proc-43": 0,<br>
              "load:load10m-proc-43": 0,<br>
              "load:load-proc-44": 0,<br>
              "load:load1m-proc-44": 0,<br>
              "load:load10m-proc-44": 0,<br>
              "load:load-proc-45": 0,<br>
              "load:load1m-proc-45": 0,<br>
              "load:load10m-proc-45": 0,<br>
              "load:load-proc-46": 0,<br>
              "load:load1m-proc-46": 0,<br>
              "load:load10m-proc-46": 0,<br>
              "load:load-proc-47": 0,<br>
              "load:load1m-proc-47": 0,<br>
              "load:load10m-proc-47": 0,<br>
              "load:load-proc-48": 0,<br>
              "load:load1m-proc-48": 0,<br>
              "load:load10m-proc-48": 0,<br>
              "load:load-proc-49": 0,<br>
              "load:load1m-proc-49": 0,<br>
              "load:load10m-proc-49": 0,<br>
              "load:load-proc-50": 0,<br>
              "load:load1m-proc-50": 0,<br>
              "load:load10m-proc-50": 0,<br>
              "load:load-proc-51": 0,<br>
              "load:load1m-proc-51": 0,<br>
              "load:load10m-proc-51": 0,<br>
              "load:load-proc-52": 0,<br>
              "load:load1m-proc-52": 0,<br>
              "load:load10m-proc-52": 0,<br>
              "load:load-proc-53": 0,<br>
              "load:load1m-proc-53": 0,<br>
              "load:load10m-proc-53": 0,<br>
              "load:load-proc-54": 0,<br>
              "load:load1m-proc-54": 0,<br>
              "load:load10m-proc-54": 0,<br>
              "load:load": 0,<br>
              "load:load1m": 0,<br>
              "load:load10m": 0,<br>
              "load:load-all": 0,<br>
              "load:load1m-all": 0,<br>
              "load:load10m-all": 0,<br>
              "load:processes_number": 55<br>
          }<br>
        </div>
        <div><br>
        </div>
        <div><br>
        </div>
        <div>Hope this is all the information you need! Thanks!</div>
        <div><br>
        </div>
        <div>Best regards,</div>
        <div>Yury.</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Thu, Aug 25, 2022 at 8:24
          PM Bogdan-Andrei Iancu <<a
            href="mailto:bogdan@opensips.org" moz-do-not-send="true">bogdan@opensips.org</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div> <font face="monospace">Hi Yury,<br>
              <br>
              And when that scaling up happens, do you actually have
              traffic, or is your OpenSIPS idle?<br>
              <br>
              Also, could you run `</font><font face="monospace">opensips-cli
              -x mi get_statistics load:` (note the colon at the end)?<br>
              <br>
              Regards,<br>
            </font>
            <div>On 8/25/22 10:57 AM, Yury Kirsanov wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">Hi all,
                <div>I've run into a strange issue. If I enable the
                  autoscaler on OpenSIPS 3.2.x (tried 5, 6, 7 and now 8)
                  on a server without any load, using a 'socket' statement
                  like this:</div>
                <div><br>
                </div>
                <div>auto_scaling_profile = PROFILE_UDP_PRIV<br>
                       scale up to 16 on 30% for 4 cycles within 5<br>
                       scale down to 2 on 10% for 5 cycles<br>
                  <br>
                </div>
                <div>udp_workers=4<br>
                  <br>
                </div>
                <div>socket=udp:10.x.x.x:5060 use_auto_scaling_profile
                  PROFILE_UDP_PRIV<br>
                </div>
                <div><br>
                </div>
                <div>then after a while the OpenSIPS load goes up to some
                  high number, the autoscaler starts opening new processes
                  up to the maximum number specified in the profile, and
                  then the load stays at that number, for example:</div>
                <div><br>
                </div>
                <div>opensips-cli -x mi get_statistics load<br>
                  {<br>
                      "load:load": 60<br>
                  }<br>
                </div>
                <div><br>
                </div>
                <div>It never changes and looks just 'stuck'.</div>
                <div><br>
                </div>
                <div>Any ideas why this is happening in my case? Or
                  should I file a bug report? Thanks.</div>
                <div><br>
                </div>
                <div>Regards,</div>
                <div>Yury.</div>
              </div>
              <br>
              <fieldset></fieldset>
              <pre>_______________________________________________
Users mailing list
<a href="mailto:Users@lists.opensips.org" target="_blank" moz-do-not-send="true">Users@lists.opensips.org</a>
<a href="http://lists.opensips.org/cgi-bin/mailman/listinfo/users" target="_blank" moz-do-not-send="true">http://lists.opensips.org/cgi-bin/mailman/listinfo/users</a>
</pre>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </body>
</html>