<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <tt>Hi Ben,<br>
      <br>
      Sorry for not being able to answer you before the new set of
      backtraces was sent. Indeed, getting the corefile of only one
      process will do, as the locks (and the debug info) are in the
      shared memory. So, when the deadlock happens again, run
      "opensipsctl trap" and get the corefile of one process (ideally a
      UDP worker - get its pid via "opensipsctl fifo ps").<br>
      Keep the core, as we will have to dig into it together :).<br>
      <br>
      Many thanks,<br>
      <br>
    </tt>
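    <p><tt>The single-process core dump described above could be sketched
      in shell roughly as follows. This is an illustration, not an
      OpenSIPS tool: the helper name dump_one_core is hypothetical, the
      pid still has to be picked by hand from the "opensipsctl fifo ps"
      output, and it assumes core dumps were already enabled for the
      OpenSIPS processes (for example via "ulimit -c unlimited" before
      startup). It sends the same signal 11 as "killall -11 opensips",
      but to one pid only.</tt></p>
    <pre>
# Hypothetical helper: force a core dump of ONE OpenSIPS process.
# $1 = pid of the chosen worker, read from "opensipsctl fifo ps".
dump_one_core() {
    pid="$1"
    # Refuse anything that is not a plain number, so a typo cannot
    # turn into a signal sent to the wrong target.
    case "$pid" in
        ''|*[!0-9]*) echo "usage: dump_one_core PID"; return 2 ;;
    esac
    # Signal 11 is SIGSEGV; the process dumps core if core files are enabled.
    kill -11 "$pid"
}

# Typical sequence once the deadlock is back (run on the OpenSIPS host):
#   opensipsctl trap        # collect the backtraces first
#   opensipsctl fifo ps     # list the processes and pick a UDP worker pid
#   dump_one_core PID       # replace PID with the number picked above
</pre>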
    <pre class="moz-signature" cols="72">Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
  <a class="moz-txt-link-freetext" href="http://www.opensips-solutions.com">http://www.opensips-solutions.com</a>
OpenSIPS Bootcamp 2018
  <a class="moz-txt-link-freetext" href="http://opensips.org/training/OpenSIPS_Bootcamp_2018/">http://opensips.org/training/OpenSIPS_Bootcamp_2018/</a>
</pre>
    <div class="moz-cite-prefix">On 11/06/2018 10:14 PM, Ben Newlin
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:23D5BF0D-9B02-4EED-BA2F-3E18D1327887@genesys.com">
      <style><!--
/* Font Definitions */
@font-face
        {font-family:Wingdings;
        panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"Apple Color Emoji";
        panose-1:0 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
@font-face
        {font-family:"Courier New \,serif";
        panose-1:2 7 3 9 2 2 5 2 4 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";}
tt
        {mso-style-priority:99;
        font-family:"Courier New";}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Calibri",sans-serif;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.HTMLPreformattedChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:Consolas;}
span.EmailStyle22
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;}
span.EmailStyle23
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;}
span.EmailStyle24
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;}
span.EmailStyle25
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;}
span.EmailStyle26
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
ol
        {margin-bottom:0in;}
ul
        {margin-bottom:0in;}
--></style>
      <div class="WordSection1">
        <p class="MsoNormal"><span style="font-size:11.0pt">Bogdan,<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt">I am trying
            to obtain this information for you but I am having trouble
            getting the core files. Is it really necessary to kill every
            opensips process? This generates almost 40 core files and
            each is quite large (~1GB). I simply don’t have that disk
            space currently. I can make a change to get more but it is
            slowing the process. Would it be sufficient to get just one
            core file?<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt">Also,
            runtime inspection with gdb is possible in this case if you
            can provide me with the commands you would want to see. I
            would need very specific commands as I am not very familiar
            with gdb.<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt;color:black">Ben
            Newlin </span>
          <span style="font-size:11.0pt"><o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
        <div style="border:none;border-top:solid #B5C4DF
          1.0pt;padding:3.0pt 0in 0in 0in">
          <p class="MsoNormal"><b><span style="color:black">From: </span></b><span
              style="color:black">Bogdan-Andrei Iancu
              <a class="moz-txt-link-rfc2396E" href="mailto:bogdan@opensips.org">&lt;bogdan@opensips.org&gt;</a><br>
              <b>Date: </b>Thursday, November 1, 2018 at 1:29 PM<br>
              <b>To: </b>Ben Newlin <a class="moz-txt-link-rfc2396E" href="mailto:Ben.Newlin@genesys.com">&lt;Ben.Newlin@genesys.com&gt;</a>,
              OpenSIPS users mailling list
              <a class="moz-txt-link-rfc2396E" href="mailto:users@lists.opensips.org">&lt;users@lists.opensips.org&gt;</a><br>
              <b>Subject: </b>Re: [OpenSIPS-Users] CPU 100% with TCP<o:p></o:p></span></p>
        </div>
        <div>
          <p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
        </div>
        <p class="MsoNormal"><tt><span style="font-size:10.0pt">Hi Ben,</span></tt><span
            style="font-size:10.0pt;font-family:&quot;Courier New&quot;"><br>
            <br>
            <tt>First, be sure you have the DBG_LOCK option compiled in.
              Run "opensips -V" and check the flags in the output.</tt><br>
            <br>
            <tt>The next step will be to force a SIGSEGV on opensips
              (killall -11 opensips) when the deadlock occurs - I need a
              core file to inspect (assuming that runtime inspection
              with gdb is not possible).</tt><br>
            <br>
            <tt>Regards,</tt><br>
            <br>
          </span><o:p></o:p></p>
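        <p class="MsoNormal"><tt><span style="font-size:10.0pt">The DBG_LOCK
              check above can be scripted; a minimal shell sketch follows.
              The helper name has_dbg_lock is hypothetical, and it assumes
              that "opensips -V" prints its flag list on standard output
              and that the flag appears there literally as DBG_LOCK.</span></tt></p>
        <pre>
# Hypothetical helper: succeed if the given text lists the DBG_LOCK flag.
# $1 = the text printed by "opensips -V".
has_dbg_lock() {
    # -w matches DBG_LOCK only as a whole word, -q keeps grep silent.
    printf '%s\n' "$1" | grep -qw 'DBG_LOCK'
}

# Usage on the OpenSIPS host:
#   if has_dbg_lock "$(opensips -V)"; then
#       echo "DBG_LOCK is compiled in"
#   else
#       echo "DBG_LOCK is missing, rebuild first"
#   fi
</pre>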
        <pre>Bogdan-Andrei Iancu<o:p></o:p></pre>
        <pre><o:p> </o:p></pre>
        <pre>OpenSIPS Founder and Developer<o:p></o:p></pre>
        <pre>  <a href="http://www.opensips-solutions.com" moz-do-not-send="true">http://www.opensips-solutions.com</a><o:p></o:p></pre>
        <pre>OpenSIPS Bootcamp 2018<o:p></o:p></pre>
        <pre>  <a href="http://opensips.org/training/OpenSIPS_Bootcamp_2018/" moz-do-not-send="true">http://opensips.org/training/OpenSIPS_Bootcamp_2018/</a><o:p></o:p></pre>
        <div>
          <p class="MsoNormal">On 10/31/2018 09:07 PM, Ben Newlin wrote:<o:p></o:p></p>
        </div>
        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
          <p class="MsoNormal"><span style="font-size:11.0pt">Bogdan,</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt">For the
              first test I have done as you suggested and disabled only
              async operation for HEP, so it is still using TCP. I will
              send you the trap info directly as it is too large. I also
              compiled with the DBG_LOCK option, but am unsure whether
              that extra information will be available in the trap
              output or do you need something else?</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt">I am now
              going to switch HEP to use UDP to mirror our production
              environment and try to reproduce again. Wish me luck!
            </span><span style="font-size:11.0pt;font-family:&quot;Apple Color Emoji&quot;,serif">☺</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;color:black">Ben Newlin </span>
            <o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          <div style="border:none;border-top:solid #B5C4DF
            1.0pt;padding:3.0pt 0in 0in 0in">
            <p class="MsoNormal"><b><span style="color:black">From: </span></b><span
                style="color:black">Bogdan-Andrei Iancu
                <a href="mailto:bogdan@opensips.org"
                  moz-do-not-send="true">&lt;bogdan@opensips.org&gt;</a><br>
                <b>Date: </b>Monday, October 29, 2018 at 2:19 PM<br>
                <b>To: </b>Ben Newlin <a
                  href="mailto:Ben.Newlin@genesys.com"
                  moz-do-not-send="true">&lt;Ben.Newlin@genesys.com&gt;</a>,
                OpenSIPS users mailling list
                <a href="mailto:users@lists.opensips.org"
                  moz-do-not-send="true">&lt;users@lists.opensips.org&gt;</a><br>
                <b>Subject: </b>Re: [OpenSIPS-Users] CPU 100% with TCP</span><o:p></o:p></p>
          </div>
          <div>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          </div>
          <p class="MsoNormal">Hi Ben,<br>
            <br>
            I checked the error trace and it should not leave any
            dangling lock (due to a mishandled error). Before disabling
            HEP, try to disable the async support for HEP.<br>
            <br>
            If you see the same 100% CPU with HEP + UDP, send me a trap
            for that too, since in the previous case the deadlock was
            exclusively HEP + TCP related.
            <br>
            <br>
            Anyhow, as the original trap showed a deadlock, the next step
            will be to recompile with the DBG_LOCK option - this enables
            extra code to debug/troubleshoot locking-related issues -
            are you able to do it?<br>
            <br>
            Regards,<br>
            <br>
            <br>
            <o:p></o:p></p>
          <pre>Bogdan-Andrei Iancu<o:p></o:p></pre>
          <pre> <o:p></o:p></pre>
          <pre>OpenSIPS Founder and Developer<o:p></o:p></pre>
          <pre>  <a href="http://www.opensips-solutions.com" moz-do-not-send="true">http://www.opensips-solutions.com</a><o:p></o:p></pre>
          <pre>OpenSIPS Bootcamp 2018<o:p></o:p></pre>
          <pre>  <a href="http://opensips.org/training/OpenSIPS_Bootcamp_2018/" moz-do-not-send="true">http://opensips.org/training/OpenSIPS_Bootcamp_2018/</a><o:p></o:p></pre>
          <div>
            <p class="MsoNormal">On 10/26/2018 04:14 PM, Ben Newlin
              wrote:<o:p></o:p></p>
          </div>
          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
            <p class="MsoNormal"><span style="font-size:11.0pt">Bogdan,</span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt">Actually,
                yes we do. Looking back I can see these errors just
                before the issue occurs:</span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span
                style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif">Oct 24 19:00:36 [5700]
                ERROR:proto_hep:send_hep_message: Cannot send hep
                message!</span><o:p></o:p></p>
            <p class="MsoNormal"><span
                style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif">Oct 24 19:00:36 [5700]
                ERROR:proto_hep:msg_send: send() to 10.32.163.211:9061
                for proto hep_tcp/9 failed</span><o:p></o:p></p>
            <p class="MsoNormal"><span
                style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif">Oct 24 19:00:36 [5700]
                ERROR:proto_hep:hep_tcp_send: failed to send</span><o:p></o:p></p>
            <p class="MsoNormal"><span
                style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif">Oct 24 19:00:36 [5700]
                ERROR:proto_hep:async_tsend_stream: Failed first TCP
                async send : (32) Broken pipe</span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt">I will
                try disabling HEP and see if we can reproduce.</span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt">Just for
                information, I have been reproducing the issue in our
                testing environment which uses TCP for HEP, however the
                issue is occurring in our production environment as well
                which is still using UDP for HEP.</span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span
                style="font-size:11.0pt;color:black">Ben Newlin </span>
              <o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <div style="border:none;border-top:solid #B5C4DF
              1.0pt;padding:3.0pt 0in 0in 0in">
              <p class="MsoNormal"><b><span style="color:black">From: </span></b><span
                  style="color:black">Bogdan-Andrei Iancu
                  <a href="mailto:bogdan@opensips.org"
                    moz-do-not-send="true">&lt;bogdan@opensips.org&gt;</a><br>
                  <b>Date: </b>Friday, October 26, 2018 at 3:06 AM<br>
                  <b>To: </b>Ben Newlin <a
                    href="mailto:Ben.Newlin@genesys.com"
                    moz-do-not-send="true">&lt;Ben.Newlin@genesys.com&gt;</a>,
                  OpenSIPS users mailling list
                  <a href="mailto:users@lists.opensips.org"
                    moz-do-not-send="true">&lt;users@lists.opensips.org&gt;</a><br>
                  <b>Subject: </b>Re: [OpenSIPS-Users] CPU 100% with
                  TCP</span><o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            </div>
            <p class="MsoNormal"><tt><span style="font-size:10.0pt">Hi
                  Ben,</span></tt><span
                style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif"><br>
                <br>
              </span><tt><span style="font-size:10.0pt">Thank you for
                  the info.</span></tt><span
                style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif"><br>
                <br>
              </span><tt><span style="font-size:10.0pt">It looks like
                  the processes get stuck on a HEP-related internal
                  lock - do you see any HEP-related errors in your logs
                  prior to the deadlock?</span></tt><span
                style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif"><br>
                <br>
              </span><tt><span style="font-size:10.0pt">Also, as a PoC,
                  could you disable HEP tracing to see if the problem
                  goes away?</span></tt><span
                style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif"><br>
                <br>
              </span><tt><span style="font-size:10.0pt">Thanks,</span></tt><span
                style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif"><br>
                <br>
              </span><o:p></o:p></p>
            <pre>Bogdan-Andrei Iancu<o:p></o:p></pre>
            <pre> <o:p></o:p></pre>
            <pre>OpenSIPS Founder and Developer<o:p></o:p></pre>
            <pre>  <a href="http://www.opensips-solutions.com" moz-do-not-send="true">http://www.opensips-solutions.com</a><o:p></o:p></pre>
            <pre>OpenSIPS Bootcamp 2018<o:p></o:p></pre>
            <pre>  <a href="http://opensips.org/training/OpenSIPS_Bootcamp_2018/" moz-do-not-send="true">http://opensips.org/training/OpenSIPS_Bootcamp_2018/</a><o:p></o:p></pre>
            <div>
              <p class="MsoNormal">On 10/24/2018 10:18 PM, Ben Newlin
                wrote:<o:p></o:p></p>
            </div>
            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
              <p class="MsoNormal"><span style="font-size:11.0pt">Bogdan,</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt">I have
                  run the command but the output was too large for
                  pastebin so I have sent it to you directly.</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              <p class="MsoNormal"><span
                  style="font-size:11.0pt;color:black">Ben Newlin </span>
                <o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              <div style="border:none;border-top:solid #B5C4DF
                1.0pt;padding:3.0pt 0in 0in 0in">
                <p class="MsoNormal"><b><span style="color:black">From:
                    </span></b><span style="color:black">Bogdan-Andrei
                    Iancu
                    <a href="mailto:bogdan@opensips.org"
                      moz-do-not-send="true">&lt;bogdan@opensips.org&gt;</a><br>
                    <b>Date: </b>Wednesday, October 24, 2018 at 5:17 AM<br>
                    <b>To: </b>OpenSIPS users mailling list <a
                      href="mailto:users@lists.opensips.org"
                      moz-do-not-send="true">&lt;users@lists.opensips.org&gt;</a>, Ben Newlin <a
                      href="mailto:Ben.Newlin@genesys.com"
                      moz-do-not-send="true">&lt;Ben.Newlin@genesys.com&gt;</a><br>
                    <b>Subject: </b>Re: [OpenSIPS-Users] CPU 100% with
                    TCP</span><o:p></o:p></p>
              </div>
              <div>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              </div>
              <p class="MsoNormal"><tt><span style="font-size:10.0pt">Hi
                    Ben,</span></tt><span
                  style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif"><br>
                  <br>
                </span><tt><span style="font-size:10.0pt">Could you run
                    "opensipsctl trap"?</span></tt><span
                  style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif"><br>
                  <br>
                </span><tt><span style="font-size:10.0pt">Regards,</span></tt><span
                  style="font-size:10.0pt;font-family:&quot;Courier New ,serif&quot;,serif"><br>
                  <br>
                </span><o:p></o:p></p>
              <pre>Bogdan-Andrei Iancu<o:p></o:p></pre>
              <pre> <o:p></o:p></pre>
              <pre>OpenSIPS Founder and Developer<o:p></o:p></pre>
              <pre>  <a href="http://www.opensips-solutions.com" moz-do-not-send="true">http://www.opensips-solutions.com</a><o:p></o:p></pre>
              <pre>OpenSIPS Bootcamp 2018<o:p></o:p></pre>
              <pre>  <a href="http://opensips.org/training/OpenSIPS_Bootcamp_2018/" moz-do-not-send="true">http://opensips.org/training/OpenSIPS_Bootcamp_2018/</a><o:p></o:p></pre>
              <div>
                <p class="MsoNormal">On 10/24/2018 12:56 AM, Ben Newlin
                  wrote:<o:p></o:p></p>
              </div>
              <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
                <p class="MsoNormal"><span style="font-size:11.0pt">Hi,</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt">We
                    have implemented TCP recently and are performing
                    TCP&lt;-&gt;UDP translation on one of our proxy
                    types. This proxy only exists for that purpose;
                    there are no DB queries, REST calls, or anything
                    like that. It is designed to be very fast and high
                    throughput.</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt">Recently
                    we have found that when the remote endpoint of a TCP
                    connection is lost, i.e. the server goes down, while
                    under moderate load, OpenSIPS quickly reaches 100%
                    CPU and becomes unresponsive. When this occurs, the
                    “top” command shows that between 30% and 90% of the
                    CPU time is in System (kernel) space, and each
                    OpenSIPS TCP process shows many times its normal CPU
                    usage. We are running OpenSIPS 2.4.2 on Amazon
                    Linux.</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt">I
                    obtained as much information as I could using ps,
                    strace, and gdb here:
                    <a href="https://pastebin.com/JP3DnCqs"
                      moz-do-not-send="true">
                      https://pastebin.com/JP3DnCqs</a>. We can
                    reproduce the failure consistently by removing a
                    server during call traffic.</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt">A
                    few things I noticed:</span><o:p></o:p></p>
                <ul style="margin-top:0in" type="disc">
                  <li class="MsoListParagraph"
                    style="margin-left:0in;mso-list:l0 level1 lfo3"><span
                      style="font-size:11.0pt">The number of running
                      threads reported by OpenSIPS doesn’t align with
                      our configuration, copied here:</span>
                    <o:p></o:p></li>
                </ul>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">####### Global Parameters #########</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">&nbsp;</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">children=32</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">&nbsp;</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">#// Allow 503 to pass back to Control</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">disable_503_translation=yes</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">&nbsp;</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">#// Even though we are not receiving HEP,</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">#// this listener is required by OpenSIPS</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">#// in order to use the proto_hep module. :/</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">listen=hep_tcp:10.32.40.245:9061 use_children 1</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">&nbsp;</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">#// Configure the listeners</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">listen=udp:10.32.40.245:5060 as XXX.XXX.XXX.XXX</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">listen=tcp:10.32.40.245:5060 as XXX.XXX.XXX.XXX</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">&nbsp;</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">#// Transaction Module</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">loadmodule "tm.so"</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">modparam("tm", "restart_fr_on_each_reply", 0)</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">modparam("tm", "timer_partitions", 8)</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">modparam("tm", "onreply_avp_mode", 1)</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:8.0pt;font-family:'Courier New',serif">modparam("tm", "wt_timer", 10)</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoListParagraph"><span
                    style="font-size:11.0pt">According to the
                    documentation, if “tcp_children” is not set then the
                    value of “children” is used [1]. However, we have
                    set “children” to 32 and still have only the default
                    8 TCP processes. We also appear to have only 1 timer
                    process, although we have set the number of timer
                    partitions to 8.</span><o:p></o:p></p>
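                <p class="MsoNormal"><span style="font-size:11.0pt">One
                    way to rule out the documented fallback would be to
                    set the TCP worker count explicitly rather than
                    relying on “children”. A minimal sketch, assuming 32
                    TCP workers are wanted to mirror “children” (the
                    value is illustrative only, not a recommendation):</span><o:p></o:p></p>
                <pre>children=32
#// Hypothetical explicit setting; 32 is an assumed value mirroring children
tcp_children=32</pre>
                <p class="MsoNormal"><span style="font-size:11.0pt">If
                    the process list changes once this is set, the
                    mismatch would seem to lie in the fallback rather
                    than in how the processes are reported.</span><o:p></o:p></p>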
                <ul style="margin-top:0in" type="disc">
                  <li class="MsoListParagraph"
                    style="margin-left:0in;mso-list:l0 level1 lfo3"><span
                      style="font-size:11.0pt">The server that was
                      terminated used TCP connections exclusively, but
                      all of the CPU usage seems to be in the UDP
                      processes. The one I looked at appeared to be
                      handling a CANCEL for one of the active calls and
                      was attempting to send it out via TCP. I’m not
                      sure why it would try to relay the CANCEL, as no
                      100 Trying had been received from the server. I
                      have noticed that in 2.x OpenSIPS now sends
                      CANCELs for transactions even when no 100 Trying
                      was received. Is that intentional? RFC 3261
                      (Section 9.1) states that a CANCEL must not be
                      sent until a provisional response has been
                      received.</span>
                    <o:p></o:p></li>
                </ul>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt">Any
                    assistance with this would be appreciated.</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt">[1]
                    - <a
href="http://www.opensips.org/Documentation/Script-CoreParameters-2-4#toc66"
                      moz-do-not-send="true">
http://www.opensips.org/Documentation/Script-CoreParameters-2-4#toc66</a></span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span
                    style="font-size:11.0pt;color:black">Ben Newlin </span>
                  <o:p></o:p></p>
                <pre>_______________________________________________<o:p></o:p></pre>
                <pre>Users mailing list<o:p></o:p></pre>
                <pre><a href="mailto:Users@lists.opensips.org" moz-do-not-send="true">Users@lists.opensips.org</a><o:p></o:p></pre>
                <pre><a href="http://lists.opensips.org/cgi-bin/mailman/listinfo/users" moz-do-not-send="true">http://lists.opensips.org/cgi-bin/mailman/listinfo/users</a><o:p></o:p></pre>
              </blockquote>
            </blockquote>
          </blockquote>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </body>
</html>