<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <tt>Ben,<br>
      <br>
      The dump you sent is actually the backtrace from all the
      procs. The Holy Grail is the core file, to be inspected via
      gdb (off list, of course).<br>
      <br>
      Thanks and regards,<br>
    </tt>
    <pre class="moz-signature" cols="72">Bogdan-Andrei Iancu

OpenSIPS Founder and Developer
  <a class="moz-txt-link-freetext" href="http://www.opensips-solutions.com">http://www.opensips-solutions.com</a>
OpenSIPS Bootcamp 2018
  <a class="moz-txt-link-freetext" href="http://opensips.org/training/OpenSIPS_Bootcamp_2018/">http://opensips.org/training/OpenSIPS_Bootcamp_2018/</a>
</pre>
    <div class="moz-cite-prefix">On 11/13/2018 07:36 PM, Ben Newlin
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:FC000387-F7F1-427B-987D-E0A99A2A418A@genesys.com">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <meta name="Generator" content="Microsoft Word 15 (filtered
        medium)">
      <style><!--
/* Font Definitions */
@font-face
        {font-family:Wingdings;
        panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"Apple Color Emoji";
        panose-1:0 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
pre
        {mso-style-priority:99;
        mso-style-link:"HTML Preformatted Char";
        margin:0in;
        margin-bottom:.0001pt;
        font-size:10.0pt;
        font-family:"Courier New";}
tt
        {mso-style-priority:99;
        font-family:"Courier New";}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Calibri",sans-serif;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
span.HTMLPreformattedChar
        {mso-style-name:"HTML Preformatted Char";
        mso-style-priority:99;
        mso-style-link:"HTML Preformatted";
        font-family:Consolas;}
span.EmailStyle22
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;}
span.EmailStyle23
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;}
span.EmailStyle24
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;}
span.EmailStyle25
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;}
span.EmailStyle26
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;}
span.EmailStyle27
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:758982242;
        mso-list-type:hybrid;
        mso-list-template-ids:-1638617650 -1137403934 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
        {mso-level-start-at:0;
        mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Symbol;
        mso-fareast-font-family:Calibri;
        mso-bidi-font-family:"Times New Roman";}
@list l0:level2
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:"Courier New";}
@list l0:level3
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Wingdings;}
@list l0:level4
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Symbol;}
@list l0:level5
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:"Courier New";}
@list l0:level6
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Wingdings;}
@list l0:level7
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Symbol;}
@list l0:level8
        {mso-level-number-format:bullet;
        mso-level-text:o;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:"Courier New";}
@list l0:level9
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:Wingdings;}
@list l1
        {mso-list-id:1470592577;
        mso-list-template-ids:-654276558;}
@list l1:level1
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l1:level2
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:1.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l1:level3
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:1.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l1:level4
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:2.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l1:level5
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:2.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l1:level6
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:3.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l1:level7
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:3.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l1:level8
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:4.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l1:level9
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:4.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l2
        {mso-list-id:1794712114;
        mso-list-template-ids:916767856;}
@list l2:level1
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l2:level2
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:1.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l2:level3
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:1.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l2:level4
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:2.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l2:level5
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:2.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l2:level6
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:3.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l2:level7
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:3.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l2:level8
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:4.0in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
@list l2:level9
        {mso-level-number-format:bullet;
        mso-level-text:;
        mso-level-tab-stop:4.5in;
        mso-level-number-position:left;
        text-indent:-.25in;
        mso-ansi-font-size:10.0pt;
        font-family:Symbol;}
ol
        {margin-bottom:0in;}
ul
        {margin-bottom:0in;}
--></style>
      <div class="WordSection1">
        <p class="MsoNormal"><span style="font-size:11.0pt">Bogdan,<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt">Can you
            clarify if you’re saying you need more information beyond
            the dumps I’ve just provided to you off-list?<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt;color:black">Ben
            Newlin </span>
          <span style="font-size:11.0pt"><o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
        <div style="border:none;border-top:solid #B5C4DF
          1.0pt;padding:3.0pt 0in 0in 0in">
          <p class="MsoNormal"><b><span style="color:black">From: </span></b><span
              style="color:black">Bogdan-Andrei Iancu
              <a class="moz-txt-link-rfc2396E" href="mailto:bogdan@opensips.org">&lt;bogdan@opensips.org&gt;</a><br>
              <b>Date: </b>Tuesday, November 13, 2018 at 11:11 AM<br>
              <b>To: </b>Ben Newlin <a class="moz-txt-link-rfc2396E" href="mailto:Ben.Newlin@genesys.com">&lt;Ben.Newlin@genesys.com&gt;</a>,
              OpenSIPS users mailling list
              <a class="moz-txt-link-rfc2396E" href="mailto:users@lists.opensips.org">&lt;users@lists.opensips.org&gt;</a><br>
              <b>Subject: </b>Re: [OpenSIPS-Users] CPU 100% with TCP<o:p></o:p></span></p>
        </div>
        <div>
          <p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>
        </div>
        <p class="MsoNormal"><tt><span style="font-size:10.0pt">Hi Ben,</span></tt><span
            style="font-size:10.0pt;font-family:&quot;Courier New&quot;"><br>
            <br>
            <tt>Sorry for not being able to answer you before you sent
              the new set of backtraces. Indeed, getting the corefile of
              only one process will do, as the locks (and debug info) are
              in the shared memory. So, when the deadlock happens again,
              run "opensipsctl trap" and get the corefile of one process
              (ideally a UDP worker - get its pid via "opensipsctl fifo
              ps").</tt><br>
            <tt>Keep the core as we will have to dig into it together
              :).</tt><br>
            <br>
            <tt>Many thanks,</tt><br>
            <br>
            <br>
          </span><o:p></o:p></p>
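        <p class="MsoNormal"><span style="font-size:11.0pt">A minimal sketch of the steps above, for illustration only. It assumes that "opensipsctl fifo ps" prints one "Process:: ID=.. PID=.. Type=.." line per process, that core dumps are enabled for the running opensips processes, and that sending signal 11 to a single pid is an acceptable way to obtain its corefile. The function names first_udp_pid and dump_one_core are hypothetical.</span></p>
<pre>
```shell
# Sketch only. The "Process:: ID=.. PID=.. Type=.." layout of the
# "opensipsctl fifo ps" output is an assumption; adjust the pattern
# if your version prints something different.

# Print the PID of the first UDP worker found in the text on stdin.
first_udp_pid() {
    sed -n '/Type=.*udp/{s/.*PID=\([0-9][0-9]*\).*/\1/p;q;}'
}

# Hypothetical helper, defined here but not run: save the backtraces
# of all processes, then make one UDP worker dump core.
dump_one_core() {
    opensipsctl trap
    pid=$(opensipsctl fifo ps | first_udp_pid)
    if [ -n "$pid" ]; then
        # Signal 11 is SIGSEGV, the same signal as "killall -11".
        kill -11 "$pid"
    else
        echo "no UDP worker found in the process list"
    fi
}
```
</pre>
        <p class="MsoNormal"><span style="font-size:11.0pt">Sourcing the block only defines the two functions. Calling dump_one_core while the deadlock is in progress would run the trap first and then signal a single worker, so only one corefile is written.</span></p>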
        <pre>Bogdan-Andrei Iancu<o:p></o:p></pre>
        <div>
          <p class="MsoNormal">On 11/06/2018 10:14 PM, Ben Newlin wrote:<o:p></o:p></p>
        </div>
        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
          <p class="MsoNormal"><span style="font-size:11.0pt">Bogdan,</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt">I am
              trying to obtain this information for you but I am having
              trouble getting the core files. Is it really necessary to
              kill every opensips process? This generates almost 40 core
              files and each is quite large (~1GB). I simply don’t have
              that disk space currently. I can make a change to get more
              but it is slowing the process. Would it be sufficient to
              get just one core file?</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt">Also,
              runtime inspection with gdb is possible in this case if
              you can provide me with the commands you would want to
              see. I would need very specific commands as I am not very
              familiar with gdb.</span><o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          <p class="MsoNormal"><span
              style="font-size:11.0pt;color:black">Ben Newlin </span>
            <o:p></o:p></p>
          <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          <div style="border:none;border-top:solid #B5C4DF
            1.0pt;padding:3.0pt 0in 0in 0in">
            <p class="MsoNormal"><b><span style="color:black">From: </span></b><span
                style="color:black">Bogdan-Andrei Iancu
                <a href="mailto:bogdan@opensips.org"
                  moz-do-not-send="true">&lt;bogdan@opensips.org&gt;</a><br>
                <b>Date: </b>Thursday, November 1, 2018 at 1:29 PM<br>
                <b>To: </b>Ben Newlin <a
                  href="mailto:Ben.Newlin@genesys.com"
                  moz-do-not-send="true">&lt;Ben.Newlin@genesys.com&gt;</a>,
                OpenSIPS users mailling list
                <a href="mailto:users@lists.opensips.org"
                  moz-do-not-send="true">&lt;users@lists.opensips.org&gt;</a><br>
                <b>Subject: </b>Re: [OpenSIPS-Users] CPU 100% with TCP</span><o:p></o:p></p>
          </div>
          <div>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
          </div>
          <p class="MsoNormal"><tt><span style="font-size:10.0pt">Hi
                Ben,</span></tt><span
              style="font-size:10.0pt;font-family:&quot;Courier New&quot;"><br>
              <br>
              <tt>First, be sure you have the DBG_LOCK option compiled
                in. Run "opensips -V" and check the flags in the output.</tt><br>
              <br>
              <tt>The next step will be to force a SIGSEGV on opensips
                (killall -11 opensips) when the deadlock occurs - I need
                a core file to inspect (assuming that runtime inspection
                with gdb is not possible).</tt><br>
              <br>
              <tt>Regards,</tt><br>
              <br>
              <br>
            </span><o:p></o:p></p>
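          <p class="MsoNormal"><span style="font-size:11.0pt">The two steps above could be sketched as follows. This is an illustration, assuming that "opensips -V" prints its compile flags on standard output and that DBG_LOCK appears among them as a separate word. The function names has_dbg_lock and force_cores_if_debug_build are hypothetical.</span></p>
<pre>
```shell
# Succeed if the build flags read from stdin mention DBG_LOCK
# as a whole word.
has_dbg_lock() {
    grep -qw 'DBG_LOCK'
}

# Hypothetical helper, defined here but not run: check the flags,
# then send SIGSEGV to every opensips process so that core files
# are written while the deadlock is in progress.
force_cores_if_debug_build() {
    if opensips -V | has_dbg_lock; then
        killall -11 opensips
    else
        echo "DBG_LOCK is not compiled in; rebuild before collecting cores"
    fi
}
```
</pre>
          <p class="MsoNormal"><span style="font-size:11.0pt">The flag check comes first because a core taken from a build without DBG_LOCK would not carry the extra lock debugging information.</span></p>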
          <pre>Bogdan-Andrei Iancu<o:p></o:p></pre>
          <div>
            <p class="MsoNormal">On 10/31/2018 09:07 PM, Ben Newlin
              wrote:<o:p></o:p></p>
          </div>
          <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
            <p class="MsoNormal"><span style="font-size:11.0pt">Bogdan,</span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt">For the
                first test I have done as you suggested and disabled
                only async operation for HEP, so it is still using TCP.
                I will send you the trap info directly as it is too
                large. I also compiled with the DBG_LOCK option, but am
                unsure whether that extra information will be available
                in the trap output or do you need something else?</span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt">I am now
                going to switch HEP to use UDP to mirror our production
                environment and try to reproduce again. Wish me luck!
              </span><span
                style="font-size:11.0pt;font-family:&quot;Apple Color Emoji&quot;">☺</span><o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <p class="MsoNormal"><span
                style="font-size:11.0pt;color:black">Ben Newlin </span>
              <o:p></o:p></p>
            <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            <div style="border:none;border-top:solid #B5C4DF
              1.0pt;padding:3.0pt 0in 0in 0in">
              <p class="MsoNormal"><b><span style="color:black">From: </span></b><span
                  style="color:black">Bogdan-Andrei Iancu
                  <a href="mailto:bogdan@opensips.org"
                    moz-do-not-send="true">&lt;bogdan@opensips.org&gt;</a><br>
                  <b>Date: </b>Monday, October 29, 2018 at 2:19 PM<br>
                  <b>To: </b>Ben Newlin <a
                    href="mailto:Ben.Newlin@genesys.com"
                    moz-do-not-send="true">&lt;Ben.Newlin@genesys.com&gt;</a>,
                  OpenSIPS users mailling list
                  <a href="mailto:users@lists.opensips.org"
                    moz-do-not-send="true">&lt;users@lists.opensips.org&gt;</a><br>
                  <b>Subject: </b>Re: [OpenSIPS-Users] CPU 100% with
                  TCP</span><o:p></o:p></p>
            </div>
            <div>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
            </div>
            <p class="MsoNormal">Hi Ben,<br>
              <br>
              I checked the error trace and it should not leave any
              dangling lock (due to a mishandled error). Before disabling
              HEP, try to disable the async support for HEP.<br>
              <br>
              If the same 100% CPU also happens with HEP +
              UDP, send me a trap for that too, as in the previous case
              the deadlock was exclusively HEP + TCP related.
              <br>
              <br>
              Anyhow, as the original trap showed a deadlock, the next step
              will be to recompile with the DBG_LOCK option - this
              enables extra code to debug/troubleshoot locking-related
              issues - are you able to do it?<br>
              <br>
              Regards,<br>
              <br>
              <br>
              <br>
              <o:p></o:p></p>
            <pre>Bogdan-Andrei Iancu<o:p></o:p></pre>
            <div>
              <p class="MsoNormal">On 10/26/2018 04:14 PM, Ben Newlin
                wrote:<o:p></o:p></p>
            </div>
            <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
              <p class="MsoNormal"><span style="font-size:11.0pt">Bogdan,</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt">Actually,
                  yes we do. Looking back I can see these errors just
                  before the issue occurs:</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              <p class="MsoNormal"><span
                  style="font-size:10.0pt;font-family:&quot;Courier New&quot;">Oct 24 19:00:36 [5700]
                  ERROR:proto_hep:send_hep_message: Cannot send hep
                  message!</span><o:p></o:p></p>
              <p class="MsoNormal"><span
                  style="font-size:10.0pt;font-family:&quot;Courier New&quot;">Oct 24 19:00:36 [5700]
                  ERROR:proto_hep:msg_send: send() to 10.32.163.211:9061
                  for proto hep_tcp/9 failed</span><o:p></o:p></p>
              <p class="MsoNormal"><span
                  style="font-size:10.0pt;font-family:&quot;Courier New&quot;">Oct 24 19:00:36 [5700]
                  ERROR:proto_hep:hep_tcp_send: failed to send</span><o:p></o:p></p>
              <p class="MsoNormal"><span
                  style="font-size:10.0pt;font-family:&quot;Courier New&quot;">Oct 24 19:00:36 [5700]
                  ERROR:proto_hep:async_tsend_stream: Failed first TCP
                  async send : (32) Broken pipe</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
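              <p class="MsoNormal"><span style="font-size:11.0pt">For anyone checking their own logs for the same pattern, a small filter along these lines may help. It is only a sketch: it assumes the "ERROR:proto_hep:" prefix shown in the lines above, and the function name hep_errors is hypothetical.</span></p>
<pre>
```shell
# Print only the proto_hep error lines from the log text on stdin.
hep_errors() {
    grep 'ERROR:proto_hep:'
}
```
</pre>
              <p class="MsoNormal"><span style="font-size:11.0pt">It could be fed with the syslog file that OpenSIPS writes to, for example "cat /var/log/messages | hep_errors"; that path is an assumption and depends on the local syslog setup.</span></p>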
              <p class="MsoNormal"><span style="font-size:11.0pt">I will
                  try disabling HEP and see if we can reproduce.</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt">Just
                  for information, I have been reproducing the issue in
                  our testing environment which uses TCP for HEP,
                  however the issue is occurring in our production
                  environment as well which is still using UDP for HEP.</span><o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              <p class="MsoNormal"><span
                  style="font-size:11.0pt;color:black">Ben Newlin </span>
                <o:p></o:p></p>
              <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              <div style="border:none;border-top:solid #B5C4DF
                1.0pt;padding:3.0pt 0in 0in 0in">
                <p class="MsoNormal"><b><span style="color:black">From:
                    </span></b><span style="color:black">Bogdan-Andrei
                    Iancu
                    <a href="mailto:bogdan@opensips.org"
                      moz-do-not-send="true">&lt;bogdan@opensips.org&gt;</a><br>
                    <b>Date: </b>Friday, October 26, 2018 at 3:06 AM<br>
                    <b>To: </b>Ben Newlin <a
                      href="mailto:Ben.Newlin@genesys.com"
                      moz-do-not-send="true">&lt;Ben.Newlin@genesys.com&gt;</a>,
                    OpenSIPS users mailling list
                    <a href="mailto:users@lists.opensips.org"
                      moz-do-not-send="true">&lt;users@lists.opensips.org&gt;</a><br>
                    <b>Subject: </b>Re: [OpenSIPS-Users] CPU 100% with
                    TCP</span><o:p></o:p></p>
              </div>
              <div>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
              </div>
              <p class="MsoNormal"><tt><span style="font-size:10.0pt">Hi
                    Ben,</span></tt><span
                  style="font-size:10.0pt;font-family:&quot;Courier New&quot;"><br>
                  <br>
                  <tt>Thank you for the info.</tt><br>
                  <br>
                  <tt>It looks like the processes get stuck on a
                    HEP-related internal lock - do you see any HEP-related
                    errors in your logs prior to the deadlock?</tt><br>
                  <br>
                  <tt>Also, as a PoC, could you disable HEP tracing to
                    see if the problem goes away?</tt><br>
                  <br>
                  <tt>Thanks,</tt><br>
                  <br>
                  <br>
                  <br>
                  <br>
                  <br>
                </span><o:p></o:p></p>
              <pre>Bogdan-Andrei Iancu<o:p></o:p></pre>
              <div>
                <p class="MsoNormal">On 10/24/2018 10:18 PM, Ben Newlin
                  wrote:<o:p></o:p></p>
              </div>
              <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
                <p class="MsoNormal"><span style="font-size:11.0pt">Bogdan,</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt">I
                    have run the command but the output was too large
                    for pastebin so I have sent it to you directly.</span><o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <p class="MsoNormal"><span
                    style="font-size:11.0pt;color:black">Ben Newlin </span>
                  <o:p></o:p></p>
                <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                <div style="border:none;border-top:solid #B5C4DF
                  1.0pt;padding:3.0pt 0in 0in 0in">
                  <p class="MsoNormal"><b><span style="color:black">From:
                      </span></b><span style="color:black">Bogdan-Andrei
                      Iancu
                      <a href="mailto:bogdan@opensips.org"
                        moz-do-not-send="true">&lt;bogdan@opensips.org&gt;</a><br>
                      <b>Date: </b>Wednesday, October 24, 2018 at 5:17
                      AM<br>
                      <b>To: </b>OpenSIPS users mailling list <a
                        href="mailto:users@lists.opensips.org"
                        moz-do-not-send="true">
                        &lt;users@lists.opensips.org&gt;</a>, Ben Newlin
                      <a href="mailto:Ben.Newlin@genesys.com"
                        moz-do-not-send="true">
                        &lt;Ben.Newlin@genesys.com&gt;</a><br>
                      <b>Subject: </b>Re: [OpenSIPS-Users] CPU 100%
                      with TCP</span><o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                </div>
                <p class="MsoNormal"><tt><span style="font-size:10.0pt">Hi
                      Ben,</span></tt><span
                    style="font-size:10.0pt;font-family:&quot;Courier New&quot;"><br>
                    <br>
                    <tt>Could you run "opensipsctl trap" ?</tt><br>
                    <br>
                    <tt>Regards,</tt><br>
                    <br>
                    <br>
                    <br>
                    <br>
                    <br>
                  </span><o:p></o:p></p>
                <pre>Bogdan-Andrei Iancu<o:p></o:p></pre>
                <div>
                  <p class="MsoNormal">On 10/24/2018 12:56 AM, Ben
                    Newlin wrote:<o:p></o:p></p>
                </div>
                <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
                  <p class="MsoNormal"><span style="font-size:11.0pt">Hi,</span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt">We
                      have implemented TCP recently and are performing
                      TCP&lt;-&gt;UDP translation on one of our proxy
                      types. This proxy only exists for that purpose;
                      there are no DB queries, REST calls, or anything
                      like that. It is designed to be very fast and high
                      throughput.</span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt">Recently
                      we have found that when the remote endpoint of a
                      TCP connection is lost (i.e. the server goes down)
                      while under moderate load, OpenSIPS quickly reaches
                      100% CPU and becomes unresponsive. When this
                      occurs, the “top” command shows that between
                      30% and 90% of CPU time is in System (kernel) space,
                      and each OpenSIPS TCP process shows many times its
                      normal CPU usage. We are running OpenSIPS 2.4.2 on
                      Amazon Linux.</span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt">I
                      obtained as much information as I could using ps,
                      strace, and gdb here:
                      <a href="https://pastebin.com/JP3DnCqs"
                        moz-do-not-send="true">
                        https://pastebin.com/JP3DnCqs</a>. We can
                      reproduce the failure consistently by removing a
                      server during call traffic.</span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt">A
                      few things I noticed:</span><o:p></o:p></p>
                  <ul style="margin-top:0in" type="disc">
                    <li class="MsoListParagraph"
                      style="margin-left:0in;mso-list:l0 level1 lfo3"><span
                        style="font-size:11.0pt">The number of running
                        threads reported by OpenSIPS doesn’t align with
                        our configuration, copied here:</span>
                      <o:p></o:p></li>
                  </ul>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">####### Global Parameters #########</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">children=32</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">#// Allow 503 to pass back to Control</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">disable_503_translation=yes</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">#// Even though we are not receiving HEP,</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">#// this listener is required by OpenSIPS</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">#// in order to use the proto_hep module. :/</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">listen=hep_tcp:10.32.40.245:9061 use_children 1</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">#// Configure the listeners</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">listen=udp:10.32.40.245:5060 as XXX.XXX.XXX.XXX</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">listen=tcp:10.32.40.245:5060 as XXX.XXX.XXX.XXX</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">#// Transaction Module</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">loadmodule "tm.so"</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">modparam("tm", "restart_fr_on_each_reply", 0)</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">modparam("tm", "timer_partitions", 8)</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">modparam("tm", "onreply_avp_mode", 1)</span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:8.0pt;font-family:&quot;Courier New&quot;">modparam("tm", "wt_timer", 10)</span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoListParagraph"><span
                      style="font-size:11.0pt">According to the
                      documentation, if “tcp_children” is not set then
                      the value of “children” will be used [1], but we
                      have set “children” to 32 and only have the
                      default 8 TCP processes. Also, we appear to have
                      only 1 timer process, although we have set the
                      number of timer partitions to 8.</span><o:p></o:p></p>
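                  <p class="MsoNormal"><span style="font-size:11.0pt">For
                      illustration, the TCP worker count could also be
                      set explicitly rather than inherited from
                      “children”. This is only a minimal sketch: it
                      assumes “tcp_children” behaves as described in
                      [1], and whether it changes the process count
                      observed above has not been verified.</span><o:p></o:p></p>
                  <pre># Hypothetical addition to the global parameters above (not part of
# the original configuration): set the TCP worker count explicitly
# instead of relying on the documented fallback to "children".
children=32
tcp_children=32</pre>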
                  <ul style="margin-top:0in" type="disc">
                    <li class="MsoListParagraph"
                      style="margin-left:0in;mso-list:l0 level1 lfo3"><span
                        style="font-size:11.0pt">The server that was
                        terminated was using TCP connections
                        exclusively, but all of the CPU usage seems to be
                        in the UDP threads. The one I looked at appeared to
                        be handling a CANCEL to one of the calls that
                        was active and was attempting to send it out via
                        TCP. I’m not sure why it would be trying to
                        relay the CANCEL as no 100 Trying had been
                        received from the server. I have noticed that in
                        2.x OpenSIPS will now send CANCELs for
                        transactions even when 100 Trying was not
                        received. Is that intentional? RFC 3261 states
                        that no CANCEL should be sent unless a
                        provisional response has been received.</span>
                      <o:p></o:p></li>
                  </ul>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt">Any
                      assistance with this would be appreciated.</span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt">[1]
                      - <a
href="http://www.opensips.org/Documentation/Script-CoreParameters-2-4#toc66"
                        moz-do-not-send="true">
http://www.opensips.org/Documentation/Script-CoreParameters-2-4#toc66</a></span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span style="font-size:11.0pt"> </span><o:p></o:p></p>
                  <p class="MsoNormal"><span
                      style="font-size:11.0pt;color:black">Ben Newlin </span>
                    <o:p></o:p></p>
                </blockquote>
              </blockquote>
            </blockquote>
          </blockquote>
        </blockquote>
        <p class="MsoNormal"><span style="font-size:11.0pt"><br>
            <br>
            <o:p></o:p></span></p>
      </div>
    </blockquote>
    <br>
  </body>
</html>