[OpenSIPS-Users] Autoscaler in 3.2.x
Bogdan-Andrei Iancu
bogdan at opensips.org
Thu Sep 15 06:01:30 UTC 2022
Hi Yury,
For the crash -> is there any core file to check?
For mem usage -> you should try to get a memory dump for further
investigation [1].
[1] https://opensips.org/Documentation/TroubleShooting-OutOfMem
Best regards,
Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
https://www.opensips-solutions.com
OpenSIPS Summit 27-30 Sept 2022, Athens
https://www.opensips.org/events/Summit-2022Athens/
On 9/14/22 10:13 PM, Yury Kirsanov wrote:
> Hi Bogdan,
> Thanks a lot for your help and support! The only question I now have
> is why OpenSIPS was going into a crash if all TCP processes were
> blocked waiting for connection? It was starting to consume more and
> more memory and then it was crashing with a segfault upon reaching
> the -m memory parameter. I do understand that TCP listeners were in a
> blocking mode and were not able to do any work until the session could
> be fully established, not being able to forward any SIP packets, but
> isn't that a bug that OpenSIPS was starting to eat memory and then
> crash? Do I need to open a bug report on this? Thanks!
>
> Best regards,
> Yury.
>
> On Wed, Sep 14, 2022 at 10:58 PM Bogdan-Andrei Iancu
> <bogdan at opensips.org> wrote:
>
> Hi Yury,
>
> You need to check the TCP settings and make sure your OpenSIPS
> will (1) not try to perform a TCP connect against destinations known
> to be unable to accept one (like TCP/WS end points behind NAT) - see
> the tcp_no_new_conn_bflag [1] - or (2) not block for a long time
> while attempting a connect - see the tcp_connect_timeout [2] or
> consider enabling async [3].
>
> [1] https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_no_new_conn_bflag
> [2] https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_connect_timeout
> [3] https://opensips.org/html/docs/modules/3.2.x/proto_tcp.html#idp168992
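
A minimal sketch of option (2) above, written in Python rather than OpenSIPS script. It only illustrates the general idea that bounding a connect attempt keeps a worker from hanging on a peer that never answers. The helper name try_connect and the 0.1 s value are illustrative assumptions; this is not OpenSIPS code, and the actual setting is the tcp_connect_timeout core parameter referenced in [2].

```python
import socket


def try_connect(host: str, port: int, timeout_s: float) -> bool:
    """Attempt a TCP connect, giving up after timeout_s seconds.

    Returns True if the connection was established, False otherwise.
    """
    try:
        # create_connection applies timeout_s to the connect attempt itself.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts
        # (socket.timeout is a subclass of OSError).
        return False


def demo() -> None:
    # Hypothetical local listener standing in for a reachable SIP peer.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    print("listener up:  ", try_connect("127.0.0.1", port, 0.1))
    listener.close()
    print("listener down:", try_connect("127.0.0.1", port, 0.1))


if __name__ == "__main__":
    demo()
```

The point of the design is that the caller always gets an answer within a bounded time and can move on to the next job, instead of waiting for the operating system's much longer default connect timeout.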
>
> Regards,
>
> Bogdan-Andrei Iancu
>
> On 9/13/22 12:01 PM, Yury Kirsanov wrote:
>> Hi Bogdan,
>> Thanks for this update, but it looks like I can't check the
>> autoscaler because of this first issue with the blocking TCP connect.
>> Is there a way to resolve it? Am I doing something wrong? Or does
>> it have something to do with the OpenSIPS code? And yes, you're right:
>> as soon as I restart OpenSIPS with a lot of SIP devices trying to
>> connect to it, it goes crazy, starts to consume memory and stops
>> forwarding packets, sitting there at 100% load until it runs out
>> of memory and segfaults. Sometimes I can't even restart it to get
>> back to a normal working state; it just loops into the same
>> crash whatever I try to do.
>>
>> I've compiled OpenSIPS 3.3.1 with your patch and was able to
>> start it but not sure, maybe I was just lucky this time.
>>
>> What should I do? Thanks!
>>
>> Best regards,
>> Yury.
>>
>> On Tue, 13 Sept 2022, 18:56 Bogdan-Andrei Iancu,
>> <bogdan at opensips.org> wrote:
>>
>> Hi Yury,
>>
>> It looks like you have multiple overlapping issues here. The
>> traps you sent have nothing to do with the auto-scaling,
>> but with a blocking TCP connect for SIP - most of the procs
>> get blocked in a sync TCP connect.
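
A rough back-of-the-envelope sketch of why synchronous connects can stall a worker pool, written in Python. In the worst case, where every job blocks a worker for the full duration of a connect attempt, the pool can start at most workers / block_seconds attempts per second. The function name and the numbers are hypothetical and are not measurements from this deployment.

```python
def max_connect_attempts_per_second(workers: int, block_seconds: float) -> float:
    """Upper bound on connect attempts per second for a pool of workers
    when each attempt blocks its worker for block_seconds (worst case)."""
    if workers < 0:
        raise ValueError("workers must be non-negative")
    if block_seconds <= 0:
        raise ValueError("block_seconds must be positive")
    return workers / block_seconds


if __name__ == "__main__":
    # Hypothetical pool of 16 workers, comparing a long and a short block time.
    for block_seconds in (5.0, 0.1):
        rate = max_connect_attempts_per_second(16, block_seconds)
        print(f"{block_seconds:>4} s per attempt -> at most {rate:.1f} attempts/s")
```

Any traffic arriving faster than that bound has to wait, which is why shortening the block time or avoiding the connect altogether matters more than adding a few extra workers.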
>>
>> Regards,
>>
>> Bogdan-Andrei Iancu
>>
>> On 9/12/22 4:39 PM, Yury Kirsanov wrote:
>>> Hi Bogdan,
>>> I've applied the patch (had to find where to apply it
>>> manually for 3.2.8 downloaded from the web page, line 1568
>>> instead of 1652) and restarted the server with only about
>>> 300-350 SIP devices and immediately ran into the same issue. I'm
>>> attaching two GDB dumps made within several minutes of
>>> each other. Autoscale was now OFF; please see my previous
>>> message, as currently for some reason I'm experiencing
>>> lockups even when it's off :(
>>>
>>> Best regards,
>>> Yury.
>>>
>>> On Mon, Sep 12, 2022 at 7:48 PM Bogdan-Andrei Iancu
>>> <bogdan at opensips.org> wrote:
>>>
>>> Hi Yury,
>>>
>>> Could you give this patch a try? It should fix the
>>> blocking you experience (it should apply on 3.2 too).
>>>
>>> Best regards,
>>>
>>> Bogdan-Andrei Iancu
>>>
>>> On 9/7/22 2:54 PM, Bogdan-Andrei Iancu wrote:
>>>> Hi Yury,
>>>>
>>>> Thanks for the detailed info here - let me do a review
>>>> of some code and run some tests, as at this point I
>>>> have a good idea of the direction to dig into.
>>>>
>>>> I will update here.
>>>>
>>>> Best regards,
>>>> Bogdan-Andrei Iancu
>>>>
>>>> On 9/6/22 11:24 AM, Yury Kirsanov wrote:
>>>>> Hi Bogdan,
>>>>> Yes, I'm listening on all types of sockets including
>>>>> UDP, TCP and TLS on the outside public interface and
>>>>> then forward traffic into internal LAN via UDP only.
>>>>>
>>>>> Previously it was getting stuck quite easily, now I
>>>>> had to wait for a while before this actually happened.
>>>>> I've routed part of my customers to this server to
>>>>> obtain this result so I will have to do that again.
>>>>>
>>>>> As soon as I see one of the processes stuck I'll run
>>>>> the trap command and send you all the details
>>>>> including processes load, ps output and so on.
>>>>>
>>>>> For now I had to switch autoscaling off and just
>>>>> create many listeners. Do I understand correctly that
>>>>> I need to restart OpenSIPS in order to apply
>>>>> autoscaling profiles and reload-routes is not sufficient?
>>>>>
>>>>> Also, do I need separate UDP profiles for public and
>>>>> private interfaces? And do I need to apply an autoscaling
>>>>> profile just to a socket, or do I need to specify udp_workers or
>>>>> tcp_workers with the autoscaler too?
>>>>>
>>>>> Thanks and best regards,
>>>>> Yury.
>>>>>
>>>>> On Tue, 6 Sept 2022, 18:18 Bogdan-Andrei Iancu,
>>>>> <bogdan at opensips.org> wrote:
>>>>>
>>>>> Hi Yury,
>>>>>
>>>>> Thanks for the info. I see that the stuck process
>>>>> (24) is an auto-scaled one (based on its id). Do
>>>>> you have SIP traffic from UDP to TCP, or are you doing some
>>>>> HEP capturing for SIP? I saw a recent similar
>>>>> report where a UDP auto-scaled worker got stuck
>>>>> when trying to do some communication with the TCP
>>>>> main/manager process (in order to handle a TCP
>>>>> operation).
>>>>>
>>>>> BTW, any chance to do an "opensips-cli -x trap"
>>>>> when you have that stuck process, just to see
>>>>> where it is stuck? And is it hard to reproduce? I
>>>>> may ask you to extract some information from the
>>>>> running process....
>>>>>
>>>>> Regards,
>>>>>
>>>>> Bogdan-Andrei Iancu
>>>>>
>>>>> On 9/3/22 6:54 PM, Yury Kirsanov wrote: