[OpenSIPS-Users] Autoscaler in 3.2.x

Yury Kirsanov y.kirsanov at gmail.com
Wed Sep 14 19:13:42 UTC 2022


Hi Bogdan,
Thanks a lot for your help and support! The only question I now have is
why OpenSIPS crashed when all TCP processes were blocked waiting for a
connection. It kept consuming more and more memory and then crashed with a
segfault upon reaching the limit set by the -m memory parameter. I do
understand that the TCP listeners were in blocking mode and could not do
any work, including forwarding SIP packets, until the session could be
fully established, but isn't it a bug that OpenSIPS started eating memory
and then crashed? Do I need to open a bug report on this? Thanks!

Best regards,
Yury.

On Wed, Sep 14, 2022 at 10:58 PM Bogdan-Andrei Iancu <bogdan at opensips.org>
wrote:

> Hi Yury,
>
> You need to check the TCP settings and make sure your OpenSIPS will (1)
> not try to perform a TCP connect against destinations known to be unable
> to accept one (like TCP/WS endpoints behind NAT) - see the
> tcp_no_new_conn_bflag [1] - or (2) not block for a long time while
> attempting a connect - see the tcp_connect_timeout [2] or consider
> enabling async [3].
>
> [1]
> https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_no_new_conn_bflag
> [2]
> https://www.opensips.org/Documentation/Script-CoreParameters-3-2#tcp_connect_timeout
> [3] https://opensips.org/html/docs/modules/3.2.x/proto_tcp.html#idp168992
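>
> A minimal sketch of how these three settings might look in opensips.cfg.
> This is an illustration under assumptions, not a tested setup: the flag
> name TCP_NO_CONNECT and the timeout value are arbitrary placeholders,
> the timeout is assumed to be expressed in milliseconds, and tcp_async is
> assumed to be the proto_tcp parameter that [3] refers to. Please verify
> each against the 3.2 documentation linked above.
>
> ```
> # (1) Name the branch flag that forbids opening new TCP connections.
> #     The routing script still has to set this flag on branches whose
> #     destination cannot accept a connect (e.g. endpoints behind NAT).
> tcp_no_new_conn_bflag = TCP_NO_CONNECT
>
> # (2) Bound the time a worker may block in a TCP connect attempt.
> #     Illustrative value; assumed unit is milliseconds.
> tcp_connect_timeout = 100
>
> # (3) Ask proto_tcp to perform connects/writes asynchronously.
> #     Assumed parameter name; see [3].
> loadmodule "proto_tcp.so"
> modparam("proto_tcp", "tcp_async", 1)
> ```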
>
> Regards,
>
> Bogdan-Andrei Iancu
>
> OpenSIPS Founder and Developer
>   https://www.opensips-solutions.com
> OpenSIPS Summit 27-30 Sept 2022, Athens
>   https://www.opensips.org/events/Summit-2022Athens/
>
> On 9/13/22 12:01 PM, Yury Kirsanov wrote:
>
> Hi Bogdan,
> Thanks for this update, but it looks like I can't check the autoscaler
> because of this first issue with the blocking TCP connect. Is there a way
> to resolve it? Am I doing something wrong? Or does it have something to do
> with the OpenSIPS code? And yes, you're right: as soon as I restart
> OpenSIPS while a lot of SIP devices are trying to connect to it, it goes
> crazy, starts consuming memory and stops forwarding packets, sitting at
> 100% load until it runs out of memory and segfaults. Sometimes I can't
> even restart it to bring it back to a normal working state; it just loops
> into the same crash whatever I try to do.
>
> I've compiled OpenSIPS 3.3.1 with your patch and was able to start it, but
> I'm not sure whether the patch helped or I was just lucky this time.
>
> What should I do? Thanks!
>
> Best regards,
> Yury.
>
> On Tue, 13 Sept 2022, 18:56 Bogdan-Andrei Iancu, <bogdan at opensips.org>
> wrote:
>
>> Hi Yury,
>>
>> it looks like you have multiple overlapping issues here. The traps you
>> sent have nothing to do with auto-scaling, but with a blocking TCP
>> connect for SIP - most of the procs get blocked in a sync TCP connect.
>>
>> Regards,
>>
>> Bogdan-Andrei Iancu
>>
>> On 9/12/22 4:39 PM, Yury Kirsanov wrote:
>>
>> Hi Bogdan,
>> I've applied the patch (I had to find where to apply it manually for
>> 3.2.8 downloaded from the web page: line 1568 instead of 1652) and
>> restarted the server with only about 300-350 SIP devices, and immediately
>> ran into the same issue. I'm attaching two GDB dumps made within several
>> minutes of each other. Autoscaling was OFF this time; please see my
>> previous message, as for some reason I'm currently experiencing lockups
>> even when it's off :(
>>
>>
>> Best regards,
>> Yury.
>>
>> On Mon, Sep 12, 2022 at 7:48 PM Bogdan-Andrei Iancu <bogdan at opensips.org>
>> wrote:
>>
>>> Hi Yury,
>>>
>>> Could you give this patch a try? It should fix the blocking you are
>>> experiencing (it should apply on 3.2 too).
>>>
>>> Best regards,
>>>
>>> Bogdan-Andrei Iancu
>>>
>>> On 9/7/22 2:54 PM, Bogdan-Andrei Iancu wrote:
>>>
>>> Hi Yury,
>>>
>>> Thanks for the detailed info here - let me review some code and run
>>> some tests, as at this point I have a good idea of the direction to dig
>>> into.
>>>
>>> I will update here.
>>>
>>> Best regards,
>>>
>>> Bogdan-Andrei Iancu
>>>
>>> On 9/6/22 11:24 AM, Yury Kirsanov wrote:
>>>
>>> Hi Bogdan,
>>> Yes, I'm listening on all types of sockets, including UDP, TCP and TLS,
>>> on the outside public interface and then forwarding traffic into the
>>> internal LAN via UDP only.
>>>
>>> Previously it was getting stuck quite easily, now I had to wait for a
>>> while before this actually happened. I've routed part of my customers to
>>> this server to obtain this result so I will have to do that again.
>>>
>>> As soon as I see one of the processes stuck I'll run the trap command
>>> and send you all the details, including process load, ps output and so on.
>>>
>>> For now I had to switch autoscaling off and just create many listeners.
>>> Do I understand correctly that I need to restart OpenSIPS in order to
>>> apply autoscaling profiles, and that reload-routes is not sufficient?
>>>
>>> Also, do I need separate UDP profiles for public and private interfaces?
>>> And do I need to apply an autoscaling profile just to a socket, or do I
>>> need to specify udp_workers or tcp_workers with the autoscaler too?
>>>
>>> Thanks and best regards,
>>> Yury.
>>>
>>> On Tue, 6 Sept 2022, 18:18 Bogdan-Andrei Iancu, <bogdan at opensips.org>
>>> wrote:
>>>
>>>> Hi Yury,
>>>>
>>>> Thanks for the info. I see that the stuck process (24) is an
>>>> auto-scaled one (based on its id). Do you have SIP traffic going from
>>>> UDP to TCP, or are you doing some HEP capturing for SIP? I saw a recent
>>>> similar report where an auto-scaled UDP worker got stuck when trying to
>>>> communicate with the TCP main/manager process (in order to handle a TCP
>>>> operation).
>>>>
>>>> BTW, any chance you could run "opensips-cli -x trap" when you have that
>>>> stuck process, just to see where it is stuck? And is it hard to
>>>> reproduce? I may ask you to extract some information from the running
>>>> process....
>>>>
>>>> Regards,
>>>>
>>>> Bogdan-Andrei Iancu
>>>>
>>>> On 9/3/22 6:54 PM, Yury Kirsanov wrote:
>>>>
>>>
>>>
>>> _______________________________________________
>>> Users mailing list
>>> Users at lists.opensips.org
>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>>>
>>>
>>>
>>
>