[OpenSIPS-Users] Mediaproxy hanging sessions on high load

Daniel Zanutti daniel.zanutti at gmail.com
Thu Mar 16 21:54:29 EDT 2017


Adrian

You may be correct, overload could be the problem. But since the calls have
already finished, how can I remove them from the relay? The real problem is
the relay process freezing; I need to avoid this.

Thanks

On Thu, Mar 16, 2017 at 10:40 PM, Adrian Georgescu <ag at ag-projects.com>
wrote:

> Perhaps your virtual machine simply cannot handle the load. The commands
> to close sessions may also be dropped or lost in such an environment.
>
> Adrian
>
>
>
> On 16 Mar 2017, at 11:22, Daniel Zanutti <daniel.zanutti at gmail.com> wrote:
>
> Hi Dan
>
> It looks like this problem only happens on virtual machines, not on
> physical machines, and only while they are under high load.
>
> But I'm not sure about the kernel rule. Is there any way to check it?
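
One way to check might be to look at the kernel's connection tracking table
for the relay's UDP ports. The sketch below is only an illustration, in
Python: `udp_entries_for_port` is a made-up helper name, and it assumes input
in the `key=value` text format printed by `conntrack -L -p udp`
(conntrack-tools) or found in /proc/net/nf_conntrack where that file is
available. Whether the relay's rules show up exactly like this on a given
system is an assumption.

```python
import re
from typing import Iterable, List


def udp_entries_for_port(lines: Iterable[str], port: int) -> List[str]:
    """Return the conntrack lines for UDP flows that mention `port`.

    Hypothetical helper: it only filters text and does not talk to the kernel.
    """
    is_udp = re.compile(r"\budp\b")
    # Match sport=<port> or dport=<port> exactly, so 4664 does not match 46640.
    has_port = re.compile(r"\b[sd]port=%d\b" % port)
    return [
        line.strip()
        for line in lines
        if is_udp.search(line) and has_port.search(line)
    ]
```

Feeding it the saved conntrack listing and one of the relay ports of a stuck
session would show whether the kernel still holds an entry for that stream. An
empty result for a session the relay still reports as active would suggest the
kernel rule is missing or was never installed.
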
>
> Please take a look at this case: this relay will never halt because there
> are more than 3k sessions that will never finish internally (the calls were
> hung up hours ago):
>
> 8 2.2.2.2 2.6.1 44h01'05" 112.03kbps 3045 audio 3045 Halting
>
> Some of these calls:
>
> 728 From: 222222 at 4.4.4.4
>     To: 33333333 at sip.aaa.com.br
>     [image: unknown agent] [image: HG4000/1.0]
>     6.6.6.6:55632 2.2.2.2:46640 2.2.2.2:46866 7.7.7.7:4170 active G729 audio 21h35'34" 0 0
> 729 From: 2222222 at 4.4.4.4:5064
>     To: 33333333 at sip.aaa.com.br
>     [image: TS-v4.6.0-11eW] [image: Agitel GSM Bridge v2.0]
>     6.6.6.6:34908 2.2.2.2:58158 2.2.2.2:54372 7.7.7.7:16846 active G729 audio 16h11'51" 0 0
> 730 From: 22222222 at 4.4.4.4
>     To: 33333333 at sip.aaa.com.br
>     [image: Mediant 2000/v.6.60A.328.003] [image: unknown agent]
>     6.6.6.6:46324 2.2.2.2:50156 2.2.2.2:48182 7.7.7.7:18516 active G729 audio 19h45'38" 0 0
> 731 From: 222222 at 4.4.4.4:5061
>     To: 33333333333 at sip.aaa.com.br
>     [image: TS-v4.6.0-14b] [image: gsm-gw-3.4.1]
>     6.6.6.6:54800 2.2.2.2:43998 2.2.2.2:46144 7.7.7.7:12360 active G729 audio 19h09'41" 0 0
> 732 From: 2222222 at 4.4.4.4
>     To: 333333333333 at sip.aaa.com.br
>     [image: Trinit IVR] [image: HG4000/1.0]
>     6.6.6.6:18854 2.2.2.2:51924 2.2.2.2:40512 7.7.7.7:4200 active G729 audio 19h37'59" 0 0
>
> Is there any way to drop these sessions? Maybe using the internal timeout
> system of mediaproxy?
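
To make the idea concrete, this is roughly the kind of cleanup timer I have in
mind. It is only a sketch in Python and not mediaproxy code: `SessionReaper`,
`touch` and `reap` are hypothetical names, and the source of the "last
activity" time is an assumption. Since the relay process does not see packets
once forwarding is done in the kernel, that time would have to come from
something else, such as periodically sampled traffic counters.

```python
import time
from typing import Callable, Dict, List


class SessionReaper:
    """Hypothetical helper that tracks per-session activity and expires idle ones."""

    def __init__(self, max_idle: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_idle = max_idle      # seconds a session may stay silent
        self.clock = clock            # injectable so the logic can be tested
        self.last_activity: Dict[str, float] = {}

    def touch(self, call_id: str) -> None:
        """Record that traffic (or any sign of life) was seen for this session."""
        self.last_activity[call_id] = self.clock()

    def remove(self, call_id: str) -> None:
        """Forget a session that ended normally."""
        self.last_activity.pop(call_id, None)

    def reap(self) -> List[str]:
        """Drop and return the sessions idle for longer than max_idle."""
        now = self.clock()
        expired = [cid for cid, seen in self.last_activity.items()
                   if now - seen > self.max_idle]
        for cid in expired:
            del self.last_activity[cid]
        return expired
```

Calling `reap()` from a periodic timer with `max_idle` set to 2 or 4 hours
would return the call IDs to tear down. Actually releasing the ports and
kernel rules for those calls is the part that would need to hook into the
relay itself.
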
>
> If you could take a look personally, we could negotiate an hourly rate.
>
> Thanks again
>
>
>
> On Thu, Mar 16, 2017 at 10:54 AM, Dan Pascu <dan at ag-projects.com> wrote:
>
>>
>> One thing came to mind. A case when the relay could get overloaded is if
>> a lot of clients start sessions and only one endpoint sends media. That is
>> the only case where the relay would have to deal with the media traffic
>> itself and having hundreds of such sessions at the same time could overload
>> the relay.
>>
>> The way the relay works is for each call it starts listening on 4 ports
>> (2 for RTP and 2 for RTCP). Each endpoint will send 2 streams (1 RTP one
>> RTCP) and initially the relay will just listen on these ports and when it
>> receives data it learns the endpoint's address. After it learns both
>> endpoints' addresses, it adds a conntrack rule in the kernel to allow the
>> kernel to directly relay the media streams between the endpoints and it
>> will never see a media packet from the endpoints again until the call ends.
>> This allows for very efficient data forwarding because it's done entirely
>> in the kernel with no data being transferred from kernel to user-space and
>> back like traditional solutions. We have seen media relays handling
>> hundreds of calls at a time with 0% CPU load on the relay.
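
The learning and hand-off described above can be modelled with a small state
sketch. This is an illustration in Python only, not mediaproxy's actual code:
`StreamSketch` and its fields are hypothetical names, and the kernel conntrack
rule is represented by a plain flag rather than a real netfilter call.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Address = Tuple[str, int]  # (ip, port)


@dataclass
class StreamSketch:
    """Hypothetical model of one relayed stream (RTP or RTCP)."""
    caller: Optional[Address] = None
    callee: Optional[Address] = None
    in_kernel: bool = False        # True once forwarding is handed to the kernel
    userspace_packets: int = 0     # packets the relay process handled itself

    def packet_received(self, side: str, source: Address) -> None:
        """Record a packet seen in user space on the 'caller' or 'callee' port."""
        if side not in ("caller", "callee"):
            raise ValueError("side must be 'caller' or 'callee'")
        if self.in_kernel:
            # After hand-off the process is not expected to see media packets.
            return
        self.userspace_packets += 1
        setattr(self, side, source)  # learn (or refresh) this endpoint's address
        if self.caller is not None and self.callee is not None:
            # Placeholder for installing the kernel conntrack rule.
            self.in_kernel = True
```

In this model a stream where only one side ever sends stays with
`in_kernel == False`, and every packet from that side increments
`userspace_packets`. With hundreds of such streams, all of that traffic is
handled by the relay process instead of the kernel.
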
>>
>> So the only thing I can think of causing something like what you describe
>> (even though I'm still not sure what you meant by hanging up sessions), is
>> that somehow this process didn't finish setting up completely and the relay
>> directly receives media streams from hundreds of devices because only one
>> endpoint sends data (or the other endpoint's data gets filtered at some
>> firewall), and because it cannot learn both endpoints' addresses it cannot
>> set up the kernel conntrack rule to move data forwarding to the kernel.
>>
>> On 14 Mar 2017, at 13:38, Dan Pascu wrote:
>>
>> >
>> > On 13 Mar 2017, at 18:58, Daniel Zanutti wrote:
>> >
>> >> Hi guys
>> >>
>> >> I sent this email a few days ago. Could anyone from the Mediaproxy team
>> >> take a look? I could debug it, I just need some directions on where to look.
>> >
>> > We have never encountered this problem, so I'm not sure what to suggest,
>> > even more so considering that the description is not very clear. What do
>> > you mean when you say the relay starts to hang some sessions? That it
>> > times out on them for not having traffic and initiates a BYE for those
>> > sessions? Because in the next paragraph you imply that they never time out.
>> >
>> >>
>> >> Thanks
>> >>
>> >> On Tue, Mar 7, 2017 at 11:10 AM, Daniel Zanutti <daniel.zanutti at gmail.com> wrote:
>> >> I'm using mediaproxy on several installations and have noticed that when
>> >> the machine is under high load (> 700 sessions), the media-relay process
>> >> starts to hang some sessions.
>> >>
>> >> These sessions don't have any RTP being sent/received anymore and they
>> >> never hang up. After some hours of frozen sessions, the media-relay
>> >> process doesn't connect to the dispatcher anymore, but keeps using high
>> >> CPU on the machine. Maybe it's stuck in an internal loop, not sure.
>> >>
>> >> Is there any solution for this? Maybe a timer to clean up old sessions
>> >> (2 or 4+ hours old).
>> >>
>> >> Thanks
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> Users mailing list
>> >> Users at lists.opensips.org
>> >> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>> >
>> >
>> > --
>> > Dan
>>
>>
>> --
>> Dan