[OpenSIPS-Users] Memory Leak

Johan De Clercq Johan at democon.be
Wed Dec 4 15:20:13 EST 2019


Check the RTCP mux/demux settings.

On Wed, 4 Dec 2019, 20:22 Callum Guy, <callum.guy at x-on.co.uk> wrote:

> Thanks Johan, I've been through the rtpengine docs and source and can't
> see any way to disable the RTCP call stat reports. Having said that I'm no
> longer confident that RTPEngine is the problem - I'm losing memory
> constantly at a rate of about 130MB/hour at all times, even overnight when
> we are not seeing many calls. This seems to rule out the RTP reports or
> anything call load related.
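One way to narrow down a steady growth figure like the ~130MB/hour quoted above is to sample each process's resident set size and fit a slope per process. The following is only a sketch under assumptions: it relies on the Linux `/proc/<pid>/status` format (a `VmRSS:` line in kB), and the helper names `parse_vmrss_kb` and `growth_mb_per_hour` are made up for illustration, not part of OpenSIPS.

```python
def parse_vmrss_kb(status_text: str) -> int:
    """Extract the resident set size (kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     214628 kB"
            return int(line.split()[1])
    raise ValueError("no VmRSS line found")


def growth_mb_per_hour(samples):
    """Least-squares slope of (unix_seconds, rss_kb) samples, in MB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    num = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples share the same timestamp")
    kb_per_second = num / den
    return kb_per_second * 3600 / 1024


# Synthetic samples for one process, taken an hour apart (illustrative values).
samples = [
    (0, 1_000_000),
    (3600, 1_000_000 + 130 * 1024),
    (7200, 1_000_000 + 260 * 1024),
]
print(round(growth_mb_per_hour(samples), 1))
```

Run against every OpenSIPS PID, a comparison of the per-process slopes would show whether the growth sits in the UDP receivers, a timer process, or somewhere else entirely.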
>
> I now wonder if the issue is related to registrations and NAT pings (~3500
> registrations). This is the only constant activity on the system overnight
> (other than OpenSIPS clustering?). NAT pings are enabled for all contacts,
> and many of the handsets are also pinging the server, according to their
> client config.
>
> At this time I am not sure where to focus my attention - I've collected
> some data from my server (version, relevant module configs, memory dump) in
> the hope that someone might be able to spot a problem with my setup - you
> can find this at the following link:
>
>
> https://gist.githubusercontent.com/spacetourist/2b880c71eda54bcabf691282d6389209/raw/25c42b90ffa4e592cd761237dffca6f85aabbf4f/gistfile1.txt
>
> Please let me know if anything else would be useful. All ideas are very
> much appreciated, thanks.
>
> Callum
>
>
> On Tue, 3 Dec 2019 at 17:31, Johan De Clercq <Johan at democon.be> wrote:
>
>> I think you can. Check the rtpengine documentation on GitHub. And if
>> you can, please use the latest commit.
>>
>> On Tue, 3 Dec 2019, 18:02 Callum Guy, <callum.guy at x-on.co.uk> wrote:
>>
>>> Hi All,
>>>
>>> I'm working through this memory issue and have some additional data. The
>>> server crashed this afternoon due to memory exhaustion on a UDP listener
>>> process, as far as I know at this time. Note that the processes all have
>>> 4GB assigned so this is a gradual and constant growth issue.
>>>
>>> At this point I suspect that it could be related to the RTPEngine
>>> module - the first memory allocation error messages are all related to
>>> what I believe to be RTCP session reports coming in from the RTPEngine
>>> servers. These came through about every second for an hour before
>>> OpenSIPS finally gave up the ghost and restarted.
>>>
>>> Is this normal behaviour? We have only recently moved to RTPEngine from
>>> RTPProxy so I am new to this software. Does anyone know if it is possible
>>> to prevent RTPEngine from sending this data to OpenSIPS - it's not
>>> something that we require and I'd like to check whether this is at all
>>> related to the memory growth. Any other ideas would also be appreciated!
>>>
>>> Many thanks,
>>>
>>> Callum
>>>
>>> 2019-12-03T11:48:14.653225+00:00 TH-P-SIPREG-1 opensips[2521]:
>>> ERROR:core:fm_malloc: not enough free pkg memory (214628248 bytes left,
>>> need 536), please increase the "-M" command line parameter!
>>> 2019-12-03T11:48:14.653851+00:00 TH-P-SIPREG-1 opensips[2521]:
>>> INFO:core:fm_malloc: attempting defragmentation...
>>> 2019-12-03T11:48:15.651495+00:00 TH-P-SIPREG-1 opensips[2521]:
>>> INFO:core:fm_malloc: unable to alloc a big enough fragment!
>>> 2019-12-03T11:48:15.652218+00:00 TH-P-SIPREG-1 opensips[2521]:
>>> ERROR:rtpengine:rtpe_function_call: failed to decode bencoded reply from
>>> proxy: d7:createdi1575373650e10:created_usi505483e11:last
>>> signali1575373670e4:SSRCd10:3831331386d11:average
>>> MOSd3:MOSi44e15:round-trip timei6754e6:jitteri0e11:packet
>>> lossi0e7:samplesi9ee10:lowest MOSd3:MOSi44e15:round-trip
>>> timei6175e6:jitteri0e11:packet lossi0e11:reported ati1575373656ee11:highest
>>> MOSd3:MOSi44e15:round-trip timei6175e6:jitteri0e11:packet
>>> lossi0e11:reported ati1575373656ee15:MOS
>>> progressiond8:intervali3e7:entriesld3:MOSi44e15:round-trip
>>> timei6175e6:jitteri0e11:packet lossi0e11:reported
>>> ati1575373656eed3:MOSi44e15:round-trip timei6307e6:jitteri0e11:packet
>>> lossi0e11:reported ati1575373660eed3:MOSi44e15:round-trip
>>> timei6884e6:jitteri0e11:packet lossi0e11:reported
>>> ati1575373664eed3:MOSi44e15:round-trip timei7016e6:jitteri0e11:packet
>>> lossi0e11:reported ati1575373668eed3:MOSi44e15:round-trip
>>> timei6368e6:jitteri0e11:packet lossi0e11:reported
>>> ati1575373674eed3:MOSi44e15:round-trip timei6929e6:jitteri0e11:packet
>>> lossi0e11:reported ati1575373679eed3:MOSi44e15:round-trip
>>> timei6884e6:jitteri0e11:packet lossi0e11:reported
>>> ati1575373684eed3:MOSi44e15:round-trip timei7241e6:jitteri0e11:packet
>>> lossi0e11:reported ati1575373689eed3:MOSi44e15:round-trip
>>> timei6988e6:jitteri0e11:packet lossi0e11:reported
>>> ati1575373694eeeee10:1028099660d11:average MOSd3:MOSi43e15:round-trip
>>> timei29936e6:jitteri0e11:packet lossi0e7:samplesi7ee10:lowest
>>> MOSd3:MOSi43e15:round-trip timei29515e6:jitteri0e11:packet
>>> lossi0e11:reported ati1575373661ee11:highest MOSd3:MOSi43e15:round-trip
>>> timei29515e6:jitteri0e11:packet lossi0e11:reported ati1575373661ee15:MOS
>>> progressiond8:intervali3e7:entriesld3:MOSi43e15:round-trip
>>> timei29515e6:jitteri0e11:packet lossi0e11:reported
>>> ati1575373661eed3:MOSi43e15:round-trip timei29817e6:jitteri0e11:packet
>>> lossi0e11:reported ati1575373666eed3:MOSi43e15:round-trip
>>> timei30695e6:jitteri0e11:packet lossi0e11:reported
>>> ati1575373671eed3:MOSi43e15:round-trip timei29541e6:jitteri0e11:packet
>>> lossi0e11:reported ati1575373676eed3:MOSi43e15:round-trip
>>> timei30073e6:jitteri0e11:packet lossi0e11:reported
>>> ati1575373681eed3:MOSi43e15:round-trip timei29526e6:jitteri0e11:packet
>>> lossi0e11:reported ati1575373686eed3:MOSi43e15:round-trip
>>> timei30391e6:jitteri0e11:packet lossi0e11:reported
>>> ati1575373691eeeeee4:tagsd10:3044367218d3:tag10:30443672187:createdi1575373650e16:in
>>> dialogue
>>> with13:N8Nr5mDpcNt1a6:mediasld5:indexi1e4:type5:audio8:protocol7:RTP/AVP7:streamsld10:local
>>> porti32494e8:endpointd6:family4:IPv47:address13:51.51.51.514:porti47360ee19:advertised
>>> endpointd6:family4:IPv47:address9:10.0.0.154:porti12672ee11:last
>>> packeti1575373694e5:flagsl3:RTP6:filled9:confirmed10:kernelizede4:SSRCi1028099660e5:statsd7:packetsi2045e5:bytesi351740e6:errorsi0eeed10:local
>>> porti32495e8:endpointd6:family4:IPv47:address13:51.51.51.514:porti47361ee19:advertised
>>> endpointd6:family4:IPv47:address9:10.0.0.154:porti12673ee11:last
>>> packeti1575373691e5:flagsl4:RTCP6:filled9:confirmed10:kernelized17:no
>>> kernel
>>> supporte4:SSRCi1028099660e5:statsd7:packetsi8e5:bytesi1240e6:errorsi0eeee5:flagsl11:initialized4:send4:recveeee13:N8Nr5mDpcNt1ad3:tag13:N8Nr5mDpcNt1a7:createdi1575373650e16:in
>>> dialogue
>>> with10:30443672186:mediasld5:indexi1e4:type5:audio8:protocol7:RTP/AVP7:streamsld10:local
>>> porti34226e8:endpointd6:family4:IPv47:address15:192.168.163.2224:porti11484ee19:advertised
>>> endpointd6:family4:IPv47:address15:192.168.163.2224:porti11484ee11:last
>>> packeti1575373694e5:flagsl3:RTP6:filled9:confirmed10:kernelizede4:SSRCi3831331386e5:statsd7:packetsi2018e5:bytesi347096e6:errorsi0eeed10:local
>>> porti34227e8:endpointd6:family4:IPv47:address15:192.168.163.2224:porti11485ee19:advertised
>>> endpointd6:family4:IPv47:address15:192.168.163.2224:porti11485ee11:last
>>> packeti1575373694e5:flagsl4:RTCP6:filled9:confirmed10:kernelized17:no
>>> kernel
>>> supporte4:SSRCi3831331386e5:statsd7:packetsi9e5:bytesi1008e6:errorsi0eeee5:flagsl11:initialized4:send4:recv15:ICE
>>> controllingeeeee6:totalsd3:RTPd7:packetsi4063e5:bytesi698836e6:errorsi0ee4:RTCPd7:packetsi17e5:bytesi2248e6:errorsi0eee6:result2:oke
>>>
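The reply in the log above is bencoded, the encoding used on rtpengine's control channel: `d` opens a dictionary, `l` opens a list, integers are written `i<digits>e`, strings are `<length>:<bytes>`, and `e` closes a container. As a reading aid, here is a minimal decoder sketch in Python. It is an illustration only, not the code the OpenSIPS rtpengine module uses, and the function names are hypothetical. Note that the mail client has wrapped the logged reply across lines, so it would need rejoining before it could be decoded; the example uses short excerpts instead.

```python
def bdecode(data: bytes):
    """Decode one complete bencoded value and reject trailing bytes."""
    value, pos = _decode(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


def _decode(data: bytes, pos: int):
    """Return (value, next_position) for the value starting at pos."""
    ch = data[pos:pos + 1]
    if ch == b"i":  # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if ch == b"l":  # list: l<items>e
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        return items, pos + 1
    if ch == b"d":  # dictionary: d<key><value>...e
        pos += 1
        result = {}
        while data[pos:pos + 1] != b"e":
            key, pos = _decode(data, pos)
            value, pos = _decode(data, pos)
            result[key] = value
        return result, pos + 1
    if ch.isdigit():  # string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length].decode(), start + length
    raise ValueError(f"unexpected byte at offset {pos}")


# Short excerpts in the same shape as the logged reply.
print(bdecode(b"d7:createdi1575373650e6:result2:oke"))
print(bdecode(b"d6:totalsd3:RTPd7:packetsi4063eeee"))
```

Read this way, the logged reply is a per-call statistics dictionary (SSRC, MOS progression, per-stream packet counters) that ends in `6:result2:ok`.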
>>>
>>> On Sat, 30 Nov 2019 at 22:51, Callum Guy <callum.guy at x-on.co.uk> wrote:
>>>
>>>> Hi Ben,
>>>>
>>>> Thank you for your reply and insight here, very helpful to know you're
>>>> running a drastically different setting for the package memory.
>>>>
>>>> I presumed if it preallocated that I would have seen some issues during
>>>> testing, hence I've ended up with figures that were intended to provide 75%
>>>> of the system memory to the application.
>>>>
>>>> Memory usage had been creeping up all day at the time of writing.
>>>> Migrations to this platform had been on hold since the initial capture
>>>> of memory usage, and call traffic would have been relatively even during
>>>> the daytime hours in which the increase continued. On that basis I'm
>>>> still concerned that there is an issue with my config causing this
>>>> growth. I've now increased the available memory and restarted, so I
>>>> should have ample time to investigate this week; I'll report back any
>>>> findings for the community's benefit. I will give some serious thought
>>>> to lowering the package allocation value once I've got to grips with the
>>>> situation.
>>>>
>>>> Usefully, this implementation shares a lot of components with another
>>>> variant which acts as a pure proxy and does not deal with registrations.
>>>> I'm not seeing this issue there, so that will narrow down the search
>>>> area somewhat.
>>>>
>>>> Thanks again for your time,
>>>>
>>>> Callum
>>>>
>>>> On Sat, 30 Nov 2019, 15:46 Ben Newlin, <Ben.Newlin at genesys.com> wrote:
>>>>
>>>>> Callum,
>>>>>
>>>>>
>>>>>
>>>>> It’s my understanding that OpenSIPS does not release memory back to
>>>>> the OS, but it also pre-allocates all memory at startup into its private
>>>>> pool and then allocates from that internally. Normally shared memory should
>>>>> be significantly higher than package memory. For reference, on our system
>>>>> we run with “-m 1024 -M 64” and that is sufficient for us to process very
>>>>> high traffic volume. We don’t do registration though, so that may affect
>>>>> the sizes you need.
>>>>>
>>>>>
>>>>>
>>>>> You are setting your package memory size to 4G, so that will allocate
>>>>> 4G of memory for every process that loads, plus 2G for shared
>>>>> memory. That will use up all the memory on your machine extremely quickly
>>>>> for sure. The statistics you provided suggest that the memory increase is
>>>>> consistent with higher traffic levels on the second reading. You can see in
>>>>> your case that all of your “pkmem” processes have an extremely high amount
>>>>> of free memory (~3GB!). But that memory is still allocated from the OS, so
>>>>> you are instructing OpenSIPS to allocate much more than your system memory
>>>>> right at startup.
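The arithmetic behind this point can be sketched as follows. It takes the description above at face value (one shared pool of `-m` MB plus one private pool of `-M` MB per process) and treats the result as an upper bound on what OpenSIPS may claim, not a measurement. The process count of 20 is an assumption borrowed from the initial UDP receiver count mentioned in the thread; the real total also includes timer and other worker processes. The function names are hypothetical.

```python
def reserved_mb(shm_mb: int, pkg_mb: int, processes: int) -> int:
    """Upper bound: one shared pool plus one private (pkg) pool per process."""
    return shm_mb + pkg_mb * processes


def max_pkg_mb(ram_mb: int, shm_mb: int, processes: int) -> int:
    """Largest -M value that keeps the upper bound within ram_mb.

    Leaves no headroom for the OS or other daemons, so treat it as a ceiling.
    """
    return (ram_mb - shm_mb) // processes


# Flags from the thread: -m 2048 -M 4096, with an assumed 20 processes.
print(reserved_mb(2048, 4096, 20))   # 83968 MB, far beyond an 8GB machine
print(max_pkg_mb(8192, 2048, 20))    # 307 MB per process on an 8GB machine
```

For comparison, the "-m 1024 -M 64" setting quoted earlier in this message gives 2304 MB under the same assumed process count.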
>>>>>
>>>>>
>>>>>
>>>>> Your shared memory also has just under 2GB free, so you have a lot of
>>>>> headroom there too. Since OpenSIPS pre-allocates, the amount of memory
>>>>> being used by the system overall should be fairly steady; if it is
>>>>> continuously increasing that implies a leak somewhere. IIRC there are a few
>>>>> processes/modules/commands in OpenSIPS or libraries it uses that do
>>>>> allocate memory directly from the system and not from OpenSIPS’ pool. You
>>>>> may need to investigate some of those to find out where your memory is
>>>>> going, or look at other processes/daemons you have running that could be
>>>>> using that memory.
>>>>>
>>>>>
>>>>>
>>>>> Ben Newlin
>>>>>
>>>>>
>>>>>
>>>>> *From: *Users <users-bounces at lists.opensips.org> on behalf of Callum
>>>>> Guy <callum.guy at x-on.co.uk>
>>>>> *Reply-To: *OpenSIPS users mailling list <users at lists.opensips.org>
>>>>> *Date: *Friday, November 29, 2019 at 10:57 AM
>>>>> *To: *OpenSIPS users mailling list <users at lists.opensips.org>
>>>>> *Subject: *[OpenSIPS-Users] Memory Leak - runtime flags?
>>>>>
>>>>>
>>>>>
>>>>> Hi All,
>>>>>
>>>>>
>>>>>
>>>>> I have recently deployed a new registrar and have been seeing a
>>>>> gradual increase in the memory footprint - enough that I'm having to expand
>>>>> the RAM (it's virtualised) to ensure it doesn't run out.
>>>>>
>>>>>
>>>>>
>>>>> You can see a diff of the statistics collected last night at 11pm and
>>>>> today at 3pm here:
>>>>> https://gist.github.com/spacetourist/2103503674e134bd598c7f1e3a82674c/revisions
>>>>>
>>>>>
>>>>>
>>>>> Processes 5-9 are my UDP SIP receiver threads (autoscaled down from an
>>>>> initial footprint of 20 threads).
>>>>>
>>>>>
>>>>>
>>>>> Using 3.0.1 on CentOS 7 with 8GB RAM (soon to be 32GB!). Currently OpenSIPS
>>>>> is using all the RAM (minus OS usage) and 2GB of swap. I'm trying to use dialog
>>>>> and dr clustering, if that is significant. I also have NAT pings configured
>>>>> for all registrations (4000 at time of writing).
>>>>>
>>>>>
>>>>>
>>>>> I am using runtime configuration flags of "*-m 2048 -M 4096*" and am
>>>>> concerned that these are (way) too high; I think I misinterpreted their
>>>>> meaning during initial setup. Is this a ridiculous setting for my
>>>>> environment? Is it just as simple as OpenSIPS being greedy with the memory,
>>>>> such that it doesn't bother to free anything while each process has free
>>>>> space remaining? Should my -M value * max number of processes fit into my
>>>>> RAM? I guess with an 8GB system that would mean dropping this to "-M 256"?
>>>>>
>>>>>
>>>>>
>>>>> I've done some research into the issue however I haven't found
>>>>> anything else that would be an obvious target so wondered if the community
>>>>> might have some ideas of where I can begin investigations.
>>>>>
>>>>>
>>>>>
>>>>> Many thanks,
>>>>>
>>>>>
>>>>>
>>>>> Callum
>>>>>
>>>>>
>>>>>
>>>>> 0333 332 0000  |  www.x-on.co.uk <http://www.x-on.co.uk>
>>>>> <https://www.linkedin.com/company/x-on>  <https://www.facebook.com/XonTel>
>>>>> <https://twitter.com/xonuk>
>>>>>
>>>>> X-on is a trading name of Storacall Technology Ltd a limited company
>>>>> registered in England and Wales.
>>>>> Registered Office : Avaland House, 110 London Road, Apsley, Hemel
>>>>> Hempstead, Herts, HP3 9SD. Company Registration No. 2578478.
>>>>> The information in this e-mail is confidential and for use by the
>>>>> addressee(s) only. If you are not the intended recipient, please notify
>>>>> X-on immediately on +44(0)333 332 0000 and delete the
>>>>> message from your computer. If you are not a named addressee you must
>>>>> not use, disclose, disseminate, distribute, copy, print or reply to this
>>>>> email. Views or opinions expressed by an individual
>>>>> within this email may not necessarily reflect the views of X-on or its
>>>>> associated companies. Although X-on routinely screens for viruses,
>>>>> addressees should scan this email and any attachments
>>>>> for viruses. X-on makes no representation or warranty as to the
>>>>> absence of viruses in this email or any attachments.
>>>>> _______________________________________________
>>>>> Users mailing list
>>>>> Users at lists.opensips.org
>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>>>>>
>>>>
>>>
>>>
>>
>
>
>