[OpenSIPS-Users] OpenSIPS defunct processes
Bogdan-Andrei Iancu
bogdan at voice-system.ro
Mon May 3 09:48:54 CEST 2010
Hi David,
Based on the "ps" output, it seems that the zombie processes were
forked by OpenSIPS worker processes. Normally this happens only when
using the exec module (which you do not have loaded), so the only
alternative is that the Perl scripts you are using are doing the fork
(maybe through some Perl function?) and do not properly clean up the
extra processes...
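
To illustrate the mechanism outside OpenSIPS, here is a minimal sketch in
Python rather than Perl. It assumes a Linux host; the function names are
made up for the illustration and are not part of OpenSIPS or its perl
module. A forked child that exits stays in the process table as
<defunct> until its parent waits on it, which is the pattern the "ps"
output suggests. Perl's fork and waitpid wrap the same system calls, so
the same reasoning should apply to a Perl script that forks without
waiting.

```python
import os


def fork_child_that_exits(exit_code=0):
    """Fork a child that exits immediately; return its PID to the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: leave at once, skipping the parent's cleanup handlers.
        os._exit(exit_code)
    return pid


def still_in_process_table(pid):
    """True if the PID still has a process-table entry (running or zombie)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False
    return True


def reap(pid):
    """Collect the child's exit status, which removes the zombie entry."""
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    child = fork_child_that_exits(3)
    # Block until the child has exited, but leave it unreaped (a zombie).
    os.waitid(os.P_PID, child, os.WEXITED | os.WNOWAIT)
    print("in process table before reap:", still_in_process_table(child))
    print("exit code collected by reap:", reap(child))
    print("in process table after reap:", still_in_process_table(child))
```

If the Perl code does fork somewhere, the equivalent of reap() there
would be a waitpid call on every child it creates.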
Regards,
Bogdan
David Cunningham wrote:
> Hello,
>
> Certainly, here they are from opensips.cfg and I've included the
> modparam in case they help:
>
> loadmodule "db_mysql.so"
> loadmodule "sl.so"
> loadmodule "tm.so"
> loadmodule "usrloc.so"
> loadmodule "auth.so"
> loadmodule "auth_db.so"
> loadmodule "maxfwd.so"
> loadmodule "mi_fifo.so"
> loadmodule "nathelper.so"
> loadmodule "perl.so"
> loadmodule "registrar.so"
> loadmodule "rr.so"
> loadmodule "textops.so"
> loadmodule "uri.so"
>
> modparam( "auth", "nonce_expire", 30 )
> modparam( "auth_db|domain|uri_db|usrloc", "db_url", "mysql://foo" )
> modparam( "auth_db", "calculate_ha1", yes )
> modparam( "auth_db", "password_column", "secret" )
> modparam( "auth_db", "use_domain", 0 )
> modparam( "auth_db", "user_column", "name" )
> modparam( "mi_fifo", "fifo_name", "/tmp/opensips_fifo" )
> modparam( "nathelper", "natping_interval", 240 )
> modparam( "nathelper", "ping_nated_only", 1 )
> modparam( "nathelper", "sipping_bflag", 1 )
> modparam( "nathelper", "sipping_from", "sip:keepalive at foo" )
> modparam( "nathelper|registrar", "received_avp", "$avp(i:42)" )
> modparam( "perl", "filename", "/path/to/OpenSIPS.pm" )
> modparam( "perl", "modpath", "/path/to/perllib" )
> modparam( "registrar", "append_branches", 1 )
> modparam( "rr", "enable_full_lr", 1 )
> modparam( "usrloc", "db_mode", 2 )
> modparam( "usrloc", "desc_time_order", 1 )
> modparam( "usrloc", "nat_bflag", 1 )
> modparam( "usrloc", "timer_interval", 5 )
>
>
> Thank you!
>
> On Wed, Apr 28, 2010 at 4:53 PM, Bogdan-Andrei Iancu
> <bogdan at voice-system.ro> wrote:
>
>> Hi David,
>>
>> By any chance, are you using the "exec" module?
>>
>> Or, can you list the modules you are using?
>>
>> Regards,
>> Bogdan
>>
>> David Cunningham wrote:
>>
>>> Hello,
>>>
>>> Thank you for the reply. I checked the parent of the zombie processes,
>>> and they seem to be "SIP receiver" processes as per the following "ps
>>> -ef" extract and "opensipsctl fifo ps" information.
>>> We're not running the "respawn" patch.
>>>
>>> Any more advice very welcome, thanks again!
>>>
>>>
>>> user 5830 1 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5832 5830 0 06:38 ? 00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5833 5830 0 06:38 ? 00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5834 5830 0 06:38 ? 00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5835 5830 0 06:38 ? 00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5836 5830 0 06:38 ? 00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5837 5830 0 06:38 ? 00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5838 5830 0 06:38 ? 00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5839 5830 0 06:38 ? 00:00:16 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5840 5830 0 06:38 ? 00:00:01 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5841 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5842 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5843 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5844 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5845 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5846 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5847 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5848 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5849 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5850 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 5851 5830 0 06:38 ? 00:00:00 /sbin/opensips -m 256 -P /var/run/user/opensips.pid
>>> user 7260 5833 0 08:30 ? 00:00:00 [opensips] <defunct>
>>> user 7261 5833 0 08:30 ? 00:00:00 [opensips] <defunct>
>>> user 7262 5833 0 08:30 ? 00:00:00 [opensips] <defunct>
>>> user 7263 5833 0 08:30 ? 00:00:00 [opensips] <defunct>
>>> user 7264 5833 0 08:30 ? 00:00:00 [opensips] <defunct>
>>> user 7265 5833 0 08:30 ? 00:00:00 [opensips] <defunct>
>>> user 9770 5835 0 08:37 ? 00:00:00 [opensips] <defunct>
>>> user 9771 5835 0 08:37 ? 00:00:00 [opensips] <defunct>
>>> user 9772 5835 0 08:37 ? 00:00:00 [opensips] <defunct>
>>> user 9838 5834 0 08:38 ? 00:00:00 [opensips] <defunct>
>>> user 9839 5834 0 08:38 ? 00:00:00 [opensips] <defunct>
>>> user 15519 5833 0 08:57 ? 00:00:00 [opensips] <defunct>
>>> user 15520 5833 0 08:57 ? 00:00:00 [opensips] <defunct>
>>> user 15521 5833 0 08:57 ? 00:00:00 [opensips] <defunct>
>>> user 15522 5833 0 08:57 ? 00:00:00 [opensips] <defunct>
>>>
>>>
>>> [root at hostname ~]# opensipsctl fifo ps
>>> Process:: ID=0 PID=5830 Type=attendant
>>> Process:: ID=1 PID=5832 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>> Process:: ID=2 PID=5833 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>> Process:: ID=3 PID=5834 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>> Process:: ID=4 PID=5835 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>> Process:: ID=5 PID=5836 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>> Process:: ID=6 PID=5837 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>> Process:: ID=7 PID=5838 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>> Process:: ID=8 PID=5839 Type=SIP receiver udp:xxx.xxx.xxx.xxx:5060
>>> Process:: ID=9 PID=5840 Type=timer
>>> Process:: ID=10 PID=5841 Type=timer
>>> Process:: ID=11 PID=5842 Type=MI FIFO
>>> Process:: ID=12 PID=5843 Type=TCP receiver
>>> Process:: ID=13 PID=5844 Type=TCP receiver
>>> Process:: ID=14 PID=5845 Type=TCP receiver
>>> Process:: ID=15 PID=5846 Type=TCP receiver
>>> Process:: ID=16 PID=5847 Type=TCP receiver
>>> Process:: ID=17 PID=5848 Type=TCP receiver
>>> Process:: ID=18 PID=5849 Type=TCP receiver
>>> Process:: ID=19 PID=5850 Type=TCP receiver
>>> Process:: ID=20 PID=5851 Type=TCP main
>>>
>>>
>>>
>>> On Mon, Apr 26, 2010 at 12:22 PM, Bogdan-Andrei Iancu
>>> <bogdan at voice-system.ro> wrote:
>>>
>>>
>>>> Hi David,
>>>>
>>>> Let's try to find the parent process of the zombie procs: check
>>>> with ps and correlate (for the name) with "opensipsctl fifo ps".
>>>>
>>>> I guess the parent of the zombies should be the "attendant" proc. BTW,
>>>> are you running with the "respawn" patch?
>>>>
>>>> Regards,
>>>> Bogdan
>>>>
>>>> David Cunningham wrote:
>>>>
>>>>
>>>>> Hello,
>>>>>
>>>>> Thanks again for your assistance!
>>>>>
>>>>> We're not using the mi_xmlrpc module.
>>>>>
>>>>> Were you suggesting using gdb on the zombie process? I tried and got
>>>>> the following:
>>>>>
>>>>> user 31183 12140 0 10:31 ? 00:00:00 [opensips] <defunct>
>>>>> [root at sip01 ~]# gdb /sbin/opensips 31183
>>>>> GNU gdb Fedora (6.8-37.el5)
>>>>> Copyright (C) 2008 Free Software Foundation, Inc.
>>>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>>>> This is free software: you are free to change and redistribute it.
>>>>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>>>>> and "show warranty" for details.
>>>>> This GDB was configured as "x86_64-redhat-linux-gnu"...
>>>>> Attaching to program: /sbin/opensips, process 31183
>>>>> ptrace: Operation not permitted.
>>>>> /root/31183: No such file or directory.
>>>>>
>>>>> We haven't tested 1.6 in production but might be willing to go that
>>>>> road if you're confident it will solve our problems.
>>>>>
>>>>> Much appreciate your help.
>>>>>
>>>>>
>>>>> On Thu, Apr 22, 2010 at 6:07 PM, Bogdan-Andrei Iancu
>>>>> <bogdan at voice-system.ro> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> David Cunningham wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Thank you for the reply!
>>>>>>>
>>>>>>> The log doesn't say anything useful, just "Listening on" and then the
>>>>>>> UDP and TCP IP address and port, and "Aliases" also with UDP and TCP
>>>>>>> addresses and ports. I did set "debug = 9" in
>>>>>>> /etc/opensips/opensips.cfg but this caused all phones registered with
>>>>>>> OpenSIPS to give "NO SERVICE" and we disabled debugging immediately.
>>>>>>> It's a busy system.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> Do not do that again - full debug slows down your system!
>>>>>>
>>>>>>
>>>>>>
>>>>>>> The defunct processes don't just happen when shutting OpenSIPS down
>>>>>>> either - they are building up while OpenSIPS is running, at a rate of
>>>>>>> about one every few minutes.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> By any chance, are you using the mi_xmlrpc module?
>>>>>>
>>>>>>
>>>>>>
>>>>>>> I'm not sure how to get the information I need with gdb. I attached to
>>>>>>> the attendant process (7811) and ran 'bt' which gave the following:
>>>>>>>
>>>>>>> (gdb) bt
>>>>>>> #0 0x00000030e9298570 in __pause_nocancel () from /lib64/libc.so.6
>>>>>>> #1 0x0000000000426f1a in main (argc=5, argv=0x7fff771e8c08) at main.c:867
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> That is the attendant process - is this one in zombie state too?
>>>>>> The important thing is to check the zombie procs.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Can anyone point me where to go from here - maybe advice on what gdb
>>>>>>> commands would help?
>>>>>>>
>>>>>>> I should have mentioned that this is on version 1.4.5-notls. We
>>>>>>> originally saw it on 1.4.3-notls and upgraded to try and fix it.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> I would strongly recommend updating to 1.6.
>>>>>>
>>>>>> Regards,
>>>>>> Bogdan
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks in advance!
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Apr 21, 2010 at 8:55 AM, Bogdan-Andrei Iancu
>>>>>>> <bogdan at voice-system.ro> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> the defunct procs seem to be the children of a still running opensips
>>>>>>>> proc - this may be the attendant process which, for whatever reason, is
>>>>>>>> not stopping (after killing the children procs).
>>>>>>>>
>>>>>>>> Check what this process is doing (see top, try attaching with gdb).
>>>>>>>>
>>>>>>>> Also, does the log say something? errors? shutdown triggered?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Bogdan
>>>>>>>>
>>>>>>>> David Cunningham wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> We have a server which is creating a lot of defunct OpenSIPS
>>>>>>>>> processes. An example process tree is below (from ps -ef --forest).
>>>>>>>>>
>>>>>>>>> I have no idea where to start looking for the cause of this. Any
>>>>>>>>> suggestions very welcome!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>
>>>
>>>
--
Bogdan-Andrei Iancu
www.voice-system.ro