[OpenSIPS-Users] sched_yield()

Alex Massover alex at jajah.com
Thu Jan 21 14:37:27 CET 2010


Hi!

Yes, with -DF_MALLOC.

It is 1.6.1 built from source; I build a deb package from it.
I use 128M of shared memory and 10*1024*1024 of private memory (I can increase it, no problem).
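
As a quick sanity check on those sizes, assuming "128M" means 128 MiB: the shared pool comes to 134217728 bytes, which is the length passed to munmap() in the trace further down. The variable names below are illustrative only, and treating the private pool value as bytes is an assumption.

```python
MIB = 1024 * 1024

# Shared memory pool, assuming "128M" means 128 MiB.
shared_bytes = 128 * MIB

# Private memory value as quoted in the message (assumed to be bytes).
private_bytes = 10 * 1024 * 1024

print("shared :", shared_bytes, "bytes")
print("private:", private_bytes, "bytes =", private_bytes // MIB, "MiB")
```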

Hmm, running "opensipsctl fifo get_statistics all" crashes/stops OpenSIPS.

'fifo uptime' and 'fifo debug' are OK.

strace while 'fifo get_statistics all':
Process 9509 attached - interrupt to quit
pause()                                 = ? ERESTARTNOHAND (To be restarted)
--- SIGUSR2 (User defined signal 2) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
pause()                                 = ? ERESTARTNOHAND (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
waitpid(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGUSR2}], WNOHANG) = 9520
waitpid(-1, 0xbf84b4c8, WNOHANG)        = 0
kill(0, SIGTERM)                        = 0
--- SIGTERM (Terminated) @ 0 (0) ---
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [TERM])
sigreturn()                             = ? (mask now [])
rt_sigaction(SIGALRM, {0x8065920, [ALRM], SA_RESTART}, {SIG_DFL}, 8) = 0
alarm(60)                               = 0
wait4(-1, NULL, 0, NULL)                = 9514
wait4(-1, NULL, 0, NULL)                = 9519
wait4(-1, NULL, 0, NULL)                = 9521
wait4(-1, NULL, 0, NULL)                = 9522
wait4(-1, NULL, 0, NULL)                = 9512
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
wait4(-1, NULL, 0, NULL)                = 9510
wait4(-1, NULL, 0, NULL)                = 9516
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
wait4(-1, NULL, 0, NULL)                = 9515
wait4(-1, NULL, 0, NULL)                = 9517
wait4(-1, NULL, 0, NULL)                = 9524
wait4(-1, NULL, 0, NULL)                = 9525
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
--- SIGCHLD (Child exited) @ 0 (0) ---
sigreturn()                             = ? (mask now [])
wait4(-1, NULL, 0, NULL)                = 9511
wait4(-1, NULL, 0, NULL)                = 9513
wait4(-1, NULL, 0, NULL)                = 9518
wait4(-1, NULL, 0, NULL)                = 9523
wait4(-1, NULL, 0, NULL)                = -1 ECHILD (No child processes)
rt_sigaction(SIGALRM, {0x8066080, [ALRM], SA_RESTART}, {0x8065920, [ALRM], SA_RESTART}, 8) = 0
stat64("/tmp/opensips_fifo", {st_mode=S_IFIFO|0660, st_size=0, ...}) = 0
unlink("/tmp/opensips_fifo")            = 0
munmap(0xaed25000, 134217728)           = 0
unlink("/var/run/opensips/opensips.pid") = 0
alarm(0)                                = 60
rt_sigaction(SIGALRM, {SIG_IGN}, {0x8066080, [ALRM], SA_RESTART}, 8) = 0
exit_group(0)                           = ?
Process 9509 detached
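
Reading the trace above: the waitpid() line reports that child 9520 was terminated by SIGUSR2, after which the attached process sent SIGTERM to its process group (kill(0, SIGTERM)), reaped the remaining children and exited. A minimal Python sketch of how a parent observes such a status follows. The helper name reap_child_killed_by is made up for illustration, it is Unix only, and it is not OpenSIPS code.

```python
import os
import signal


def reap_child_killed_by(sig):
    """Fork a child that kills itself with `sig`; return the signal the parent observes."""
    pid = os.fork()
    if pid == 0:
        # Child: make sure the signal has its default (terminating) action and is not blocked.
        try:
            signal.signal(sig, signal.SIG_DFL)
            signal.pthread_sigmask(signal.SIG_UNBLOCK, {sig})
            os.kill(os.getpid(), sig)
        finally:
            os._exit(1)  # only reached if the signal did not terminate the child
    # Parent: blocking wait for this child (the trace uses WNOHANG after SIGCHLD instead).
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)  # same check as WIFSIGNALED(s) && WTERMSIG(s) in strace
    return None


if __name__ == "__main__":
    observed = reap_child_killed_by(signal.SIGUSR2)
    print("child terminated by SIGUSR2:", observed == signal.SIGUSR2)
```

The trace does not show who sent SIGUSR2 to 9520; that would need strace -f on the children or a core dump.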

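The sched_yield() calls in the quoted backtraces are consistent with a lock that spins and yields the CPU between attempts rather than sleeping in the kernel. A rough Python analog of that pattern is sketched here, assuming a try-lock retried in a loop with a yield after each failed attempt. The names YieldingSpinLock and contend are hypothetical, and this is not the OpenSIPS implementation.

```python
import os
import threading
import time


class YieldingSpinLock:
    """Try-lock retried in a loop; yields the CPU after each failed attempt."""

    def __init__(self):
        self._lock = threading.Lock()
        self.yields = 0  # failed attempts, for illustration (updated without synchronization)

    def acquire(self):
        while not self._lock.acquire(blocking=False):
            self.yields += 1
            os.sched_yield()  # the call that fills strace while the lock is held

    def release(self):
        self._lock.release()


def contend(hold_seconds=0.05):
    """Hold the lock in one thread while another spins on it; return the yield count."""
    lock = YieldingSpinLock()
    lock.acquire()
    waiter = threading.Thread(target=lambda: (lock.acquire(), lock.release()))
    waiter.start()
    time.sleep(hold_seconds)  # a holder that keeps the lock for a long time
    lock.release()
    waiter.join()
    return lock.yields


if __name__ == "__main__":
    print("yields while waiting:", contend())
```

The longer the holder keeps the lock, the more yield calls each waiter makes. With many waiters on few cores, that would look like every process sitting in sched_yield() and burning CPU.
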
--
Best Regards,
Alex Massover
VoIP R&D TL
Jajah Inc.

> -----Original Message-----
> From: users-bounces at lists.opensips.org [mailto:users-
> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
> Sent: Thursday, January 21, 2010 3:09 PM
> To: OpenSIPS users mailling list
> Subject: Re: [OpenSIPS-Users] sched_yield()
>
>
> Hi,
>
> Since all the backtraces are in allocation routines my guess is that the
> shared memory lock might be causing a problem.
>
> Are you compiling with -DF_MALLOC?
> What version of OpenSIPS are you using?
> What is the total shared memory pool you are allocating?
> What amount of memory are you using? ( Use : opensipsctl fifo get_statistics all )
>
> Alex Massover wrote:
> > Some more,
> >
> > (gdb) bt
> > #0  0xb78dc424 in __kernel_vsyscall ()
> > #1  0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
> > #2  0xb73d77fd in build_new_dlg () from /usr/lib/opensips/modules/dialog.so
> > #3  0xb73d4b81 in dlg_create_dialog () from /usr/lib/opensips/modules/dialog.so
> > #4  0xb73c9c9e in ?? () from /usr/lib/opensips/modules/dialog.so
> > #5  0x08055030 in do_action ()
> > #6  0x08053ebf in run_action_list ()
> > #7  0x08056e7a in do_action ()
> > #8  0x08053ebf in run_action_list ()
> > #9  0x08057d99 in run_top_route ()
> > #10 0x0808ad6c in receive_msg ()
> > #11 0x080bd2f2 in udp_rcv_loop ()
> > #12 0x08069339 in main ()
> >
> >
> >    (gdb) bt
> > #0  0xb78dc424 in __kernel_vsyscall ()
> > #1  0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
> > #2  0xb77242cd in build_cell () from /usr/lib/opensips/modules/tm.so
> > #3  0xb7739c4a in t_newtran () from /usr/lib/opensips/modules/tm.so
> > #4  0xb772e7b8 in t_relay_to () from /usr/lib/opensips/modules/tm.so
> > #5  0xb773b501 in ?? () from /usr/lib/opensips/modules/tm.so
> > #6  0x08055030 in do_action ()
> > #7  0x08053ebf in run_action_list ()
> > #8  0x08095cf2 in eval_expr ()
> > #9  0x080958d9 in eval_expr ()
> > #10 0x08095919 in eval_expr ()
> > #11 0x080554e2 in do_action ()
> > #12 0x08053ebf in run_action_list ()
> > #13 0x080569d8 in do_action ()
> > #14 0x08053ebf in run_action_list ()
> > #15 0x08056e7a in do_action ()
> > #16 0x08053ebf in run_action_list ()
> > #17 0x08057d99 in run_top_route ()
> > #18 0x0808ad6c in receive_msg ()
> > #19 0x080bd2f2 in udp_rcv_loop ()
> > #20 0x08069339 in main ()
> >
> > --
> > Best Regards,
> > Alex Massover
> > VoIP R&D TL
> > Jajah Inc.
> >
> >
> >> -----Original Message-----
> >> From: users-bounces at lists.opensips.org [mailto:users-
> >> bounces at lists.opensips.org] On Behalf Of Alex Massover
> >> Sent: Thursday, January 21, 2010 2:24 PM
> >> To: OpenSIPS users mailling list
> >> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>
> >> Hi,
> >>
> >> Another one. It hangs for a number of seconds (but that's enough to
> >> cause SIP timeouts - the MSG queue jumps to 260K), so it's hard to get a
> >> bt at the right moment.
> >> This one looks better because there's sched_yield() there :)
> >>
> >> (gdb) bt
> >> #0  0xb77d5424 in __kernel_vsyscall ()
> >> #1  0xb771041c in sched_yield () from /lib/i686/cmov/libc.so.6
> >> #2  0x080bf23d in new_avp ()
> >> #3  0x080bf53f in add_avp ()
> >> #4  0xb72c1c9c in ?? () from /usr/lib/opensips/modules/dialog.so
> >> #5  0x08055030 in do_action ()
> >> #6  0x08053ebf in run_action_list ()
> >> #7  0x08056e7a in do_action ()
> >> #8  0x08053ebf in run_action_list ()
> >> #9  0x08056e7a in do_action ()
> >> #10 0x08053ebf in run_action_list ()
> >> #11 0x08056e7a in do_action ()
> >> #12 0x08053ebf in run_action_list ()
> >> #13 0x08057d99 in run_top_route ()
> >> #14 0x0808ad6c in receive_msg ()
> >> #15 0x080bd2f2 in udp_rcv_loop ()
> >> #16 0x08069339 in main ()
> >>
> >> --
> >> Best Regards,
> >> Alex Massover
> >> VoIP R&D TL
> >> Jajah Inc.
> >>
> >>
> >>> -----Original Message-----
> >>> From: users-bounces at lists.opensips.org [mailto:users-
> >>> bounces at lists.opensips.org] On Behalf Of Alex Massover
> >>> Sent: Thursday, January 21, 2010 2:05 PM
> >>> To: OpenSIPS users mailling list
> >>> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>>
> >>> Hi Andrei,
> >>> Hopefully this is it (with FASTLOCK)
> >>>
> >>> #0  0xb77d5424 in __kernel_vsyscall ()
> >>> #1  0xb772babb in poll () from /lib/i686/cmov/libc.so.6
> >>> #2  0xb77ba83a in ?? () from /lib/i686/cmov/libresolv.so.2
> >>> #3  0xb77b8946 in __libc_res_nquery () from /lib/i686/cmov/libresolv.so.2
> >>> #4  0xb77b8fdb in ?? () from /lib/i686/cmov/libresolv.so.2
> >>> #5  0xb77b92ae in __libc_res_nsearch () from /lib/i686/cmov/libresolv.so.2
> >>> #6  0xb77b96d4 in __res_nsearch () from /lib/i686/cmov/libresolv.so.2
> >>> #7  0xb77b808a in res_search () from /lib/i686/cmov/libresolv.so.2
> >>> #8  0x0808c613 in get_record ()
> >>> #9  0x0808cf05 in ?? ()
> >>> #10 0x0808e385 in sip_resolvehost ()
> >>> #11 0x0807a26c in mk_proxy ()
> >>> #12 0xb7627d39 in t_relay_to () from /usr/lib/opensips/modules/tm.so
> >>> #13 0xb7634501 in ?? () from /usr/lib/opensips/modules/tm.so
> >>> #14 0x08055030 in do_action ()
> >>> #15 0x08053ebf in run_action_list ()
> >>> #16 0x08095cf2 in eval_expr ()
> >>> #17 0x080958d9 in eval_expr ()
> >>> #18 0x08095919 in eval_expr ()
> >>> #19 0x080554e2 in do_action ()
> >>> #20 0x08053ebf in run_action_list ()
> >>> #21 0x08056e7a in do_action ()
> >>> #22 0x08053ebf in run_action_list ()
> >>> #23 0x080569d8 in do_action ()
> >>> #24 0x08053ebf in run_action_list ()
> >>> #25 0x08056e7a in do_action ()
> >>> #26 0x08053ebf in run_action_list ()
> >>> #27 0x08057d99 in run_top_route ()
> >>> #28 0x0808ad6c in receive_msg ()
> >>> #29 0x080bd2f2 in udp_rcv_loop ()
> >>> #30 0x08069339 in main ()
> >>> (gdb)
> >>>
> >>> --
> >>> Best Regards,
> >>> Alex Massover
> >>> VoIP R&D TL
> >>> Jajah Inc.
> >>>
> >>>> -----Original Message-----
> >>>> From: users-bounces at lists.opensips.org [mailto:users-
> >>>> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
> >>>> Sent: Wednesday, January 20, 2010 2:58 PM
> >>>> To: OpenSIPS users mailling list
> >>>> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>>>
> >>>> Hi,
> >>>>
> >>>> I think that there is a lock that is being held more than it should be
> >>>> and that's what causes starvation. It would help us if you could attach
> >>>> to a process using gdb and give us a full backtrace.
> >>>>
> >>>> Temporary solutions which should work would be to reduce the number of
> >>>> processes to 4-6 or to recompile replacing -DFAST_LOCK with one of the
> >>>> other options (-DUSE_POSIX_SEM or -DUSE_PTHREAD_MUTEX) but we should see
> >>>> where this is from to fix it.
> >>>>
> >>>> Alex Massover wrote:
> >>>>
> >>>>> Hi!
> >>>>>
> >>>>> Yes, from the source on Debian; I build a deb package. (I made some
> >>>>> minor changes to the source, but the problem also happens without my
> >>>>> changes.)
> >>>>>
> >>>>> 16 children on 4 cores.
> >>>>>
> >>>>> Do you suggest reducing it to 4? It runs on 2.6.32 on VMware ESX.
> >>>>>
> >>>>> I'm also now trying sleep(0) instead of sched_yield().
> >>>>>
> >>>>> --
> >>>>> Best Regards,
> >>>>> Alex Massover
> >>>>> VoIP R&D TL
> >>>>> Jajah Inc.
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: users-bounces at lists.opensips.org [mailto:users-
> >>>>>> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
> >>>>>> Sent: Wednesday, January 20, 2010 1:05 PM
> >>>>>> To: OpenSIPS users mailling list
> >>>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>>>>>
> >>>>>> Hi Alex,
> >>>>>>
> >>>>>> Are you building OpenSIPS from source?
> >>>>>> How many processes do you have and on how many cores?
> >>>>>>
> >>>>>>
> >>>>>> Alex Massover wrote:
> >>>>>>
> >>>>>>
> >>>>>>> Hello!
> >>>>>>>
> >>>>>>> I'm facing a strange problem: sometimes under stress OpenSIPS
> >>>>>>> "locks" - load average jumps, SIP processing is delayed, the opensips
> >>>>>>> msg queue fills with a lot of SIP messages, and the opensips processes
> >>>>>>> start to consume a lot of CPU.
> >>>>>>>
> >>>>>>> And strace shows:
> >>>>>>>
> >>>>>>> sched_yield()
> >>>>>>> sched_yield()
> >>>>>>> sched_yield()
> >>>>>>> sched_yield()
> >>>>>>> ....
> >>>>>>>
> >>>>>>> for all processes.
> >>>>>>>
> >>>>>>> If I stop the stress, then after a while (not immediately) it unlocks,
> >>>>>>> also suddenly: I can see in top that all opensips processes stop
> >>>>>>> consuming CPU.
> >>>>>>>
> >>>>>>> What can it be? Some kind of starvation?
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards,
> >>>>>>> Alex Massover
> >>>>>>> VoIP R&D TL
> >>>>>>> Jajah Inc.
> >>>>>>>
> >>>>>>> This mail was sent via Mail-SeCure System.
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> Users mailing list
> >>>>>>> Users at lists.opensips.org
> >>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>> --
> >>>>>> Andrei Dragus
> >>>>>> www.voice-system.ro
> >>>>>>
> >>>>>> This mail was received via Mail-SeCure System.
> >>>>
> >>>> --
> >>>> Andrei Dragus
> >>>> www.voice-system.ro
>
> --
> Andrei Dragus
> www.voice-system.ro