[OpenSIPS-Users] sched_yield()
Alex Massover
alex at jajah.com
Sun Jan 24 14:02:47 CET 2010
Hi!
I use dialog to store/retrieve variables, but without profiling.
Looks like I found the problem - VMware has a memory balloon, it allows overcommiting physical memory to virtual machines (provided that not all guests need all the memory all the time). Usually it behaves OK, but has a dramatic performance effect on OpenSIPS. Probably the memory balloon is aware of how system memory management works but unaware of OpenSIPS internal memory manager.
After removing memory balloon driver no hangs anymore with 4/8 children. But single working child worked well even before removing the balloon (and I'm on 4-way SMP)!
Looks like there's no rule how many children to configure, it depends on modules in-use, memory speed, cpu speed and so on. Only stress test for each concrete system gives an answer.
Hope the new architecture will take care of such issues as well :)
--
Best Regards,
Alex Massover
VoIP R&D TL
Jajah Inc.
> -----Original Message-----
> From: users-bounces at lists.opensips.org [mailto:users-
> bounces at lists.opensips.org] On Behalf Of Bogdan-Andrei Iancu
> Sent: Friday, January 22, 2010 7:43 PM
> To: OpenSIPS users mailling list
> Subject: Re: [OpenSIPS-Users] sched_yield()
>
> Hi Alex,
>
> Bug was fixed - update from SVN.
>
> Regarding your observation on"forking" versus "no-forking" - in some
> cases (when not doing any blocking ops), a single proc may be faster
> that multiple procs on a single core machine - because the CPU power is
> the same and maximum used (no blocking), but in forking mode you have
> the overhead of proc switching and the loocking/synchronizing dead-
> times.
>
> Regards,
> Bogdan
>
> Alex Massover wrote:
> > Hi,
> >
> > Unfortunately 'fifo get_statistics' crashes opensips, I opened a bug.
> > But no chance that 1G is not enough, only about 400M is used for all
> linux processes:
> > Mem: 3115120k total, 398360k used, 2716760k free, 536k
> buffers
> >
> > Maybe sched_yield() just cause problems on 2.3.62 or on vmware or on
> SMP?
> >
> > I'm trying now with fork=yes and children=1.
> > If I have only one working child, does it suppose to lock and
> shed_yeild() itself from any reason?
> >
> > Meanwhile with single child OpenSIPS easily handles 4K of concurrent
> calls at 15cps, load average is 0.00 (!) and CPU is about 96% idle.
> >
> > I wonder if single working child also hangs.
> >
> > --
> > Best Regards,
> > Alex Massover
> > VoIP R&D TL
> > Jajah Inc.
> >
> >> -----Original Message-----
> >> From: users-bounces at lists.opensips.org [mailto:users-
> >> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
> >> Sent: Friday, January 22, 2010 1:17 PM
> >> To: OpenSIPS users mailling list
> >> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>
> >> Hi,
> >>
> >> The new f_malloc will not do anything extra when compared to the old
> >> one
> >> until memory usage goes way up.
> >> I've added a warning in mem/f_malloc.c so you can see when defrag
> >> starts. If you get this warning then it is clear that the problem is
> >> from high memory usage.
> >>
> >> 1 GB for 4k calls seems a lot ( 250k per call). You can try to use
> >> "opensipsctl fifo get_statistics shmem:" and see what the memory
> usage
> >> is for diferent number of concurrent calls ( 1k,2k,3k,4k), and if
> >> indeed
> >> the memory usage is that high we should investigate the cause.
> >>
> >>
> >> Alex Massover wrote:
> >>
> >>> Hi,
> >>>
> >>> Now shared memory is 1G (-m 1024), and all memory is dedicated to
> >>>
> >> the virtual machine (it was shared till now).
> >>
> >>> But it still happens, just not so often.
> >>>
> >>> I originate the calls for this stress test in Asterisk with the
> same
> >>>
> >> resources and looks like Asterisk performs much better than
> OpenSIPS.
> >> How can it be?
> >>
> >>> In my stress OpenSIPS does no blocking/slow requests. And it's just
> >>>
> >> 4K concurrent calls, each one is 2-3 min.
> >>
> >>> Maybe OpenSIPS does too much low level memory management and
> virtual
> >>>
> >> machine is not suitable for it (despite that Asterisk runs well over
> >> VMware)?
> >>
> >>> I'm not sure but I have a feeling that 1.4 performed better. What
> can
> >>>
> >> cause performance degradation in 1.6? Storing vars on dialog, new
> >> malloc()?
> >>
> >>> gdb) bt
> >>> #0 0xb78ad424 in __kernel_vsyscall ()
> >>> #1 0xb77e841c in sched_yield () from /lib/i686/cmov/libc.so.6
> >>> #2 0x080bf23d in new_avp ()
> >>> #3 0x080bf53f in add_avp ()
> >>> #4 0x08080e6e in pv_set_avp ()
> >>> #5 0x0808229c in pv_set_value ()
> >>> #6 0x08053c9d in do_assign ()
> >>> #7 0x0805447a in do_action ()
> >>> #8 0x08053ebf in run_action_list ()
> >>> #9 0x08056e7a in do_action ()
> >>> #10 0x08053ebf in run_action_list ()
> >>> #11 0x08056e7a in do_action ()
> >>> #12 0x08053ebf in run_action_list ()
> >>> #13 0x080569d8 in do_action ()
> >>> #14 0x08053ebf in run_action_list ()
> >>> #15 0x08056e7a in do_action ()
> >>> #16 0x08053ebf in run_action_list ()
> >>> #17 0x08057d99 in run_top_route ()
> >>> #18 0x0808ad6c in receive_msg ()
> >>> #19 0x080bd2f2 in udp_rcv_loop ()
> >>> #20 0x08069339 in main ()
> >>> (gdb) quit
> >>>
> >>>
> >>> (gdb) bt
> >>> #0 0xb78ad424 in __kernel_vsyscall ()
> >>> #1 0xb77e841c in sched_yield () from /lib/i686/cmov/libc.so.6
> >>> #2 0xb76f52cd in build_cell () from
> /usr/lib/opensips/modules/tm.so
> >>> #3 0xb770ac4a in t_newtran () from /usr/lib/opensips/modules/tm.so
> >>> #4 0xb76ff7b8 in t_relay_to () from
> /usr/lib/opensips/modules/tm.so
> >>> #5 0xb770c501 in ?? () from /usr/lib/opensips/modules/tm.so
> >>> #6 0x08055030 in do_action ()
> >>> #7 0x08053ebf in run_action_list ()
> >>> #8 0x08095cf2 in eval_expr ()
> >>> #9 0x080958d9 in eval_expr ()
> >>> #10 0x08095919 in eval_expr ()
> >>> #11 0x080554e2 in do_action ()
> >>> #12 0x08053ebf in run_action_list ()
> >>> #13 0x080569d8 in do_action ()
> >>> #14 0x08053ebf in run_action_list ()
> >>> #15 0x08056e7a in do_action ()
> >>> #16 0x08053ebf in run_action_list ()
> >>> #17 0x08057d99 in run_top_route ()
> >>> #18 0x0808ad6c in receive_msg ()
> >>> #19 0x080bd2f2 in udp_rcv_loop ()
> >>> #20 0x08069339 in main ()
> >>>
> >>>
> >>> --
> >>> Best Regards,
> >>> Alex Massover
> >>> VoIP R&D TL
> >>> Jajah Inc.
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: users-bounces at lists.opensips.org [mailto:users-
> >>>> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
> >>>> Sent: Thursday, January 21, 2010 3:43 PM
> >>>> To: OpenSIPS users mailling list
> >>>> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>>>
> >>>> My guess is that there is not enough shared memory. When an
> >>>>
> >> allocation
> >>
> >>>> failes OpenSIPS tries to defragment memory to make room which
> takes
> >>>>
> >> a
> >>
> >>>> lot of time and must be done under lock.
> >>>>
> >>>> Please try to increase the shared memory size and tell me if it
> >>>> persists.
> >>>>
> >>>>
> >>>> Alex Massover wrote:
> >>>>
> >>>>
> >>>>> Hi!
> >>>>>
> >>>>> Yes, with -DF_MALLOC.
> >>>>>
> >>>>> 1.6.1 from sources, I build deb package.
> >>>>> I use 128M of shared and 10*1024*1024 private memory (can
> increase
> >>>>>
> >> -
> >>
> >>>> no problem).
> >>>>
> >>>>
> >>>>> Hmmmm, "opensipsctl fifo get_statistics all" crashes/stops the
> >>>>>
> >>>>>
> >>>> opensips.
> >>>>
> >>>>
> >>>>> 'fifo uptime' or 'fifo debug' are OK.
> >>>>>
> >>>>> strace while 'fifo get_statistics all':
> >>>>> Process 9509 attached - interrupt to quit
> >>>>> pause() = ? ERESTARTNOHAND (To be
> >>>>>
> >>>>>
> >>>> restarted)
> >>>>
> >>>>
> >>>>> --- SIGUSR2 (User defined signal 2) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> pause() = ? ERESTARTNOHAND (To be
> >>>>>
> >>>>>
> >>>> restarted)
> >>>>
> >>>>
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> waitpid(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGUSR2}],
> WNOHANG)
> >>>>>
> >> =
> >>
> >>>> 9520
> >>>>
> >>>>
> >>>>> waitpid(-1, 0xbf84b4c8, WNOHANG) = 0
> >>>>> kill(0, SIGTERM) = 0
> >>>>> --- SIGTERM (Terminated) @ 0 (0) ---
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [TERM])
> >>>>> sigreturn() = ? (mask now [])
> >>>>> rt_sigaction(SIGALRM, {0x8065920, [ALRM], SA_RESTART}, {SIG_DFL},
> >>>>>
> >> 8)
> >>
> >>>> = 0
> >>>>
> >>>>
> >>>>> alarm(60) = 0
> >>>>> wait4(-1, NULL, 0, NULL) = 9514
> >>>>> wait4(-1, NULL, 0, NULL) = 9519
> >>>>> wait4(-1, NULL, 0, NULL) = 9521
> >>>>> wait4(-1, NULL, 0, NULL) = 9522
> >>>>> wait4(-1, NULL, 0, NULL) = 9512
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> wait4(-1, NULL, 0, NULL) = 9510
> >>>>> wait4(-1, NULL, 0, NULL) = 9516
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> wait4(-1, NULL, 0, NULL) = 9515
> >>>>> wait4(-1, NULL, 0, NULL) = 9517
> >>>>> wait4(-1, NULL, 0, NULL) = 9524
> >>>>> wait4(-1, NULL, 0, NULL) = 9525
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> --- SIGCHLD (Child exited) @ 0 (0) ---
> >>>>> sigreturn() = ? (mask now [])
> >>>>> wait4(-1, NULL, 0, NULL) = 9511
> >>>>> wait4(-1, NULL, 0, NULL) = 9513
> >>>>> wait4(-1, NULL, 0, NULL) = 9518
> >>>>> wait4(-1, NULL, 0, NULL) = 9523
> >>>>> wait4(-1, NULL, 0, NULL) = -1 ECHILD (No child
> >>>>>
> >>>>>
> >>>> processes)
> >>>>
> >>>>
> >>>>> rt_sigaction(SIGALRM, {0x8066080, [ALRM], SA_RESTART},
> {0x8065920,
> >>>>>
> >>>>>
> >>>> [ALRM], SA_RESTART}, 8) = 0
> >>>>
> >>>>
> >>>>> stat64("/tmp/opensips_fifo", {st_mode=S_IFIFO|0660, st_size=0,
> >>>>>
> >> ...})
> >>
> >>>> = 0
> >>>>
> >>>>
> >>>>> unlink("/tmp/opensips_fifo") = 0
> >>>>> munmap(0xaed25000, 134217728) = 0
> >>>>> unlink("/var/run/opensips/opensips.pid") = 0
> >>>>> alarm(0) = 60
> >>>>> rt_sigaction(SIGALRM, {SIG_IGN}, {0x8066080, [ALRM], SA_RESTART},
> >>>>>
> >> 8)
> >>
> >>>> = 0
> >>>>
> >>>>
> >>>>> exit_group(0) = ?
> >>>>> Process 9509 detached
> >>>>>
> >>>>> --
> >>>>> Best Regards,
> >>>>> Alex Massover
> >>>>> VoIP R&D TL
> >>>>> Jajah Inc.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: users-bounces at lists.opensips.org [mailto:users-
> >>>>>> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
> >>>>>> Sent: Thursday, January 21, 2010 3:09 PM
> >>>>>> To: OpenSIPS users mailling list
> >>>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>>>>>
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> Since all the backtraces are in allocation routines my guess is
> >>>>>>
> >> that
> >>
> >>>>>> the
> >>>>>> shared memory lock might be causing a problem.
> >>>>>>
> >>>>>> Are you compiling with -DF_MALLOC?
> >>>>>> What version of OpenSIPS are you using?
> >>>>>> What is the total shared memory pool you are allocating?
> >>>>>> What amount of memory are you using? ( Use : opensipsctl fifo
> >>>>>> get_statistics all )
> >>>>>>
> >>>>>> Alex Massover wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> Some more,
> >>>>>>>
> >>>>>>> (gdb) bt
> >>>>>>> #0 0xb78dc424 in __kernel_vsyscall ()
> >>>>>>> #1 0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
> >>>>>>> #2 0xb73d77fd in build_new_dlg () from
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>> /usr/lib/opensips/modules/dialog.so
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> #3 0xb73d4b81 in dlg_create_dialog () from
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>> /usr/lib/opensips/modules/dialog.so
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> #4 0xb73c9c9e in ?? () from
> /usr/lib/opensips/modules/dialog.so
> >>>>>>> #5 0x08055030 in do_action ()
> >>>>>>> #6 0x08053ebf in run_action_list ()
> >>>>>>> #7 0x08056e7a in do_action ()
> >>>>>>> #8 0x08053ebf in run_action_list ()
> >>>>>>> #9 0x08057d99 in run_top_route ()
> >>>>>>> #10 0x0808ad6c in receive_msg ()
> >>>>>>> #11 0x080bd2f2 in udp_rcv_loop ()
> >>>>>>> #12 0x08069339 in main ()
> >>>>>>>
> >>>>>>>
> >>>>>>> (gdb) bt
> >>>>>>> #0 0xb78dc424 in __kernel_vsyscall ()
> >>>>>>> #1 0xb781741c in sched_yield () from /lib/i686/cmov/libc.so.6
> >>>>>>> #2 0xb77242cd in build_cell () from
> >>>>>>>
> >>>>>>>
> >>>> /usr/lib/opensips/modules/tm.so
> >>>>
> >>>>
> >>>>>>> #3 0xb7739c4a in t_newtran () from
> >>>>>>>
> >> /usr/lib/opensips/modules/tm.so
> >>
> >>>>>>> #4 0xb772e7b8 in t_relay_to () from
> >>>>>>>
> >>>>>>>
> >>>> /usr/lib/opensips/modules/tm.so
> >>>>
> >>>>
> >>>>>>> #5 0xb773b501 in ?? () from /usr/lib/opensips/modules/tm.so
> >>>>>>> #6 0x08055030 in do_action ()
> >>>>>>> #7 0x08053ebf in run_action_list ()
> >>>>>>> #8 0x08095cf2 in eval_expr ()
> >>>>>>> #9 0x080958d9 in eval_expr ()
> >>>>>>> #10 0x08095919 in eval_expr ()
> >>>>>>> #11 0x080554e2 in do_action ()
> >>>>>>> #12 0x08053ebf in run_action_list ()
> >>>>>>> #13 0x080569d8 in do_action ()
> >>>>>>> #14 0x08053ebf in run_action_list ()
> >>>>>>> #15 0x08056e7a in do_action ()
> >>>>>>> #16 0x08053ebf in run_action_list ()
> >>>>>>> #17 0x08057d99 in run_top_route ()
> >>>>>>> #18 0x0808ad6c in receive_msg ()
> >>>>>>> #19 0x080bd2f2 in udp_rcv_loop ()
> >>>>>>> #20 0x08069339 in main ()
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best Regards,
> >>>>>>> Alex Massover
> >>>>>>> VoIP R&D TL
> >>>>>>> Jajah Inc.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: users-bounces at lists.opensips.org [mailto:users-
> >>>>>>>> bounces at lists.opensips.org] On Behalf Of Alex Massover
> >>>>>>>> Sent: Thursday, January 21, 2010 2:24 PM
> >>>>>>>> To: OpenSIPS users mailling list
> >>>>>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> Another one.. It hangs for a number of seconds (but it's
> enough
> >>>>>>>>
> >> to
> >>
> >>>>>>>> cause to SIP timeouts - MSG queue jumps to 260K), it's hard to
> >>>>>>>>
> >>>>>>>>
> >>>> make
> >>>>
> >>>>
> >>>>>> a
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>> bt at the right moment.
> >>>>>>>> This one looks better because there's sched_yield() there :)
> >>>>>>>>
> >>>>>>>> (gdb) bt
> >>>>>>>> #0 0xb77d5424 in __kernel_vsyscall ()
> >>>>>>>> #1 0xb771041c in sched_yield () from /lib/i686/cmov/libc.so.6
> >>>>>>>> #2 0x080bf23d in new_avp ()
> >>>>>>>> #3 0x080bf53f in add_avp ()
> >>>>>>>> #4 0xb72c1c9c in ?? () from
> /usr/lib/opensips/modules/dialog.so
> >>>>>>>> #5 0x08055030 in do_action ()
> >>>>>>>> #6 0x08053ebf in run_action_list ()
> >>>>>>>> #7 0x08056e7a in do_action ()
> >>>>>>>> #8 0x08053ebf in run_action_list ()
> >>>>>>>> #9 0x08056e7a in do_action ()
> >>>>>>>> #10 0x08053ebf in run_action_list ()
> >>>>>>>> #11 0x08056e7a in do_action ()
> >>>>>>>> #12 0x08053ebf in run_action_list ()
> >>>>>>>> #13 0x08057d99 in run_top_route ()
> >>>>>>>> #14 0x0808ad6c in receive_msg ()
> >>>>>>>> #15 0x080bd2f2 in udp_rcv_loop ()
> >>>>>>>> #16 0x08069339 in main ()
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Best Regards,
> >>>>>>>> Alex Massover
> >>>>>>>> VoIP R&D TL
> >>>>>>>> Jajah Inc.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: users-bounces at lists.opensips.org [mailto:users-
> >>>>>>>>> bounces at lists.opensips.org] On Behalf Of Alex Massover
> >>>>>>>>> Sent: Thursday, January 21, 2010 2:05 PM
> >>>>>>>>> To: OpenSIPS users mailling list
> >>>>>>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>>>>>>>>
> >>>>>>>>> Hi Andrei,
> >>>>>>>>> Hopefully this is it (with FASTLOCK)
> >>>>>>>>>
> >>>>>>>>> #0 0xb77d5424 in __kernel_vsyscall ()
> >>>>>>>>> #1 0xb772babb in poll () from /lib/i686/cmov/libc.so.6
> >>>>>>>>> #2 0xb77ba83a in ?? () from /lib/i686/cmov/libresolv.so.2
> >>>>>>>>> #3 0xb77b8946 in __libc_res_nquery () from
> >>>>>>>>> /lib/i686/cmov/libresolv.so.2
> >>>>>>>>> #4 0xb77b8fdb in ?? () from /lib/i686/cmov/libresolv.so.2
> >>>>>>>>> #5 0xb77b92ae in __libc_res_nsearch () from
> >>>>>>>>> /lib/i686/cmov/libresolv.so.2
> >>>>>>>>> #6 0xb77b96d4 in __res_nsearch () from
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>> /lib/i686/cmov/libresolv.so.2
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>>> #7 0xb77b808a in res_search () from
> >>>>>>>>>
> >>>>>>>>>
> >>>> /lib/i686/cmov/libresolv.so.2
> >>>>
> >>>>
> >>>>>>>>> #8 0x0808c613 in get_record ()
> >>>>>>>>> #9 0x0808cf05 in ?? ()
> >>>>>>>>> #10 0x0808e385 in sip_resolvehost ()
> >>>>>>>>> #11 0x0807a26c in mk_proxy ()
> >>>>>>>>> #12 0xb7627d39 in t_relay_to () from
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>> /usr/lib/opensips/modules/tm.so
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>>> #13 0xb7634501 in ?? () from /usr/lib/opensips/modules/tm.so
> >>>>>>>>> #14 0x08055030 in do_action ()
> >>>>>>>>> #15 0x08053ebf in run_action_list ()
> >>>>>>>>> #16 0x08095cf2 in eval_expr ()
> >>>>>>>>> #17 0x080958d9 in eval_expr ()
> >>>>>>>>> #18 0x08095919 in eval_expr ()
> >>>>>>>>> #19 0x080554e2 in do_action ()
> >>>>>>>>> #20 0x08053ebf in run_action_list ()
> >>>>>>>>> #21 0x08056e7a in do_action ()
> >>>>>>>>> #22 0x08053ebf in run_action_list ()
> >>>>>>>>> ---Type <return> to continue, or q <return> to quit---
> >>>>>>>>> #23 0x080569d8 in do_action ()
> >>>>>>>>> #24 0x08053ebf in run_action_list ()
> >>>>>>>>> #25 0x08056e7a in do_action ()
> >>>>>>>>> #26 0x08053ebf in run_action_list ()
> >>>>>>>>> #27 0x08057d99 in run_top_route ()
> >>>>>>>>> #28 0x0808ad6c in receive_msg ()
> >>>>>>>>> #29 0x080bd2f2 in udp_rcv_loop ()
> >>>>>>>>> #30 0x08069339 in main ()
> >>>>>>>>> (gdb)
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Best Regards,
> >>>>>>>>> Alex Massover
> >>>>>>>>> VoIP R&D TL
> >>>>>>>>> Jajah Inc.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: users-bounces at lists.opensips.org [mailto:users-
> >>>>>>>>>> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
> >>>>>>>>>> Sent: Wednesday, January 20, 2010 2:58 PM
> >>>>>>>>>> To: OpenSIPS users mailling list
> >>>>>>>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>>>>>>>>>
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I think that there is a lock that is being held more than it
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>> should
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>>> be
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> and that's what causes starvation. It would help us if you
> >>>>>>>>>>
> >> could
> >>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>> attach
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> to a process using gdb and give us a full backtrace.
> >>>>>>>>>>
> >>>>>>>>>> Temporary solutions which should work would be to reduce the
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>> number
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>>> of
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> processes to 4-6 or to recompile replacing -DFAST_LOCK with
> >>>>>>>>>>
> >> one
> >>
> >>>> of
> >>>>
> >>>>
> >>>>>>>>> the
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> other options (-DUSE_POSIX_SEM or -DUSE_PTHREAD_MUTEX) but
> we
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>> should
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>> see
> >>>>>>>>>> where this is from to fix it.
> >>>>>>>>>>
> >>>>>>>>>> Alex Massover wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> Hi!
> >>>>>>>>>>>
> >>>>>>>>>>> Yes, from the source on debian, I build deb package. (I did
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>> some
> >>>>
> >>>>
> >>>>>>>>>> minor changes to the source, but the problem happens also
> >>>>>>>>>>
> >>>>>>>>>>
> >>>> without
> >>>>
> >>>>
> >>>>>>>> my
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>> changes)
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> 16 children on 4 cores.
> >>>>>>>>>>>
> >>>>>>>>>>> What do you suggest to reduce it to 4? It runs on 2.6.32 on
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>> VMware
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>> ESX.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> I'm also trying now sleep(0) instead of sched_yield().
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Best Regards,
> >>>>>>>>>>> Alex Massover
> >>>>>>>>>>> VoIP R&D TL
> >>>>>>>>>>> Jajah Inc.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: users-bounces at lists.opensips.org [mailto:users-
> >>>>>>>>>>>> bounces at lists.opensips.org] On Behalf Of Andrei Dragus
> >>>>>>>>>>>> Sent: Wednesday, January 20, 2010 1:05 PM
> >>>>>>>>>>>> To: OpenSIPS users mailling list
> >>>>>>>>>>>> Subject: Re: [OpenSIPS-Users] sched_yield()
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Alex,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Are you building OpenSIPS from source?
> >>>>>>>>>>>> How many processes do you have and on how many cores?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Alex Massover wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hello!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I'm facing a strange problem, sometimes under a stress
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>> OpenSIPS
> >>>>
> >>>>
> >>>>>>>>>>>>> "locks" - load average jumps, SIP processing delays,
> >>>>>>>>>>>>>
> >> opensips
> >>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>> msg
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>>>>> queue fills with a lot of sip messages, opensips
> processes
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>> start
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> to
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>>> comsume a lot of CPU.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> And strace shows:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> sched_yield()
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> sched_yield()
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> sched_yield()
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> sched_yield()
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> ....
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> for all processes.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If I stop the stress - after a while (not immediately) -
> it
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>> unlocks,
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>>> also suddenly, I can see in top that all opensips
> processes
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>> stop
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> to
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>>>> consume CPU.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> What can it be? Some kind of starvation?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Alex Massover
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> VoIP R&D TL
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Jajah Inc.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> This mail was sent via Mail-SeCure System.
> >>>>>>>>>>>>> ---------------------------------------------------------
> --
> >>>>>>>>>>>>>
> >> --
> >>
> >>>> --
> >>>>
> >>>>
> >>>>>>>> --
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> --
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>>> ---
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>>> Users mailing list
> >>>>>>>>>>>>> Users at lists.opensips.org
> >>>>>>>>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>> --
> >>>>>>>>>>>> Andrei Dragus
> >>>>>>>>>>>> www.voice-system.ro
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> _______________________________________________
> >>>>>>>>>>>> Users mailing list
> >>>>>>>>>>>> Users at lists.opensips.org
> >>>>>>>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>>>>>>>>
> >>>>>>>>>>>> This mail was received via Mail-SeCure System.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>> This mail was sent via Mail-SeCure System.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> _______________________________________________
> >>>>>>>>>>> Users mailing list
> >>>>>>>>>>> Users at lists.opensips.org
> >>>>>>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Andrei Dragus
> >>>>>>>>>> www.voice-system.ro
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> _______________________________________________
> >>>>>>>>>> Users mailing list
> >>>>>>>>>> Users at lists.opensips.org
> >>>>>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>>>>>>
> >>>>>>>>>> This mail was received via Mail-SeCure System.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>> This mail was sent via Mail-SeCure System.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> _______________________________________________
> >>>>>>>>> Users mailing list
> >>>>>>>>> Users at lists.opensips.org
> >>>>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>>>>>
> >>>>>>>>> This mail was received via Mail-SeCure System.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>> This mail was sent via Mail-SeCure System.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> _______________________________________________
> >>>>>>>> Users mailing list
> >>>>>>>> Users at lists.opensips.org
> >>>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>>>>
> >>>>>>>> This mail was received via Mail-SeCure System.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>> This mail was sent via Mail-SeCure System.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> _______________________________________________
> >>>>>>> Users mailing list
> >>>>>>> Users at lists.opensips.org
> >>>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>> --
> >>>>>> Andrei Dragus
> >>>>>> www.voice-system.ro
> >>>>>>
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Users mailing list
> >>>>>> Users at lists.opensips.org
> >>>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>>
> >>>>>> This mail was received via Mail-SeCure System.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>> This mail was sent via Mail-SeCure System.
> >>>>>
> >>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> Users mailing list
> >>>>> Users at lists.opensips.org
> >>>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>>
> >>>>>
> >>>>>
> >>>> --
> >>>> Andrei Dragus
> >>>> www.voice-system.ro
> >>>>
> >>>>
> >>>> _______________________________________________
> >>>> Users mailing list
> >>>> Users at lists.opensips.org
> >>>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>>
> >>>> This mail was received via Mail-SeCure System.
> >>>>
> >>>>
> >>>>
> >>> This mail was sent via Mail-SeCure System.
> >>>
> >>>
> >>>
> >>> _______________________________________________
> >>> Users mailing list
> >>> Users at lists.opensips.org
> >>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>>
> >>>
> >> --
> >> Andrei Dragus
> >> www.voice-system.ro
> >>
> >>
> >> _______________________________________________
> >> Users mailing list
> >> Users at lists.opensips.org
> >> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >>
> >> This mail was received via Mail-SeCure System.
> >>
> >>
> >
> >
> > This mail was sent via Mail-SeCure System.
> >
> >
> >
> > _______________________________________________
> > Users mailing list
> > Users at lists.opensips.org
> > http://lists.opensips.org/cgi-bin/mailman/listinfo/users
> >
> >
>
>
> --
> Bogdan-Andrei Iancu
> www.voice-system.ro
>
>
> _______________________________________________
> Users mailing list
> Users at lists.opensips.org
> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>
> This mail was received via Mail-SeCure System.
>
This mail was sent via Mail-SeCure System.
More information about the Users
mailing list