[OpenSIPS-Devel] [ opensips-Bugs-3182321 ] segfault in f_malloc.c, chains from dialog module

SourceForge.net noreply at sourceforge.net
Fri Feb 18 18:09:42 CET 2011


Bugs item #3182321, was opened at 2011-02-15 09:54
Message generated for change (Comment added) made by denodaeus
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=3182321&group_id=232389

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: modules
Group: 1.6.x
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: Robert Smith (denodaeus)
Assigned to: Bogdan-Andrei Iancu (bogdan_iancu)
Summary: segfault in f_malloc.c, chains from dialog module

Initial Comment:
We've had several crashes now, R7081 (this is pretty much 1.6.3 pulled the day of release with a couple of bug fix patches in textops and loadbalancer applied):

It seems to be fairly consistently happening around free_dlg_dlg in dlg_hash.c which is triggering it -- we have several cores showing this behavior.

What are the next steps for debugging this?  This is on a production machine and we haven't been able to reproduce it elsewhere, so we are somewhat sensitive to load issues while trying to troubleshoot this.

Thanks.



Program terminated with signal 11, Segmentation fault.
#0  0x00000000004a2bcf in fm_insert_free (qm=0x2b875eba4000, p=<value optimized out>) at mem/f_malloc.c:155
155			(*f)->prev = &(frag->u.nxt_free);
(gdb) bt full
#0  0x00000000004a2bcf in fm_insert_free (qm=0x2b875eba4000, p=<value optimized out>) at mem/f_malloc.c:155
        f = 0x2b875eba4070
        hash = 3
#1  fm_free (qm=0x2b875eba4000, p=<value optimized out>) at mem/f_malloc.c:460
        f = 0x2b875f41be40
        n = <value optimized out>
        __FUNCTION__ = "fm_free"
#2  0x00002b875e14dcde in free_dlg_dlg () at dlg_hash.c:168
        i = 0
#3  destroy_dlg_table () at dlg_hash.c:234
        dlg = 0x2b875f319a80
        i = 5
#4  0x00002b875e13a651 in mod_destroy () at dialog.c:742
No locals.
#5  0x0000000000476ff4 in destroy_modules () at sr_module.c:370
        t = 0x785328
        foo = 0x785258
#6  0x00000000004252e0 in cleanup (show_status=1) at main.c:336
No locals.
#7  0x00000000004261fb in handle_sigs () at main.c:533
        chld = 0
        chld_status = 139
        i = <value optimized out>
        do_exit = 1
        __FUNCTION__ = "handle_sigs"
#8  0x000000000042a5a9 in main_loop (argc=3, argv=0x4) at main.c:913
        i = 4
        pid = <value optimized out>
        si = 0x0
        startup_done = 0x0
        chd_rank = 16
        __FUNCTION__ = "main_loop"
#9  main (argc=3, argv=0x4) at main.c:1388
        cfg_log_stderr = <value optimized out>
        cfg_stream = 0x831e010
        c = <value optimized out>
        r = <value optimized out>
        tmp = 0x4fc447 "H\215\005\262\333#"
        tmp_len = <value optimized out>
        port = <value optimized out>
        proto = <value optimized out>
        ret = <value optimized out>
        seed = 586369539
        rfd = <value optimized out>
        __FUNCTION__ = "main"
(gdb) 


(gdb) frame 0
#0  0x00000000004a2bcf in fm_insert_free (qm=0x2b875eba4000, p=<value optimized out>) at mem/f_malloc.c:155
155			(*f)->prev = &(frag->u.nxt_free);
(gdb) list
150		
151		/*insert it here*/
152		frag->prev = f;
153		frag->u.nxt_free=*f;
154		if( *f )
155			(*f)->prev = &(frag->u.nxt_free);
156	
157		*f=frag;
158		qm->free_hash[hash].no++;
159		free_plus(qm , frag->size);
(gdb) info locals
f = 0x2b875eba4070
hash = 3
(gdb) print *f
$1 = (struct fm_frag *) 0x2b865f235290
(gdb) print *(*f)
Cannot access memory at address 0x2b865f235290
(gdb) frame 1
#1  fm_free (qm=0x2b875eba4000, p=<value optimized out>) at mem/f_malloc.c:460
460		fm_insert_free(qm, f);
(gdb) print f
$2 = (struct fm_frag *) 0x2b875f41be40
(gdb) list
455			goto join;
456		}
457	
458	no_join:
459	
460		fm_insert_free(qm, f);
461	}
462	
463	
464	#ifdef DBG_F_MALLOC
(gdb) 


And a second crash here a few hours after restarting from above:

Core was generated by `/usr/local/opensips/sbin/opensips -u opensips'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004a2bcf in fm_insert_free (qm=0x2aca30c14000, p=<value optimized out>) at mem/f_malloc.c:155
155			(*f)->prev = &(frag->u.nxt_free);
(gdb) bt full
#0  0x00000000004a2bcf in fm_insert_free (qm=0x2aca30c14000, p=<value optimized out>) at mem/f_malloc.c:155
        f = 0x2aca30c14070
        hash = 3
#1  fm_free (qm=0x2aca30c14000, p=<value optimized out>) at mem/f_malloc.c:460
        f = 0x2aca315c1718
        n = <value optimized out>
        __FUNCTION__ = "fm_free"
#2  0x00002aca301bdcde in free_dlg_dlg () at dlg_hash.c:168
        i = 0
#3  destroy_dlg_table () at dlg_hash.c:234
        dlg = 0x2aca31692c70
        i = 0
#4  0x00002aca301aa651 in mod_destroy () at dialog.c:742
No locals.
#5  0x0000000000476ff4 in destroy_modules () at sr_module.c:370
        t = 0x785328
        foo = 0x785258
#6  0x00000000004252e0 in cleanup (show_status=1) at main.c:336
No locals.
#7  0x00000000004261fb in handle_sigs () at main.c:533
        chld = 0
        chld_status = 139
        i = <value optimized out>
        do_exit = 1
        __FUNCTION__ = "handle_sigs"
#8  0x000000000042a5a9 in main_loop (argc=3, argv=0x4) at main.c:913
        i = 4
        pid = <value optimized out>
        si = 0x0
        startup_done = 0x0
        chd_rank = 16
        __FUNCTION__ = "main_loop"
#9  main (argc=3, argv=0x4) at main.c:1388
        cfg_log_stderr = <value optimized out>
        cfg_stream = 0x14cfb010
        c = <value optimized out>
        r = <value optimized out>
        tmp = 0x4fc447 "H\215\005\262\333#"


----------------------------------------------------------------------

>Comment By: Robert Smith (denodaeus)
Date: 2011-02-18 12:09

Message:
Hi bogdan -- new crash file is attached, uac_crash.corefile.txt.

Thanks,

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2011-02-18 09:53

Message:
Hi Robert,

to get both core file,  you can just set the /proc/sys/kernel/core_pattern
to some name which will be unique per process.  
Or just set /proc/sys/kernel/core_uses_pid to '1', so you'll get a
core.PID file for every process that crashed.

And post the backtrace of the first corefile.

Regards,
Bogdan

----------------------------------------------------------------------

Comment By: Robert Smith (denodaeus)
Date: 2011-02-16 04:47

Message:
The shutdown was a sig11 which caused the attendant process to exit on
sig15.  I'll try to look at the root cause, but I originally thought this
was off of the sip receiver process.

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2011-02-15 15:59

Message:
Hi Robert,

The traces you posted show a crash in the shutdown procedure - can you
check in the opensips logs to see if the shutdown was a normal one or a
result of another crash? I try to exclude the possibility of a double
coredump (one of the opensips procs crashes a generates a cores -> entire
opensips starts to shutdown a crashes again -> second corefile,
overwritting the first one).

Regards,
Bogdan

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=3182321&group_id=232389



More information about the Devel mailing list