[OpenSIPS-Devel] [ opensips-Bugs-2226940 ] core dump parsing messages

SourceForge.net noreply at sourceforge.net
Mon Jan 5 10:42:22 CET 2009


Bugs item #2226940, was opened at 2008-11-06 00:43
Message generated for change (Settings changed) made by bogdan_iancu
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2226940&group_id=232389

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: modules
Group: trunk
>Status: Closed
Resolution: Fixed
Priority: 9
Private: No
Submitted By: Richard Revels (rrevels)
Assigned to: Bogdan-Andrei Iancu (bogdan_iancu)
Summary: core dump parsing messages

Initial Comment:
opensips segfault occuring multiple times daily.  Core files available on request.

Core was generated by `/usr/local/opensips/sbin/opensips -f /usr/local/opensips/etc/opensips/opensips.'.
Program terminated with signal 11, Segmentation fault.
#0  free_to (tb=0x820a9d8) at parser/parse_to.c:75
75			foo = tp->next;
(gdb) bt
#0  free_to (tb=0x820a9d8) at parser/parse_to.c:75
#1  0x080de8e0 in clean_hdr_field (hf=0x9864de0c) at parser/hf.c:186
#2  0x002e91b7 in run_trans_callbacks (type=2, trans=0x986bd438, req=0x9864d3d0, rpl=0x820b6f0, code=300) at sip_msg.h:49
#3  0x002f353c in t_reply_matching (p_msg=0x820b6f0, p_branch=0xbfd11764) at t_lookup.c:840
#4  0x002f39dc in t_check (p_msg=0x820b6f0, param_branch=0xbfd11764) at t_lookup.c:911
#5  0x00304136 in reply_received (p_msg=0x820b6f0) at t_reply.c:1288
#6  0x08064e4a in forward_reply (msg=0x820b6f0) at forward.c:507
#7  0x080951b6 in receive_msg (
    buf=0x817a120 "SIP/2.0 300 Multiple choices\r\nVia: SIP/2.0/UDP XXX.XX.XXX.XXX;branch=z9hG4bKe2a7.40d37b23.0,SIP/2.0/UDP XX.XXX.XX.XX:5060;received=XX.XXX.XX.XX;branch=z9hG4bK4abb0821;rport=5060\r\nFrom: \"FOO\" <sip:+1"..., len=893, rcv_info=0xbfd11894) at receive.c:203
#8  0x080d7b37 in udp_rcv_loop () at udp_server.c:449
#9  0x0806d5ce in main (argc=9, argv=0xbfd11a84) at main.c:780



----------------------------------------------------------------------

Comment By: Richard Revels (rrevels)
Date: 2008-12-13 16:09

Message:
Been running for several days with no problem.  Thank you very much for the
effort that went into fixing this bug.

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2008-12-08 20:52

Message:
Hi Richard,

Thanks for all your support in debugging this - finally I found the bug
and apply a fix on SVN - the fix in incorporated in 1.4.3 release.
Please report.

Thanks and regards,
Bogdan

----------------------------------------------------------------------

Comment By: Richard Revels (rrevels)
Date: 2008-12-02 06:05

Message:
The core dumps appear to occur when accounting is on and message
retransmissions occur.  I have quite a few core files and a log file that
was generated with debug set to level 6 in the config if that would help
narrow down the source of the problem.




----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2008-11-13 15:39

Message:
OK- acc module...
I guess you have some extra_accounting parameters....could you list them?

Bogdan


----------------------------------------------------------------------

Comment By: Richard Revels (rrevels)
Date: 2008-11-13 15:33

Message:
Yeah, that whole pkg memory thing didn't stop the crashes after all.  

I am using acc.  I am also running with the account early media flag off
because I was crashing with it on.  When I turned it off this started
popping up and I wanted to deal with one problem at a time, so left it off.
 

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2008-11-13 11:20

Message:
Hi Richard,

I'm investigating the initial crash you reported (on reply time). First
question - which of the following modules are you using: nat_traversal,
UAC, siptrace, acc, osp..

The crash happens on the TMCB_RESPONSE_IN callback (in TM) and only the
above modules are using this callback.

Regards,
Bogdan


----------------------------------------------------------------------

Comment By: Richard Revels (rrevels)
Date: 2008-11-12 17:14

Message:
Initial indications are that when I compile with 

#define PKG_MEM_POOL_SIZE 1024*4068

I don't get the core dumps.

When I compile with 

#define PKG_MEM_POOL_SIZE 1024*9216

I get lots of them.  I am running at about 1 call per second which is the
same load the system was at when it was dumping on a regular basis.  The
other change is that I am running with command line option -m 256 vs -m 512
but I suspect the pkg memory change is the significant one. 

-rw------- 1 root root 547639296 Nov  9 15:03 core.32127
-rw------- 1 root root 547602432 Nov  9 15:14 core.5279
-rw------- 1 root root 547745792 Nov  9 21:00 core.7677
-rw------- 1 root root 547614720 Nov  9 22:59 core.22886
-rw------- 1 root root 547745792 Nov 10 19:36 core.18281
-rw------- 1 root root 547655680 Nov 11 06:18 core.15740
-rw------- 1 root root 547749888 Nov 11 19:16 core.4572
-rw------- 1 root root 547745792 Nov 11 21:28 core.22289
-rw------- 1 root root 547614720 Nov 11 21:50 core.21091
-rw------- 1 root root 547614720 Nov 11 22:21 core.25760
-rw------- 1 root root 547614720 Nov 11 22:33 core.310
-rw------- 1 root root 547610624 Nov 11 23:07 core.3837
-rw------- 1 root root 547614720 Nov 12 01:05 core.10964



----------------------------------------------------------------------

Comment By: Richard Revels (rrevels)
Date: 2008-11-11 14:13

Message:
This seems consistent in 50 to 60 percent of the core files.  Looks like
there are two scenarios causing the crashes.  One when replies arrive and
the other on initial invites.  This is from a reply arriving and the
sequence of the function calls in the backtrace match the backtrace used to
open the bug report.

(gdb) print tb
$1 = (struct to_body *) 0x820a9d8
(gdb) print *tb
$2 = {error = 808333871, body = {s = 0x5044552f <Address 0x5044552f out of
bounds>, len = 775370272}, uri = {s = 0x2e323531 <Address 0x2e323531 out of
bounds>, len = 775500850}, 
  display = {s = 0x353a3033 <Address 0x353a3033 out of bounds>, len =
993015344}, tag_value = {s = 0x6e617262 <Address 0x6e617262 out of bounds>,
len = 2050844771}, 
  parsed_uri = {user = {s = 0x34476839 <Address 0x34476839 out of bounds>,
len = 842222434}, passwd = {s = 0x34353865 <Address 0x34353865 out of
bounds>, len = 1916483894}, 
    host = {s = 0x74726f70 <Address 0x74726f70 out of bounds>, len =
909129021}, port = {s = 0x460a0d30 <Address 0x460a0d30 out of bounds>, len
= 980250482}, params = {
      s = 0x73612220 <Address 0x73612220 out of bounds>, len =
1769104756}, headers = {s = 0x20226b73 <Address 0x20226b73 out of bounds>,
len = 1885958972}, port_no = 24890, 
    proto = 29811, type = 1936290405, transport = {s = 0x3736406b <Address
0x3736406b out of bounds>, len = 842346798}, ttl = {s = 0x3934322e <Address
0x3934322e out of bounds>, 
      len = 1043346222}, user_param = {s = 0x6761743b <Address 0x6761743b
out of bounds>, len = 896753981}, maddr = {s = 0x62323339 <Address
0x62323339 out of bounds>, 
      len = 221591651}, method = {s = 0x3a6f540a <Address 0x3a6f540a out
of bounds>, len = 1769159712}, lr = {s = 0x31323a70 <Address 0x31323a70 out
of bounds>, 
      len = 842542646}, r2 = {s = 0x3532322e <Address 0x3532322e out of
bounds>, len = 892547630}, transport_val = {s = 0x61743b3e <Address
0x61743b3e out of bounds>, 
      len = 808533351}, ttl_val = {s = 0x61306665 <Address 0x61306665 out
of bounds>, len = 1680959076}, user_param_val = {s = 0x64633732 <Address
0x64633732 out of bounds>, 
      len = 1664495718}, maddr_val = {s = 0x38656461 <Address 0x38656461
out of bounds>, len = 879112754}, method_val = {s = 0x36343939 <Address
0x36343939 out of bounds>, 
      len = 942564405}, lr_val = {s = 0xd313864 <Address 0xd313864 out of
bounds>, len = 1818313482}, r2_val = {s = 0x44492d6c <Address 0x44492d6c
out of bounds>, 
      len = 1697914938}}, param_lst = 0x30396334, last_param = 0x39336434}



----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2008-11-06 17:40

Message:
Turned debug up to 6 for a while.  Here is a partial log output showing the
process id startup and first few lines from the last time opensips core
dumped.

[root at sips-proxy-01 opensips]# grep 14: /var/log/opensips.log | grep 17235
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=usrloc
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=tm
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:tm:child_init_callid: callid: '7bf8b0b7-17235 at 216.82.225.235'
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=registrar
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=xlog
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:xlog:child_init: init_child [-4]  pid [17235]
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=acc
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=avpops
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=permissions
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=dispatcher
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:dispatcher:child_init:  #-4 / pid <17235>
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=mi_fifo
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=db_flatstore
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=dialog
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=auth_db
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:db_do_init: connection 0x83def70 not found in pool
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:db_postgres:db_postgres_new_connection: db_id = 0x83def70
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:db_postgres:db_postgres_new_connection: 0x83def90=pkg_malloc(36)
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:db_postgres:db_postgres_new_connection: opening connection:
postgres://xxxx:xxxx@localhost:5432/openser13
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:db_postgres:db_postgres_new_connection: PQsetdbLogin(0xa0fd2a8)
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:init_mod_child: type=PROC_TCP_MAIN, rank=-4, module=snmpstats
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 6, 1, 0x819f2a0), fd_no=0
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 7, 4, 0x982dbcec), fd_no=1
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 8, 4, 0x982dbd78), fd_no=2
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 9, 4, 0x982dbe04), fd_no=3
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 10, 4, 0x982dbe90), fd_no=4
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 11, 4, 0x982dbf1c), fd_no=5
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 12, 4, 0x982dbfa8), fd_no=6
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 13, 4, 0x982dc034), fd_no=7
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 16, 4, 0x982dc0c0), fd_no=8
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 18, 4, 0x982dc14c), fd_no=9
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 20, 4, 0x982dc1d8), fd_no=10
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 22, 4, 0x982dc264), fd_no=11
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 14, 3, 0x82083f0), fd_no=12
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 15, 3, 0x8208404), fd_no=13
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 17, 3, 0x8208418), fd_no=14
Nov  6 14:45:33 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_add: io_watch_add(0x8169da0, 19, 3, 0x820842c), fd_no=15
Nov  6 14:48:46 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
CRITICAL:core:receive_fd: EOF on 9
Nov  6 14:48:46 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:handle_ser_child: dead child 3, pid 17224 (shutting down?)
Nov  6 14:48:46 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
DBG:core:io_watch_del: io_watch_del (0x8169da0, 9, -1, 0x0) fd_no=16 called
Nov  6 14:48:46 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
INFO:core:sig_usr: signal 15 received
Nov  6 14:48:46 sips-proxy-01 /usr/local/opensips/sbin/opensips[17235]:
Memory status (pkg):



----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2226940&group_id=232389



More information about the Devel mailing list