[OpenSIPS-Users] mi_fifo lock on reply after a period of time in 3.1.3
Andrew Yager
andrew at rwts.com.au
Thu Oct 7 22:43:52 EST 2021
Interestingly, where I usually see a range of continued messages from
a process continually in the debug log, they appear to stop for this
PID at 3:47am, and that process seems blocked on a tcp/tls send:
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:core:get_dummy_sip_msg:
reusing the static sip msg 0x7f6fc6d859a0
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:core:release_dummy_sip_msg:
cleaning the static sip msg 0x7f6fc6d859a0
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:core:destroy_avp_list:
destroying list (nil)
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:usrloc:update_ucontact:
exists callback for type= UL_CONTACT_UPDATE
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:usrloc:run_ul_callbacks:
contact=0x7f6ec95a3540, callback type 2/15, id 0 entered
Oct 8 03:47:39 hvprxy osips[2639896]:
DBG:mid_registrar:mid_reg_ct_event: Contact callback (2):
contact='sip:[snip]:50375;transport=tls'
Oct 8 03:47:39 hvprxy osips[2639896]:
DBG:mid_registrar:build_contact: building contact ...
Oct 8 03:47:39 hvprxy osips[2639896]:
DBG:mid_registrar:build_contact: created Contact HF: Contact:
<sip:[snip]:50375;transport=tls>;expires=59#015#012
Oct 8 03:47:39 hvprxy osips[2639896]:
DBG:mid_registrar:mid_reg_resp_in: got ptr back: 0x7f6ed4c4b9c0
Oct 8 03:47:39 hvprxy osips[2639896]:
DBG:mid_registrar:mid_reg_resp_in: RESPONSE FORWARDED TO caller!
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:tm:t_should_relay_response:
T_code=0, new_code=200
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:tm:relay_reply: T_state=4,
branch=0, save=0, relay=0, cancel_BM=0
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:core:parse_headers: flags=2000
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:core:parse_headers:
flags=ffffffffffffffff
Oct 8 03:47:39 hvprxy osips[2639896]:
DBG:core:build_res_buf_from_sip_res: old size: 585, new size: 483
Oct 8 03:47:39 hvprxy osips[2639896]:
DBG:core:build_res_buf_from_sip_res: copied size: orig:583, new: 483,
rest: 2 msg=[snip]
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:tm:insert_timer_unsafe:
[2]: 0x7f6ed4bd82b0 (12697)
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:core:tcp_conn_get: con
found in state 0
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:core:tcp_conn_get: tcp
connection found (0x7f6ec95e6988), acquiring fd
Oct 8 03:47:39 hvprxy osips[2639896]: DBG:core:tcp_conn_get: c=
0x7f6ec95e6988, n=16, Usock=643
[no more output for this pid in log]
root at hvprxy:~# gdb opensips 2639896
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
[snip]
(gdb) bt full
#0 0x00007f6fe6d49057 in __libc_recvmsg (fd=fd at entry=643,
msg=msg at entry=0x7ffed767fd20, flags=flags at entry=256) at
../sysdeps/unix/sysv/linux/recvmsg.c:28
resultvar = 18446744073709551104
__arg3 = <optimized out>
_a2 = <optimized out>
sc_ret = <optimized out>
__arg1 = <optimized out>
_a3 = <optimized out>
sc_cancel_oldtype = <optimized out>
resultvar = <optimized out>
resultvar = <optimized out>
__arg2 = <optimized out>
_a1 = <optimized out>
#1 0x000055b7abde3fac in receive_fd
(unix_socket=unix_socket at entry=643, data=data at entry=0x7ffed767fe20,
data_len=data_len at entry=8, fd=fd at entry=0x7ffed767fe1c,
flags=flags at entry=256) at net/tcp_passfd.c:213
msg = {msg_name = 0x0, msg_namelen = 0, msg_iov =
0x7ffed767fd60, msg_iovlen = 1, msg_control = 0x7ffed767fd70,
msg_controllen = 24, msg_flags = 0}
iov = {{iov_base = 0x7ffed767fe20, iov_len = 8}}
new_fd = <optimized out>
ret = <optimized out>
n = <optimized out>
cmsg = <optimized out>
control_un = {cm = {cmsg_len = 519691042825, cmsg_level = 5,
cmsg_type = 280, __cmsg_data = 0x7ffed767fd80 "\001"},
control =
"\t\000\000\000y\000\000\000\005\000\000\000\030\001\000\000\001\000\000\000\267U\000"}
__FUNCTION__ = "receive_fd"
#2 0x000055b7abdd9fa3 in tcp_conn_get (id=<optimized out>,
ip=ip at entry=0x7ffed767ff00, port=port at entry=50375,
proto=proto at entry=PROTO_TLS,
proto_extra_id=proto_extra_id at entry=0x0,
conn=conn at entry=0x7ffed767fef0, conn_fd=0x7ffed767feec,
send_sock=0x7f6fc6b4a420) at net/net_tcp.c:529
c = 0x7f6ec95e6988
tmp = <optimized out>
a = <optimized out>
hash = <optimized out>
response = {140113801537928, 1}
part = <optimized out>
n = <optimized out>
fd = 0
__FUNCTION__ = "tcp_conn_get"
#3 0x00007f6ec3b98159 in proto_tls_send (send_sock=0x7f6fc6b4a420,
buf=0x7f6fc6d85428 "SIP/2.0 200 OK\r\nv: [snip]r"..., len=483,
to=0x7f6ed4bd8350, id=<optimized out>) at proto_tls.c:490
c = 0x55b7abed0230 <__compound_literal.0>
dom = 0x0
ip = {af = 2, len = 4, u = {addrl = {9385250162, 0}, addr32 =
{795315570, 2, 0, 0}, addr16 = {36210, 12135, 2, 0, 0, 0, 0, 0},
addr = "r\215g/\002\000\000\000\000\000\000\000\000\000\000"}}
port = <optimized out>
fd = 21943
n = <optimized out>
rlen = <optimized out>
__FUNCTION__ = "proto_tls_send"
#4 0x00007f6ec5ccd3af in msg_send (msg=0x0, len=<optimized out>,
buf=0x7f6fc6d85428 "SIP/2.0 200 OK\r\nv: SIP/2.0/TLS
192.168.86.60:50375;rport=50375;[snip]"..., id=<optimized out>,
to=0x7f6ed4bd8350, proto=3, send_sock=0x7f6fc6b4a420) at
../../forward.h:123
out_buff = {
s = 0x7f6fc6d85428 "SIP/2.0 200 OK\r\nv: SIP/2.0/TLS
192.168.86.60:50375;rport=50375;received=[snip]"..., len = 483}
port = <optimized out>
ip = <optimized out>
out_buff = <optimized out>
port = <optimized out>
ip = <optimized out>
__FUNCTION__ = "msg_send"
__ip = <optimized out>
#5 send_pr_buffer (rb=rb at entry=0x7f6ed4bd8330,
buf=buf at entry=0x7f6fc6d85428, len=<optimized out>, ctx=ctx at entry=0x0)
at t_funcs.c:68
--Type <RET> for more, q to quit, c to continue without paging--
__FUNCTION__ = "send_pr_buffer"
#6 0x00007f6ec5cc3eef in relay_reply (t=0x7f6ed4bd8230,
p_msg=<optimized out>, branch=<optimized out>, msg_status=<optimized
out>, cancel_bitmap=0x7ffed7680144)
at t_reply.c:1364
relay = 0
save_clone = 0
buf = 0x7f6fc6d85428 "SIP/2.0 200 OK\r\nv: SIP/2.0/TLS
192.168.86.60:50375;rport=50375;received=[snip]"...
res_len = 483
relayed_code = 200
relayed_msg = 0x7f6fc6d83c00
bm = {to_tag_val = {s = 0x55b7abfd9808 <global_avps> "", len =
-976256698}}
totag_retr = 0
reply_status = RPS_COMPLETED
uas_rb = 0x7f6ed4bd8330
cb_s = {s = 0x7f6fc6d83c00 "\f", len = -681049792}
text = {s = 0x55b7abec1e50 <log_level> "ȬB\311n\177", len = 0}
__FUNCTION__ = "relay_reply"
[snip]
--
Andrew Yager, CEO (BCompSc, JNCIS-SP, MACS (Snr) CP)
business nbn™ advisor (advisor 01783150)
Real World Technology Solutions - IT People you can trust
Voice | Data | IT Procurement | Managed IT
rwts.com.au | 1300 798 718
Real World is a DellEMC Gold Partner
This document should be read only by those persons to whom it is
addressed and its content is not intended for use by any other
persons. If you have received this message in error, please notify us
immediately. Please also destroy and delete the message from your
computer. Any unauthorised form of reproduction of this message is
strictly prohibited. We are not liable for the proper and complete
transmission of the information contained in this communication, nor
for any delay in its receipt. Please consider the environment before
printing this e-mail.
On Fri, 8 Oct 2021 at 09:34, Andrew Yager <andrew at rwts.com.au> wrote:
>
> Hi,
>
> Procs had restarted; but repeated the exercise and it's pid 48 this time.
>
> (gdb) bt full
> #0 0x00007f6fe6d48297 in __libc_write (fd=fd at entry=203,
> buf=buf at entry=0x7ffed7680050, nbytes=nbytes at entry=24) at
> ../sysdeps/unix/sysv/linux/write.c:26
> resultvar = 18446744073709551104
> __arg3 = <optimized out>
> _a2 = <optimized out>
> sc_ret = <optimized out>
> __arg1 = <optimized out>
> _a3 = <optimized out>
> sc_cancel_oldtype = <optimized out>
> resultvar = <optimized out>
> resultvar = <optimized out>
> __arg2 = <optimized out>
> _a1 = <optimized out>
> #1 0x000055b7abca8d33 in __ipc_send_job (payload2=0x0,
> payload1=0x55b7abc6e760 <rpc_get_pkg_stats>, type=0, fd=203) at
> ipc.c:171
> job = {snd_proc = 1, handler_type = 0, payload1 =
> 0x55b7abc6e760 <rpc_get_pkg_stats>, payload2 = 0x0}
> n = <optimized out>
> job = <optimized out>
> n = <optimized out>
> __FUNCTION__ = "__ipc_send_job"
> #2 ipc_send_rpc (dst_proc=dst_proc at entry=48,
> rpc=rpc at entry=0x55b7abc6e760 <rpc_get_pkg_stats>,
> param=param at entry=0x0) at ipc.c:194
> [snip]
> ---
>
> (gdb) p pt[48]
> $1 = {pid = 2639896, type = TYPE_UDP, pg_filter = 0x7f6fc6b4a930, desc
> = "SIP receiver udp:1.1.1.1:5060", '\000' <repeats 92 times>, flags =
> 228, ipc_pipe = {202,
> 203}, ipc_pipe_holder = {202, 203}, ipc_sync_pipe = {204, 205},
> ipc_sync_pipe_holder = {204, 205}, tcp_socks_holder = {642, 643},
> unix_sock = 642, log_level = 4,
> default_log_level = 4, load_rt = 0x7f6ec9493f10, load_1m =
> 0x7f6ec9493fd0, load_10m = 0x7f6ec9494090, pkg_total = 0x7f6ec94d2048,
> pkg_used = 0x7f6ec94d20f8,
> pkg_rused = 0x7f6ec94d3b80, pkg_mused = 0x7f6ec94d3c38, pkg_free =
> 0x7f6ec94d2e00, pkg_frags = 0x7f6ec94d8050, load = {ST_window = {0
> <repeats 415 times>, 509,
> 1000 <repeats 37 times>, 996, 1000 <repeats 109 times>, 996,
> 1000 <repeats 12 times>, 997, 1000, 1000, 1000, 1000, 1000, 1000,
> 1000, 1000, 997,
> 1000 <repeats 32 times>, 996, 1000 <repeats 13 times>, 997, 1000
> <repeats 60 times>, 997, 1000, 1000, 1000, 1000, 1000, 997, 1000
> <repeats 13 times>, 967, 0, 0, 0,
> 0, 0, 255, 151, 0, 0, 0, 0, 0, 0, 0, 0, 527, 1000, 906, 0
> <repeats 268 times>}, LT_window = {0 <repeats 459 times>, 301288, 0
> <repeats 140 times>},
> last_time = 1633625259739989, is_busy = 1 '\001'}}
>
> Andrew
>
> --
> Andrew Yager, CEO (BCompSc, JNCIS-SP, MACS (Snr) CP)
> business nbn™ advisor (advisor 01783150)
> Real World Technology Solutions - IT People you can trust
> Voice | Data | IT Procurement | Managed IT
> rwts.com.au | 1300 798 718
>
>
> Real World is a DellEMC Gold Partner
>
> This document should be read only by those persons to whom it is
> addressed and its content is not intended for use by any other
> persons. If you have received this message in error, please notify us
> immediately. Please also destroy and delete the message from your
> computer. Any unauthorised form of reproduction of this message is
> strictly prohibited. We are not liable for the proper and complete
> transmission of the information contained in this communication, nor
> for any delay in its receipt. Please consider the environment before
> printing this e-mail.
>
>
> On Fri, 8 Oct 2021 at 01:15, Bogdan-Andrei Iancu <bogdan at opensips.org> wrote:
> >
> > OK, getting closer :). Let's see who is the proc ID 46.
> >
> > In GDB, just do
> >
> > > p pt[46]
> >
> > Regards,
> >
> > Bogdan-Andrei Iancu
> >
> > OpenSIPS Founder and Developer
> > https://www.opensips-solutions.com
> > OpenSIPS eBootcamp 2021
> > https://opensips.org/training/OpenSIPS_eBootcamp_2021/
> >
> > On 10/7/21 3:37 PM, Andrew Yager wrote:
> >
> > Hi Bogdan-Andrei,
> >
> > OK; my non expert opinion is that these two lines in ipc.c may be a pointer to a "symptom" of the problem…
> >
> > // FIXME - we should check if the destination process really listens
> > // for read, otherwise we may end up filling in the pipe and block
> >
> > but, I really want to know why it's blocking on the IPC comms before that.
> >
> > Here's the BT.
> >
> > (gdb) bt full
> > #0 0x00007f1e4ce9b297 in __libc_write (fd=fd at entry=195, buf=buf at entry=0x7fffc86261b0, nbytes=nbytes at entry=24) at ../sysdeps/unix/sysv/linux/write.c:26
> > resultvar = 18446744073709551104
> > __arg3 = <optimized out>
> > _a2 = <optimized out>
> > sc_ret = <optimized out>
> > __arg1 = <optimized out>
> > _a3 = <optimized out>
> > sc_cancel_oldtype = <optimized out>
> > resultvar = <optimized out>
> > resultvar = <optimized out>
> > __arg2 = <optimized out>
> > _a1 = <optimized out>
> > #1 0x000055ddea5cad33 in __ipc_send_job (payload2=0x0, payload1=0x55ddea590760 <rpc_get_pkg_stats>, type=0, fd=195) at ipc.c:171
> > job = {snd_proc = 1, handler_type = 0, payload1 = 0x55ddea590760 <rpc_get_pkg_stats>, payload2 = 0x0}
> > n = <optimized out>
> > job = <optimized out>
> > n = <optimized out>
> > __FUNCTION__ = "__ipc_send_job"
> > #2 ipc_send_rpc (dst_proc=dst_proc at entry=46, rpc=rpc at entry=0x55ddea590760 <rpc_get_pkg_stats>, param=param at entry=0x0) at ipc.c:194
> >
> >
More information about the Users
mailing list