[OpenSIPS-Users] mi_fifo lock on reply after a period of time in 3.1.3
Bogdan-Andrei Iancu
bogdan at opensips.org
Thu Oct 7 06:25:30 EST 2021
Hi Andrew,
OK, let's try these steps:
1) right after starting opensips, do an "opensips-cli -x mi ps" and
identify the PID of the FIFO process
2) wait for the blocking to happen
3) from a console with root privileges, run "gdb opensips
_PID_OF_FIFO_" and then "bt full"
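For step 1, the FIFO PID has to be picked out of the "mi ps" output. A minimal sketch of that, assuming the JSON shape that `opensips-cli -x mi ps` typically prints (a "Processes" array with "PID" and "Type" fields — verify against your own output, as the exact field names may differ by version); the sample data below is hypothetical, with the PID matching the logs further down this thread:

```python
import json

def find_fifo_pid(ps_json):
    """Return the PID of the MI FIFO process from `opensips-cli -x mi ps`
    JSON output, or None if no FIFO process is listed."""
    data = json.loads(ps_json)
    for proc in data.get("Processes", []):
        # The FIFO worker is usually labeled with "FIFO" in its Type field.
        if "FIFO" in proc.get("Type", ""):
            return proc["PID"]
    return None

# Hypothetical sample output, for illustration only:
sample = '''{"Processes": [
  {"ID": 0, "PID": 1186129, "Type": "attendant"},
  {"ID": 1, "PID": 1186130, "Type": "MI FIFO"},
  {"ID": 2, "PID": 1186133, "Type": "SIP receiver udp"}]}'''

print(find_fifo_pid(sample))  # -> 1186130
```

That PID is then what gets passed to gdb in step 3.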
Best regards,
Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
https://www.opensips-solutions.com
OpenSIPS eBootcamp 2021
https://opensips.org/training/OpenSIPS_eBootcamp_2021/
On 10/7/21 2:19 AM, Andrew Yager wrote:
> Hi Bogdan-Andrei,
>
> This does look to be frozen in this state too. I guess GDB is the next
> step.
>
> Andrew
>
>
> On Wed, 6 Oct 2021 at 17:12, Bogdan-Andrei Iancu <bogdan at opensips.org
> <mailto:bogdan at opensips.org>> wrote:
>
> Hi Andrew,
>
> What you can do, after the FIFO gets blocked, is to run a trap
> ("opensips-cli trap") to try to see what the FIFO process is doing
> (or trying to do). If the trap does not work, let me know and I will
> give you instructions on how to directly attach GDB to
> the process.
>
> Best regards,
>
> Bogdan-Andrei Iancu
>
> OpenSIPS Founder and Developer
> https://www.opensips-solutions.com <https://www.opensips-solutions.com>
> OpenSIPS eBootcamp 2021
> https://opensips.org/training/OpenSIPS_eBootcamp_2021/ <https://opensips.org/training/OpenSIPS_eBootcamp_2021/>
>
> On 10/6/21 12:24 AM, Andrew Yager wrote:
>> Also restarting opensips produces this:
>>
>> Restarting opensips (via systemctl): opensips.service
>> ERROR: communication exception for 'which' returned: cannot access
>> fifo file /tmp/opensips_fifo: [Errno 32] Broken pipe!
>>
>> Andrew
>>
>>
>> On Wed, 6 Oct 2021 at 08:23, Andrew Yager <andrew at rwts.com.au
>> <mailto:andrew at rwts.com.au>> wrote:
>>
>> Hi,
>>
>> Just replicated this on the 3.2.2 nightly build. Debug logs
>> are enabled, and this is all I can see on the last request
>> that stalled:
>>
>> Oct 6 04:49:32 hvprxy osips[1186130]:
>> DBG:mi_fifo:mi_fifo_callback: running command [{"jsonrpc":
>> "2.0", "id": "5319", "method": "which", "params": []}]
>> Oct 6 04:49:32 hvprxy osips[1186130]:
>> DBG:mi_fifo:mi_fifo_callback: got mi response = [0x558865cc9560]
>> Oct 6 04:49:32 hvprxy osips[1186130]:
>> DBG:mi_fifo:mi_fifo_callback: running command [{"jsonrpc":
>> "2.0", "id": "8377", "method": "get_statistics", "params":
>> [["all"]]}]
>> Oct 6 04:49:32 hvprxy osips[1186133]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186134]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186135]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186136]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186137]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186138]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186139]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186140]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186141]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186142]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186143]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186144]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186145]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186146]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186147]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186148]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186149]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186150]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186151]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186152]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186153]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186154]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186155]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186156]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186157]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186158]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186159]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186160]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186161]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186162]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186163]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186164]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186165]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186166]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186167]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186168]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186169]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186170]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186171]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186172]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186173]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>> Oct 6 04:49:32 hvprxy osips[1186174]:
>> DBG:core:ipc_handle_job: received job type 0[RPC] from process 1
>>
>> Andrew
>>
>> On Mon, 4 Oct 2021 at 22:37, Andrew Yager <andrew at rwts.com.au
>> <mailto:andrew at rwts.com.au>> wrote:
>>
>> Just further to this: the ul_dump command seems to be what
>> triggers the issue first.
>>
>> Andrew
>>
>>
>> On Sat, 2 Oct 2021 at 13:05, Andrew Yager
>> <andrew at rwts.com.au <mailto:andrew at rwts.com.au>> wrote:
>>
>> Hi,
>>
>> Not entirely sure where to start digging on this one.
>> On 3.1.3 we've had an issue appear "suddenly" whereby
>> our mi command output seems to "block" and not return
>> any useful data. Restarting opensips processes
>> restores comms.
>>
>> We end up with a huge number of
>> opensips_fifo_reply_* files in the /tmp directory,
>> but effectively no data is ever written and
>> opensips-cli freezes.
>>
>> We've restarted twice now to "resolve" it, but the issue
>> keeps reoccurring. We're using the mi get_statistics
>> command to monitor stats/health, and ul_dump to pull
>> some detail about usrloc data every 5 minutes.
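The MI traffic visible in the debug logs above is JSON-RPC 2.0. A minimal sketch of how a monitor could build those requests before writing them to the FIFO (the method names come from this thread; the request shape matches the logged commands, but the reply-file convention and any FIFO-side behavior are outside this sketch):

```python
import itertools
import json

_ids = itertools.count(1)

def build_mi_request(method, params=None):
    """Build a JSON-RPC 2.0 MI request string, like the ones visible in
    the mi_fifo debug logs (e.g. "which", "get_statistics", "ul_dump")."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": str(next(_ids)),  # the logs show string ids, e.g. "5319"
        "method": method,
        "params": params if params is not None else [],
    })

# e.g. the statistics poll from the logs:
print(build_mi_request("get_statistics", [["all"]]))
```

A monitoring script that writes such requests to /tmp/opensips_fifo would be well served by a write/read timeout, so a wedged FIFO process stalls one poll instead of freezing the monitor.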
>>
>> I don't think I can raise the debug level,
>> because the MI process doesn't seem to be
>> communicating (at least the command doesn't seem to work).
>>
>> opensips 3.1.3-1
>> opensips-cli 0.1~20210707~572d2db-
>> 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00
>> UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
>>
>> I can restart again, but if this is going to keep
>> happening, I want to make sure I have something
>> useful in the logs to be able to trace down.
>>
>> Andrew
>>
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.opensips.org <mailto:Users at lists.opensips.org>
>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users <http://lists.opensips.org/cgi-bin/mailman/listinfo/users>
>