[OpenSIPS-Users] ERROR:core:io_watch_del: BUG - trying to del fd 38 with flags 2 1

Gupta, Rahul Rahul.Gupta at ipc.com
Thu Aug 6 23:12:50 CEST 2015


Hi Bogdan, I have created an issue with opensips github

https://github.com/OpenSIPS/opensips/issues/591

And uploaded the logs specified in the link with the description along with timestamps.

Thanks
Rahul Gupta

From: Bogdan-Andrei Iancu [mailto:bogdan at opensips.org]
Sent: Tuesday, August 04, 2015 10:51 AM
To: Gupta, Rahul; users at lists.opensips.org
Subject: Re: ERROR:core:io_watch_del: BUG - trying to del fd 38 with flags 2 1

Hi Rahul,

Any logs will be  useful (use some pastebin or file transfer to upload).

On your observations : when a connection timeout does expire (via __tcpconn_lifetime()), reactor_del_all() removes the fd with both IO_WATCH_READ and IO_WATCH_WRITE flags. While in the logs it shows "1", removing with READ only.

Going back


Bogdan-Andrei Iancu

OpenSIPS Founder and Developer

http://www.opensips-solutions.com
On 04.08.2015 16:29, Gupta, Rahul wrote:
Hi Bogdan,

I tried the scenario with 1.11.3 as well as 1.11.5 latest code and could reproduce it consistently. I can turn on the higher debug level logging and provide you the log file. Let me know where can I upload these logs ?

Thanks
Rahul

From: Bogdan-Andrei Iancu [mailto:bogdan at opensips.org]
Sent: Tuesday, August 04, 2015 8:18 AM
To: Gupta, Rahul; users at lists.opensips.org<mailto:users at lists.opensips.org>
Subject: Re: ERROR:core:io_watch_del: BUG - trying to del fd 38 with flags 2 1

Hi Rahul,

I tried to reproduce your scenario during the last 2 days. Thanks to the detailed instructions and stepping, I mananged to get (as described) to the last step. But instead of getting the error you mentioned, the tcp conn timeout event was properly handled:

[23203] DBG:core:__tcpconn_lifetime: timeout for hash=3 - 0x7f94f298fea0 (322 > 317)
[23203] DBG:core:io_watch_del: [TCP_main] io_watch_del op on index -1 29 (0x8828a0, 29, -1, 0x10,0x3) fd_no=22 called

Are you sure you are using the latest GIT code (I tested both 2.1 and 1.11).

Regards,




Bogdan-Andrei Iancu

OpenSIPS Founder and Developer

http://www.opensips-solutions.com
On 31.07.2015 17:40, Gupta, Rahul wrote:
Hi Bogdan, thanks for looking into this issue. Let me know if anyway I can help to get to the bottom of this issue, like providing logs etc

Thanks
Rahul

From: Bogdan-Andrei Iancu [mailto:bogdan at opensips.org]
Sent: Tuesday, July 28, 2015 1:13 PM
To: Gupta, Rahul; users at lists.opensips.org<mailto:users at lists.opensips.org>
Subject: Re: ERROR:core:io_watch_del: BUG - trying to del fd 38 with flags 2 1

Hi Rahul,

These are great news - that you find a way to reproduce it. Let me do it and start debugging.

I really do appreciate your effort in dealing with this error.

Regards,




Bogdan-Andrei Iancu

OpenSIPS Founder and Developer

http://www.opensips-solutions.com
On 24.07.2015 00:19, Gupta, Rahul wrote:
Hi Bogdan,

Almost a month ago, I have raised the issue regarding the opensips log file filling up with the following message

ERROR:core:io_watch_del: BUG - trying to del fd 38 with flags 2 1

Now I can reproduce it every time with the following scenario:


1)      UAC and UAS are connected to opensips proxy with TCP transport

2)      Connections looks good

3)      Netstat shows the connection to UAC in ESTABLISHED state and the tcp_conn_lists of opensips also looks fine for that UAC

4)      Now unplug the Ethernet cable on UAC

5)      After tcp connection timeout (set to 5 mins in opensips.cfg), the tcp connection goes away from netstat as well as from tcp_conn_lists

6)      UAS tries another tcp call to the UAC which is still unplugged

7)      tcp_conn_lists shows the tcp connection to the UAC and netstat shows the connection in SYN_SENT state

8)      After the tcp connection timeout (set to 5 mins in opensips.cfg), the connection goes away from netstat however it remains there in tcp_conn_lists and at that moment the "BUG - trying to del fd 38 with flags 2 1" starts printing in infinite loop.


I looked at the source code and observed the following


1)      When the network cable is plugged in

The io_watch_add happens with flag IO_WATCH_READ in tcp_main.c  when the command is CONN_NEW

2)      When the network cable is unplugged

The io_watch_add happens with flag IO_WATCH_WRITE in tcp_main.c  when the command is ASYNC_CONNECT


3)      While doing io_watch_del after timeout, from handle_tcpconn_ev method, io_watch_del always uses IO_WATCH_READ to delete the fd, which gives this error in case of unplugged cable.

Please look into this and suggest how can I fix this issue ?

Thanks
Rahul Gupta






------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

DISCLAIMER: This e-mail may contain information that is confidential, privileged or otherwise protected from disclosure. If you are not an intended recipient of this e-mail, do not duplicate or redistribute it by any means. Please delete it and any attachments and notify the sender that you have received it in error. Unintended recipients are prohibited from taking action on the basis of information in this e-mail.E-mail messages may contain computer viruses or other defects, may not be accurately replicated on other systems, or may be intercepted, deleted or interfered with without the knowledge of the sender or the intended recipient. If you are not comfortable with the risks associated with e-mail messages, you may decide not to use e-mail to communicate with IPC. IPC reserves the right, to the extent and under circumstances permitted by applicable law, to retain, monitor and intercept e-mail messages to and from its systems.



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.opensips.org/pipermail/users/attachments/20150806/3bc38c32/attachment-0001.htm>


More information about the Users mailing list