[OpenSIPS-Users] opensips HA resource script (for Heartbeat)
Iñaki Baz Castillo
ibc at aliax.net
Tue Dec 28 11:59:21 CET 2010
2010/12/28 Alexandr A. Alexandrov <shurrman at gmail.com>:
> Hi, All.
>
> This is an issue of writing a correct script, nothing more. :-)
> There are several possibilities, starting from a simple process lookup (like
> pgrep -f opensips) and ending with using MI from such a script.
No, this is a bug in opensips itself since, when running daemonized,
the process returns 0 even if the daemonized (main) process fails to
start (due to any module configuration error).
Any exotic check you add after executing the binary is just a
workaround. Any service/daemon MUST return an accurate exit status
code, so other applications (e.g. HA) can rely on such a value.
>> This makes OpenSIPS not valid for full HA environment, so be careful.
>
> I will make my opensips valid
Can I ask how? Imagine your "dbaliases" module accesses a different
database, and that database server is "protected" with iptables
dropping any incoming TCP connection.
You run opensips and the "dbaliases" module tries to establish the
connection with the DB server. It could take a LONG time until it raises
a timeout error (maybe minutes). After that time the main process
dies, but until that moment the main process was still running. If
your "valid" init/LSB/OCF script checks the process status 5 seconds
after calling the binary, it would return SUCCESS status (while in
fact, opensips will die soon). There is no perfect workaround here. The
daemon itself MUST return a real and accurate code.
NOTE: A way to improve it (in OpenSIPS code):
When invoking "opensips", the parent process opens a PIPE for reading,
and the daemonized process opens it for writing. The parent process
waits until the daemonized process writes its startup status into the
PIPE, and then exits using that status as its own exit code. This
is already implemented in Kamailio/SIP-router.
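To make the idea concrete, here is a minimal sketch in C of the PIPE
handshake described above. It is not the Kamailio/SIP-router code, just
an illustration under my own assumptions: start_daemon(), init_ok() and
init_fail() are hypothetical names, and the init callback stands in for
whatever module/config initialization the daemon really does.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-ins for module initialization (0 means success). */
static int init_ok(void)   { return 0; }
static int init_fail(void) { return -1; }

/*
 * Fork a daemonized child and report ITS startup result to the caller.
 * Returns 0 only if the child finished init() successfully, 1 otherwise.
 */
static int start_daemon(int (*init)(void))
{
    int fds[2];
    unsigned char status = 1;   /* assume failure until the child says otherwise */
    pid_t pid;

    if (pipe(fds) < 0)
        return 1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return 1;
    }

    if (pid > 0) {
        /* Parent: keep only the read end and block until the child
         * writes one status byte, or dies (read() then returns 0). */
        ssize_t n;
        close(fds[1]);
        n = read(fds[0], &status, 1);
        close(fds[0]);
        return (n == 1) ? status : 1;
    }

    /* Child: keep only the write end and detach from the terminal. */
    close(fds[0]);
    setsid();

    /* This may block for a long time (e.g. a DB connection timeout);
     * the parent simply keeps waiting on the PIPE meanwhile. */
    status = (init() == 0) ? 0 : 1;
    if (write(fds[1], &status, 1) != 1)
        _exit(1);
    close(fds[1]);

    if (status != 0)
        _exit(1);

    /* The real daemon main loop would run here; the sketch just exits. */
    _exit(0);
}
```

With something like this, main() in the parent would end with
"return start_daemon(init_ok);", so the init/LSB/OCF script gets a
non-zero exit code whenever initialization fails, no matter how long
the failure takes to show up. No sleep-and-check is needed in the script.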
Regards.
--
Iñaki Baz Castillo
<ibc at aliax.net>