[OpenSIPS-Devel] [OpenSIPS/opensips] 28e00f: core: Clean up outdated SIGKILL-based shutdown cod...

Bogdan Andrei IANCU bogdan at opensips.org
Mon Jul 23 10:42:15 EDT 2018


  Branch: refs/heads/master
  Home:   https://github.com/OpenSIPS/opensips
  Commit: 28e00f74ffde8ec855c27abac2d41e2d40cc5e61
      https://github.com/OpenSIPS/opensips/commit/28e00f74ffde8ec855c27abac2d41e2d40cc5e61
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2018-07-03 (Tue, 03 Jul 2018)

  Changed paths:
    M main.c

  Log Message:
  -----------
  core: Clean up outdated SIGKILL-based shutdown code

When not in "debug" mode (i.e. when opensips successfully daemonizes),
the kill_all_children(SIGKILL) function actually sends SIGKILL to the
entire process group, including the calling process itself.  In this
case, all code that follows the call is _dead_ code.

This includes the final cleanup() sequence, which would be desirable to
run even during an emergency shutdown (e.g. OpenSIPS deadlocked,
SIGKILL'ed all its children, yet managed to do a last-second DB dump of
all dialogs and contacts).

This patch fixes the following:

    * rewrite kill_all_children() to actually do what it says: "only
      kill the children".  No need to send kill() to the pgid.

    * remove the alarm()+wait() logic around kill_all_children(SIGKILL).
      Even if opensips is not daemonized (so the SIGKILL only gets sent
      to each enumerated process), if we somehow failed to enumerate
      all processes within the for() loop of the first call, installing
      a signal handler for yet another kill_all_children(SIGKILL) call
      is pretty much useless -- we would still stay stuck in
      while(wait()).

    * remove the now-unused sig_alarm_kill() function

    * improve function docs


  Commit: 9c5b1b9503af2e1b76ca9e8f4c5b9b187697e895
      https://github.com/OpenSIPS/opensips/commit/9c5b1b9503af2e1b76ca9e8f4c5b9b187697e895
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2018-07-04 (Wed, 04 Jul 2018)

  Changed paths:
    M config.h
    M main.c
    M pt.h

  Log Message:
  -----------
  core: Greatly reduce chance of truncated corefiles

Before commit 8073d4de8ed, the SIGTERM signal delivered by the attendant
process to a process that had already crashed and was dumping its
memory inside the SIGSEGV handler would stay queued and not disrupt the
generation of the corefile.  At worst, corefile generation would take
more than "shutdown_time" seconds (60), and OpenSIPS would SIGKILL its
entire process group and terminate for good.

Now, OpenSIPS behavior has changed to:
    * attendant broadcasts a graceful IPC shutdown job
    * attendant gives _everyone_ a grand total of 5 seconds to finish
    * if not ready -> SIGKILL the entire instance

This behavior is too restrictive for the average spinning disk
performance.  Coupled with the fact that SHM pools of 2+ GB are common
nowadays AND the fact that, occasionally, corefiles come in multiples, 5
seconds is not a lot of time.

This patch fixes the behavior to:
    * attendant broadcasts a graceful IPC shutdown job
    * attendant gives non-crashed processes 5 seconds to obey
    * if not ready -> SIGKILL any processes which are still running
    * perform cleanup(60 sec) (leave the core-dumping processes alone)
    * exit or abort cleanup (leave the core-dumping processes alone)

Thanks to Bogdan for significantly shaping the design of this patch.


  Commit: 5e154cb8f9ccb5b1ac4679d8342eb9718a636cf4
      https://github.com/OpenSIPS/opensips/commit/5e154cb8f9ccb5b1ac4679d8342eb9718a636cf4
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2018-07-04 (Wed, 04 Jul 2018)

  Changed paths:
    M packaging/debian/opensips.service
    M packaging/redhat_fedora/opensips.service

  Log Message:
  -----------
  systemd: Properly shut down OpenSIPS 2.4+ instances

OpenSIPS 2.4 is meant to be shut down gracefully, in order to minimize
any form of data corruption caused by partial processing of SIP
messages.  Running "killall opensips" has a good chance of deadlocking
OpenSIPS for the next 45 seconds, especially if it's doing a lot of
logging.  This is often the case with the current systemd logic.

This patch updates the opensips systemd service files so they only
deliver SIGTERM to the attendant process.


  Commit: 78a9ba6f193b3de25c6a6a948de02daa5752d853
      https://github.com/OpenSIPS/opensips/commit/78a9ba6f193b3de25c6a6a948de02daa5752d853
  Author: Bogdan Andrei IANCU <bogdan at opensips.org>
  Date:   2018-07-23 (Mon, 23 Jul 2018)

  Changed paths:
    M config.h
    M main.c
    M packaging/debian/opensips.service
    M packaging/redhat_fedora/opensips.service
    M pt.h

  Log Message:
  -----------
  Merge pull request #1380 from liviuchircu/bugfix/fix-corefile-generation

Bugfix/fix corefile generation


Compare: https://github.com/OpenSIPS/opensips/compare/058cc22cb55d...78a9ba6f193b