[OpenSIPS-Devel] [OpenSIPS/opensips] 31352d: clusterer: Make the data sync interface more robust

Liviu Chircu noreply at github.com
Tue Jul 30 10:29:36 EDT 2019


  Branch: refs/heads/3.0
  Home:   https://github.com/OpenSIPS/opensips
  Commit: 31352d982b421f0c2a686d09bedf00a0bd1c44f2
      https://github.com/OpenSIPS/opensips/commit/31352d982b421f0c2a686d09bedf00a0bd1c44f2
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2019-07-30 (Tue, 30 Jul 2019)

  Changed paths:
    M modules/clusterer/sync.c

  Log Message:
  -----------
  clusterer: Make the data sync interface more robust

This patch improves the data sync interface so that during a sync,
modules are no longer forced to micro-manage the data packets they are
receiving from the interface.  They can now freely abort the processing
of a sync chunk at any time, without disrupting the processing of the
entire sync packet (which is composed of many such data chunks).

Additionally, since the sync packet format has changed (an extra integer
is needed for each chunk in order to allow the "skip" mechanism), the
sync packet version is now bumped from 1 to 2, in order to prevent any
compatibility issues with OpenSIPS nodes without this patch.
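
The "skip" mechanism described above can be illustrated with a minimal
Python sketch. It is not the clusterer code: the wire layout (a 32-bit
length in front of each chunk), the Cursor class and all function names
are assumptions made for illustration only. The point is that once each
chunk carries its own length, the sync layer can jump to the next chunk
no matter how much of the current one the module consumed.

```python
import struct


def pack_chunk(fields):
    """Serialize one chunk as [uint32 body length][uint32 fields...] (hypothetical layout)."""
    body = b"".join(struct.pack("!I", f) for f in fields)
    return struct.pack("!I", len(body)) + body


class Cursor:
    """Read position shared between the sync layer and the module handler."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def pop_int(self):
        (value,) = struct.unpack_from("!I", self.data, self.pos)
        self.pos += 4
        return value


def process_sync_packet(data, handle_chunk):
    """Feed every chunk to the handler, realigning the cursor after each one."""
    cur = Cursor(data)
    accepted = []
    while cur.pos < len(data):
        length = cur.pop_int()
        chunk_end = cur.pos + length
        result = handle_chunk(cur)
        if result is not None:
            accepted.append(result)
        # Skip whatever the handler left unread in this chunk.
        cur.pos = chunk_end
    return accepted


def handle_chunk(cur):
    """Example module handler: aborts early on an unknown chunk kind."""
    kind = cur.pop_int()
    if kind != 1:
        return None  # abort; the remaining fields stay unread
    return (cur.pop_int(), cur.pop_int())
```

In this sketch, a handler that bails out halfway through a chunk does
not need to drain the remaining fields itself; the next chunk is still
parsed from the correct offset.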

(cherry picked from commit 0b3ad435be73d7ef49c511d2b70039c39a883135)


  Commit: 60bac3c3abc3fb2ba6a51f6659a9ac22a1eafddc
      https://github.com/OpenSIPS/opensips/commit/60bac3c3abc3fb2ba6a51f6659a9ac22a1eafddc
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2019-07-30 (Tue, 30 Jul 2019)

  Changed paths:
    M bin_interface.c
    M bin_interface.h
    M modules/clusterer/api.h
    M modules/clusterer/clusterer.h
    M modules/clusterer/sync.c
    M modules/clusterer/sync.h
    M modules/dialog/dlg_replication.c
    M modules/usrloc/ul_cluster.c

  Log Message:
  -----------
  clusterer: Enhance the versioning of sync packets

This commit adds an additional "version" field for the sync packets,
which are more complex than the other ones.  Since they contain
serialization logic from two different layers (clusterer + data module),
they should also contain two version fields, to allow each module to
discard data coming from an OpenSIPS donor node running a different
binary version.
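
As a rough illustration of the two-layer versioning idea, here is a
minimal Python sketch. The header layout (two 16-bit fields), the
constant and the function names are hypothetical and do not reflect the
actual BIN packet format; the sketch only shows that each layer checks
its own version and that a mismatch on either one causes the data to be
discarded.

```python
import struct

CLUSTERER_SYNC_VERSION = 2        # hypothetical value for the clusterer layer
HEADER = struct.Struct("!HH")     # assumed layout: sync version, module data version


def build_sync_packet(module_version, payload):
    """Prefix the payload with both version fields."""
    return HEADER.pack(CLUSTERER_SYNC_VERSION, module_version) + payload


def read_sync_packet(packet, expected_module_version):
    """Return the payload, or None if either layer's version does not match."""
    sync_version, module_version = HEADER.unpack_from(packet)
    if sync_version != CLUSTERER_SYNC_VERSION:
        return None  # clusterer-level format differs: discard
    if module_version != expected_module_version:
        return None  # data module's format differs: discard
    return packet[HEADER.size:]
```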

(cherry picked from commit a20a0acb5d9e8ef75d4cf2bb081ed1d5d259a3dd)


  Commit: 57bbad55c13e3156dcebf95bcd8adf8e52e97ecd
      https://github.com/OpenSIPS/opensips/commit/57bbad55c13e3156dcebf95bcd8adf8e52e97ecd
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2019-07-30 (Tue, 30 Jul 2019)

  Changed paths:
    M modules/dialog/dlg_replication.c
    M modules/dialog/dlg_replication.h

  Log Message:
  -----------
  dialog replication: Prevent crashes due to differing packet versions

Commits 58dc435cb563 and 852e629e4700 changed the format of the dialog
binary data packets.  This would cause an immediate crash during a
rolling upgrade, since upon upgrading and restarting the backup node, it
would sync from or receive packets from a primary node running the older
version, with the previous data format.

This patch makes it so dialog packets which do not meet the expected
version are simply discarded, rather than being left to cause a crash.


  Commit: 82b87044e7d641ab307c8ceed65fc6c5fa0f710e
      https://github.com/OpenSIPS/opensips/commit/82b87044e7d641ab307c8ceed65fc6c5fa0f710e
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2019-07-30 (Tue, 30 Jul 2019)

  Changed paths:
    M modules/dialog/dlg_replication.c

  Log Message:
  -----------
  dialog replication: Revert the sync packet alignment code

... since now it is unnecessary, thanks to the sync layer enhancements.

(cherry picked from commit e33565342065298eeb542a73503f84402cc1076d)


  Commit: 765c521e4ba041236f234cc186cc978247783dc7
      https://github.com/OpenSIPS/opensips/commit/765c521e4ba041236f234cc186cc978247783dc7
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2019-07-30 (Tue, 30 Jul 2019)

  Changed paths:
    M modules/dialog/dlg_replication.c

  Log Message:
  -----------
  dialog sync: Do not include early or ended dialogs

(cherry picked from commit b8bde2f014e5b427c8517493a7ada16923d0928c)


  Commit: a6345cc2d1ff801d51510001df04278b1ae12810
      https://github.com/OpenSIPS/opensips/commit/a6345cc2d1ff801d51510001df04278b1ae12810
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2019-07-30 (Tue, 30 Jul 2019)

  Changed paths:
    M modules/dialog/dlg_db_handler.c
    M modules/dialog/dlg_hash.c
    M modules/dialog/dlg_hash.h
    M modules/dialog/dlg_replication.c

  Log Message:
  -----------
  dialog: Fix data reload race conditions on startup

Since the data is now loaded during child_init(), the
load_dialog_info_from_db() and rcv_cluster_event() routines could run in
parallel, without any synchronization on the dialog table, which could
lead to duplicate dialogs in the hash.
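
The commit message does not spell out the fix, so the following Python
sketch only illustrates the general hazard and one common remedy:
performing the "look up, then insert" step under a single lock, so that
two loaders (for example a DB loader and a cluster event receiver)
cannot both insert the same dialog. The DialogTable class and its
method are hypothetical.

```python
import threading


class DialogTable:
    """Toy dialog table whose entry list could hold duplicates if unguarded."""

    def __init__(self):
        self._lock = threading.Lock()
        self.entries = []  # list of (dialog_id, data) tuples

    def insert_if_absent(self, dialog_id, data):
        """Atomically check for an existing entry and insert only if missing."""
        with self._lock:
            if any(existing_id == dialog_id for existing_id, _ in self.entries):
                return False
            self.entries.append((dialog_id, data))
            return True


def load_all(table, source, dialog_ids):
    """Simulate one loader (DB or cluster sync) inserting a batch of dialogs."""
    for dialog_id in dialog_ids:
        table.insert_if_absent(dialog_id, source)
```

Without the lock, both loaders could pass the "not present" check for
the same dialog before either one inserts it.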

(cherry picked from commit aa93d0fbf369078f1c0e85fc10314fe7799aeca0)


  Commit: a468005019539d32613617e9bc549c1765e00764
      https://github.com/OpenSIPS/opensips/commit/a468005019539d32613617e9bc549c1765e00764
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2019-07-30 (Tue, 30 Jul 2019)

  Changed paths:
    M modules/dialog/dlg_db_handler.c
    M modules/dialog/dlg_handlers.c
    M modules/dialog/dlg_hash.h
    M modules/dialog/dlg_replication.c
    M modules/dialog/dlg_req_within.h

  Log Message:
  -----------
  dialog: Decrement dialog stats during post-sync cleanup

After a sync completes, the dialog module cleans up all dialogs loaded
from DB which did not match the data received via sync.  This patch
makes sure to also decrement the 'active' / 'early' dialog stats on each
delete.

(cherry picked from commit f88c41064c9d895e52c6a8dac62867246da3b253)


  Commit: ddcdab724f0da3ef1430353acceb8da686c8add1
      https://github.com/OpenSIPS/opensips/commit/ddcdab724f0da3ef1430353acceb8da686c8add1
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2019-07-30 (Tue, 30 Jul 2019)

  Changed paths:
    M modules/dialog/dlg_replication.c

  Log Message:
  -----------
  dialog sync: Fix ref miscount during post-sync cleanup

The hash reference must only be decremented a single time during the
lifetime of a dialog.  Given that multiple pieces of code may attempt
to delete a dialog concurrently (e.g. a BIN "delete" packet and the
post-sync cleanup routine), the only way to guarantee a single decrement
of the hash ref is by using the dialog state machine transition.

Iff we're the ones to transition from ACK -> DELETED, we can (and MUST)
also decrement the hash reference.
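
The "only the winner of the state transition decrements" rule can be
sketched in Python as follows. This is an illustration, not the dialog
module's code: the Dialog class, the state names and the lock are
assumptions. Several deleters race, but only the one that actually
moves the dialog into the DELETED state releases the hash reference.

```python
import threading


class Dialog:
    """Toy dialog holding a single hash reference (hypothetical model)."""

    def __init__(self):
        self.state = "CONFIRMED"
        self.hash_refs = 1
        self._lock = threading.Lock()

    def try_delete(self):
        """Return True only for the caller that performs the transition."""
        with self._lock:
            if self.state == "DELETED":
                return False  # someone else already won the transition
            self.state = "DELETED"
            # Only the winner of the transition drops the hash reference.
            self.hash_refs -= 1
            return True
```

Any number of concurrent callers may invoke try_delete(); the hash
reference is decremented exactly once.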

(cherry picked from commit 51b5ec3bac182104781a5e12287a217053170432)


  Commit: ba7516e92f4ba5ecf34234872d49ee2d07750888
      https://github.com/OpenSIPS/opensips/commit/ba7516e92f4ba5ecf34234872d49ee2d07750888
  Author: Liviu Chircu <liviu at opensips.org>
  Date:   2019-07-30 (Tue, 30 Jul 2019)

  Changed paths:
    M modules/dialog/dlg_db_handler.c
    M modules/dialog/dlg_db_handler.h
    M modules/dialog/dlg_replication.c

  Log Message:
  -----------
  dialog: Fix broken re-INVITE pinging after failover

The mandatory re-INVITE pinging data (SDP1, SDP2, ct1, ct2) was not
included in the BIN replication packets, so the pinging would stop
working once we failed over to the backup box, in an active/backup
HA scenario.

(cherry picked from commit 65a9f51f1ee43e500d6fbe34c3a0e07722bd75b2)


Compare: https://github.com/OpenSIPS/opensips/compare/9e951c904f70...ba7516e92f4b


