[OpenSIPS-Devel] Question on tm timers handling

Mon Apr 4 05:16:58 UTC 2022

Hello all,

The tm module handles all its internal timers via two handlers:
 - timer_routine (second based timers)
 - utimer_routine (100ms based timers)
Each of these routines handles 4 different timers.
Both routines are very similar in functionality and there is no timer
that is handled by both routines.
Because both routines are protected by the same lock
(timertable[(long)set].ex_lock), these two routines cannot run in
parallel (assuming that we have only one set, i.e. a single
timer_partition).

In my testing, I noticed that the tm_utimer routine has difficulties
running smoothly.
After doing more testing and some profiling, it looks like the culprit
is the WT_TIMER.
With around 10-15K records in the WT_TIMER detached timer list, we
spend around 3ms creating the list and 200-300ms in
run_handler_for_each. Because of this, tm_utimer (which is scheduled
to run every 100ms) blocks on the lock during its first run. On the
next tick, the scheduler detects that the previous run has not
finished (it is still waiting for the lock) and therefore issues the
famous "already scheduled" warning.

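To make the timing concrete, here is a rough model of the scenario
above. It is only a sketch, not OpenSIPS scheduler code: the function
name count_skipped_ticks and its parameters are hypothetical, and it
assumes the lock is held for one continuous interval starting at t=0
and that the utimer routine's own run time is negligible. For example,
count_skipped_ticks(250, 100, 50) models a 250ms hold with 100ms ticks
starting at t=50ms: the tick at 50ms blocks on the lock and the tick
at 150ms is skipped.

```c
/*
 * Hypothetical model: the lock is held on [0, lock_hold_ms).
 * Ticks fire at first_tick_ms, first_tick_ms + period_ms, ...
 * The first tick that lands inside the hold starts a run that blocks
 * on the lock; every later tick that also lands inside the hold finds
 * that run still pending and is skipped ("already scheduled").
 * Returns the number of skipped ticks, or -1 on invalid input.
 */
int count_skipped_ticks(int lock_hold_ms, int period_ms, int first_tick_ms)
{
    int skipped = 0;
    int pending = 0;   /* 1 once a run is blocked waiting for the lock */
    int t;

    if (period_ms <= 0 || first_tick_ms < 0)
        return -1;

    for (t = first_tick_ms; t < lock_hold_ms; t += period_ms) {
        if (pending)
            skipped++;     /* previous run has not finished yet */
        else
            pending = 1;   /* this run starts and blocks on the lock */
    }
    return skipped;
}
```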
The check_and_split_time_list function has its own locks, and each
handler then operates on its own list (with locks for dealing with
cells), so why do we have the timertable[(long)set].ex_lock?
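
A minimal sketch of the pattern as I understand it, with no outer lock:
the per-list lock is held only while the expired entries are split off,
and the handlers then walk the detached list privately. All names here
(timer_entry, timer_list, detach_expired, run_handlers) are
hypothetical stand-ins and not the tm module's real types or
functions, and the C11 atomic_flag spinlock merely stands in for
whatever lock primitive the module uses. It does not show that
dropping ex_lock is safe; it only illustrates the question.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the tm module's real structures. */
struct timer_entry {
    unsigned long expires;        /* absolute expiry, in ticks */
    struct timer_entry *next;
    int handled;                  /* set by the handler below */
};

struct timer_list {
    atomic_flag lock;             /* per-list lock */
    struct timer_entry *head;     /* kept sorted by ascending expiry */
};

static void list_lock(struct timer_list *l)
{
    while (atomic_flag_test_and_set_explicit(&l->lock, memory_order_acquire))
        ;                         /* spin until the flag is free */
}

static void list_unlock(struct timer_list *l)
{
    atomic_flag_clear_explicit(&l->lock, memory_order_release);
}

/* Split off every entry with expires <= now. The list lock is held
 * only for the split itself, not while handlers run. */
struct timer_entry *detach_expired(struct timer_list *l, unsigned long now)
{
    struct timer_entry *first, *last = NULL, *e;

    list_lock(l);
    first = l->head;
    for (e = l->head; e != NULL && e->expires <= now; e = e->next)
        last = e;
    if (last != NULL) {
        l->head = last->next;     /* remainder stays on the live list */
        last->next = NULL;        /* terminate the detached list */
    } else {
        first = NULL;             /* nothing has expired yet */
    }
    list_unlock(l);
    return first;
}

/* Walk the detached list without any list-level lock; returns the
 * number of entries handled. */
int run_handlers(struct timer_entry *detached)
{
    int n = 0;

    while (detached != NULL) {
        struct timer_entry *next = detached->next;
        detached->next = NULL;
        detached->handled = 1;    /* placeholder for the real handler */
        n++;
        detached = next;
    }
    return n;
}
```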

I removed the lock and tested with a single timer_partition, then with
two timer_partitions, and performance improved dramatically. Is there
a reason for keeping this lock, or is it something that was inherited
and nobody bothered to check why it is there and remove it?

Thanks,
Ovidiu

-- 
VoIP Embedded, Inc.
http://www.voipembedded.com