[OpenSIPS-Users] Fine tuning high CPS and mysql queries

Sat Jun 13 01:42:03 EST 2020

On 6/12/20 9:11 PM, Calvin Ellison wrote:

> You suggested to "Monitor your receive queue scrupulously at a very high 
> timing resolution". How do I do this?

If using pre-systemd systems, e.g. EL6:

# netstat --inet -n -l | grep 5060

If it's systemd and beyond -- I'm sure Ubuntu Server 18 is, though I 
have no experience with it:

# ss -4nl | grep 5060

Example:

---
[root@allegro-1 ~]# ss -4nl | grep 5060
udp    UNCONN     0      0      10.150.20.5:5060                  *:*
udp    UNCONN     0      0      10.150.20.2:5060                  *:*
udp    UNCONN     0      0      209.51.167.66:5060                  *:*
tcp    LISTEN     0      128    10.150.20.2:5060                  *:*
---

The third column there (all zeros) is the Recv-Q, as can be seen from the 
header this command prints:

---
Netid  State      Recv-Q Send-Q Local Address:Port               Peer Address:Port
---

For `netstat`, it would be the second column.

To monitor it at a short interval, for example 200 ms (5 times per sec), 
you could do something like:

---
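#!/bin/bash
# Print a millisecond timestamp and the Recv-Q (column 3 of `ss` output)
# of the first socket matching 5060, five times per second.

while : ; do
	echo -n "$(date +"%T.%3N"): "
	ss -4nl | grep 5060 | head -1 | awk '{print $3}'
	sleep 0.2
done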
---

That should give you some idea of where the value sits in general.
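
If you have several listeners on 5060, a variation on the loop above is to report the largest Recv-Q across all of them rather than only the first line. This is only a sketch: `max_recvq` is a name I made up for illustration, and it assumes the `ss` column layout shown earlier (Recv-Q in column 3, local address in column 5). Note that for the TCP LISTEN socket the Recv-Q column means something different (the accept backlog), so treat a mixed maximum with care.

```shell
#!/bin/bash
# Hypothetical helper (not part of ss or OpenSIPS): reads `ss -4nl`-style
# output on stdin and prints the largest Recv-Q among sockets whose local
# port is exactly the given port (default 5060).
max_recvq() {
    local port="${1:-5060}"
    awk -v port="$port" '
        # $3 = Recv-Q, $5 = Local Address:Port (assumed ss layout with Netid column)
        $5 ~ (":" port "$") && $3 ~ /^[0-9]+$/ {
            if ($3 + 0 > max) max = $3 + 0
        }
        END { print max + 0 }
    '
}

# Example use, sampling five times per second:
#   while : ; do
#       echo "$(date +"%T.%3N"): $(ss -4nl | max_recvq 5060)"
#       sleep 0.2
#   done
```

Matching on `:5060` at the end of the address field avoids picking up unrelated ports such as 15060, which a bare `grep 5060` would also match.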

> You propose there is a pathological issue and the increased buffer size 
> is masking it. How do I determine what that issue is?

Without knowing what your exact routing workflow is, I can't say.

However, 99.9% of the time, the culprit is blocking queries to databases 
or other external data sources.
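
To see why blocking queries matter so much, some back-of-the-envelope arithmetic helps. The sketch below assumes each child blocks synchronously for the full duration of one query per request and does nothing else in the meantime; the function name and the numbers are purely illustrative, not measurements from any system.

```shell
#!/bin/bash
# Hypothetical helper, not an OpenSIPS tool.
# max_blocking_rate CHILDREN LATENCY_MS
# Upper bound on requests/sec if every request blocks one child for
# LATENCY_MS milliseconds (integer arithmetic).
max_blocking_rate() {
    local children="$1" latency_ms="$2"
    echo $(( children * 1000 / latency_ms ))
}

# Illustrative numbers only:
#   max_blocking_rate 16 20    # 16 children, 20 ms per query -> 800
#   max_blocking_rate 16 100   # same children, 100 ms per query -> 160
```

Under those assumptions, once the arrival rate exceeds that ceiling, packets can only pile up in the receive queue, and a bigger buffer just postpones the drops.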

> I've asked repeatedly about children, shared memory, process 
> memory, timer_partitions, etc. but the only answers have been "try 
> more". I've been trying more and less of these things two weeks and 
> changing the buffers was the only thing that appeared to have any 
> immediate impact. How do I know when enough is enough versus too much?

I wrote this article several years ago for Kamailio, but the same basic 
considerations apply to OpenSIPS:

http://www.evaristesys.com/blog/tuning-kamailio-for-high-throughput-and-performance/

"Try more" is definitely not the answer except in cases where the 
workload is overwhelmingly network I/O-bound and/or database-bound. 
If it were, the most natural course of action would be to spawn a 
functionally infinite number of children. However, children create 
context switches, contend with each other for CPU time (less a concern 
if most of the workload is waiting on blocking external I/O) and fight 
for various global shared memory structures and locks (still a concern 
regardless). So, there is a point of diminishing returns for any given 
workload. All other things being equal, as per the article, the 
reasonable number of child processes is equal to the number of available 
CPU threads (in /proc/cpuinfo). This number can be increased if the 
workload is very I/O-bound, but only to a point. It's hard to say 
exactly what that point is, and it does have to be empirically 
determined, but I would not run more than 2 * (CPU threads).
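
As a rough sketch of that rule of thumb, the following counts the CPU threads listed in /proc/cpuinfo and prints the baseline and the 2x ceiling. `suggest_children` is a made-up name, and it assumes the usual Linux format in which each logical CPU has a line beginning with `processor`.

```shell
#!/bin/bash
# Hypothetical helper: derive a starting point for the number of children
# from the number of CPU threads. Takes an optional path so it can be
# pointed at a saved copy of cpuinfo.
suggest_children() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    local threads
    # One "processor : N" line per logical CPU (assumed Linux layout)
    threads=$(grep -c '^processor' "$cpuinfo")
    echo "threads=$threads baseline=$threads ceiling=$(( threads * 2 ))"
}

# Example use:
#   suggest_children
```

The baseline is where I would start; anything between it and the ceiling still has to be settled by testing under your own workload.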

> Note, there have been no memory-related log messages. The 16-thread 
> servers have 48GB RAM and the 8-thread servers have 16GB. I'm happy to 
> give all that to OpenSIPS once I know the right way to carve it up.

I see no rationale for giving it all to OpenSIPS.

It's worth bearing in mind that there are two kinds of memory allocations:

- Shared memory, used by the system for global/system-wide data 
constructs, such as transaction memory, dialog state, etc.;

- Package memory, memory that is private to each process and used for 
handling the immediate message. That means every child process 
pre-allocates the package memory requested, so this value should of 
course be much, much smaller than your shared memory pool size.

But still, when you consider all the data that OpenSIPS needs to keep in 
the course of call processing, a lot of it is ephemeral and 
transaction-associated. Once the call is set up, the INVITE transaction 
is disposed of. Other call state may add up to a few kilobytes per call at 
most (notwithstanding page sizes and blocks in the underlying 
allocator), but nothing on the order of gigabytes upon gigabytes. 
Assuming 4 KB per call and 200,000 concurrent calls, that's ~800 MB, and 
that is a very generous assumption indeed.
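
The arithmetic above, plus the per-process multiplication for package memory, can be sketched as follows. Both function names are made up, the 8 MB package size in the example is an arbitrary illustrative figure, and megabytes are taken as 1000 KB to match the round number above.

```shell
#!/bin/bash
# Hypothetical helpers for sizing estimates; not OpenSIPS tools.

# shm_estimate_mb PER_CALL_KB CONCURRENT_CALLS
# Shared memory needed for call state, in MB (1 MB = 1000 KB here).
shm_estimate_mb() {
    local per_call_kb="$1" calls="$2"
    echo $(( per_call_kb * calls / 1000 ))
}

# pkg_total_mb CHILDREN PKG_MB
# Package memory is private to each process, so the total is multiplied
# by the number of children.
pkg_total_mb() {
    local children="$1" pkg_mb="$2"
    echo $(( children * pkg_mb ))
}

# Example use:
#   shm_estimate_mb 4 200000   # 4 KB per call, 200,000 calls -> 800
#   pkg_total_mb 16 8          # 16 children, 8 MB each (illustrative) -> 128
```

Even with generous inputs, the total stays far below the 48 GB on those machines.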

-- Alex

-- 
Alex Balashov | Principal | Evariste Systems LLC

Tel: +1-706-510-6800 / +1-800-250-5920 (toll-free)
Web: http://www.evaristesys.com/, http://www.csrpswitch.com/