[OpenSIPS-Users] Fine tuning high CPS and MySQL queries

Saint Michael venefax at gmail.com
Thu Jun 4 21:54:53 EST 2020


Calvin, feel free to log in to the container and change the MariaDB
config; it is all in a single file, /etc/my.cnf.
I use TokuDB, a newer engine that is better than InnoDB. Lately, I have
shifted to RocksDB, which is even better and was designed by Facebook.
I have not updated that box because "if it ain't broke, don't fix it",
but I am open to changing the engine in the second container if you so
desire.
MariaDB is better than MySQL because it gives us a pool of threads,
something that MySQL only gives to paying customers. So I think it should
work.


On Thu, Jun 4, 2020 at 3:44 PM Jon Abrams <ffshoh at gmail.com> wrote:

> A) Is the LRN database located locally on the OpenSIPs box or is it remote?
> B) Have you tried only doing sync database queries? Async introduces some
> overhead, and I'm not sure if it causes extra database connections to be
> created. When using sync there is a connection per child process that stays
> up.
> C) Does the database have enough memory to contain the LRN and DNC
> datasets fully in memory? The extra latency for the non-cache hits sent to
> the database may stack up if the database has to hit disk.
> D) How many child processes are you using now? If you are hitting 100% you
> may need to increase them.
> E) Are your memcached processes using heavy cpu? If you are caching
> multiple lists, I've found it helps to use unique memcached instance per
> list.
> F) Look for memory related log messages. If the memory starts getting
> exhausted you will see defrag messages. This will chew up available
> computation cycles.
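>
> As a rough illustration of point E (a sketch, not a tested config): with
> cachedb_memcached, one way to give each list its own memcached instance
> is a separate cachedb_url per list. The "dnc" group name, the dnc-d and
> dnc-e host names, and port 11212 below are hypothetical placeholders.
>
> ```
> # existing LRN cache, as in the original post
> modparam("cachedb_memcached", "cachedb_url", "memcached:lrn://lrn-d,lrn-e/")
> # hypothetical second instance dedicated to the Do Not Call list
> modparam("cachedb_memcached", "cachedb_url", "memcached:dnc://dnc-d:11212,dnc-e:11212/")
> ```
>
> Script calls would then select the instance by its id, for example
> "memcached:lrn" or "memcached:dnc" as the first argument to cache_fetch().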
>
> - Jon Abrams
>
>
> On Thu, Jun 4, 2020 at 2:17 PM Calvin Ellison <calvin.ellison at voxox.com>
> wrote:
>
>> The scenario is INVITE -> MySQL query -> non-200 final response. No
>> calls are connected here, only dipping things like LRN, Do Not Call,
>> and Wireless/Landline. A similar service runs on a second port,
>> specific to a different kind of traffic and dip. We're using async
>> avp_db_query and memcached, with about 3:1 cache hits.
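>>
>> For illustration only, a minimal sketch of the flow described above
>> (this is not the actual opensips.cfg): try memcached first, and on a
>> miss run the query through async(). OpenSIPS 2.4 syntax is assumed.
>> The table and column names, the "lrn_" key prefix, the 24-hour expiry,
>> the route names, and the 603 reply are all hypothetical. loadmodule
>> and db_url lines are omitted.
>>
>> ```
>> route {
>>     if (!is_method("INVITE")) {
>>         exit;
>>     }
>>
>>     # cache hit: answer straight from memcached
>>     if (cache_fetch("memcached:lrn", "lrn_$rU", $avp(lrn))) {
>>         route(dip_reply);
>>         exit;
>>     }
>>
>>     # cache miss: run the MySQL query without blocking the worker;
>>     # processing continues in route[dip_resume] once the result is in
>>     async(
>>         avp_db_query("SELECT lrn FROM lrn_table WHERE number = '$(rU{s.escape.common})'", "$avp(lrn)"),
>>         dip_resume);
>> }
>>
>> route[dip_resume] {
>>     # $rc holds the return code of avp_db_query
>>     # (negative on error or when no rows were returned)
>>     if ($rc > 0) {
>>         cache_store("memcached:lrn", "lrn_$rU", "$avp(lrn)", 86400);
>>     }
>>     route(dip_reply);
>> }
>>
>> route[dip_reply] {
>>     # hypothetical non-200 final response; the real service would
>>     # encode the dip result here
>>     send_reply("603", "Declined");
>>     exit;
>> }
>> ```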
>>
>> Our target is up to 10,000 CPS across two opensips servers, which are
>> dual-CPU Xeon E5620 with 48G RAM. Both run memcached, and each
>> server uses both memcached instances to share a distributed cache
>> thanks to this:
>>
>> 'modparam("cachedb_memcached","cachedb_url","memcached:lrn://lrn-d,lrn-e/")'.
>> At a glance there are over 200 million total cached items, distributed
>> nearly equally between the two instances.
>>
>> The issue is that individual child processes start getting stuck at
>> 100% CPU. Logs indicate connection failures to the MySQL database
>> causing children to run in sync mode, and there are warnings about
>> delayed timer jobs tm-timer and blcore-expire. Eventually, the service
>> becomes unresponsive. Restarting opensips restores service and the
>> children return to single-digit CPU utilization, but eventually,
>> children get stuck again.
>>
>> I'm not certain if the issue is on the database server, or if the
>> opensips servers are overloaded, or if the config is just not right
>> yet.
>>
>> Is there an established method for fine-tuning these things?
>> shared memory
>> process memory
>> children
>> db_max_async_connections
>> listen=... use_children
>> modparam("tm", "timer_partitions", ?)
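>>
>> For orientation, this is roughly where each of the knobs above lives
>> in OpenSIPS 2.4. All values, the config path, and the listen address
>> are placeholders for illustration, not tuning recommendations.
>>
>> ```
>> # shared and per-process (pkg) memory are command-line options, in MB:
>> #   opensips -m 4096 -M 16 -f /etc/opensips/opensips.cfg
>>
>> # worker processes per UDP listener
>> children=16
>> # cap on async SQL connections opened by each worker
>> db_max_async_connections=32
>> # per-listener override of the global children value
>> listen=udp:203.0.113.10:5060 use_children 16
>> modparam("tm", "timer_partitions", 4)
>> ```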
>>
>> What else is worth considering?
>> Does a child ever return to async mode after running in sync mode?
>> How do I know when my servers have reached their limit?
>> opensips.cfg is available on request.
>>
>> version: opensips 2.4.7 (x86_64/linux)
>> flags: STATS: On, DISABLE_NAGLE, USE_MCAST, SHM_MMAP, PKG_MALLOC,
>> F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
>> ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16,
>> MAX_URI_SIZE 1024, BUF_SIZE 65535
>> poll method support: poll, epoll, sigio_rt, select.
>> git revision: 9e1fcc915
>> main.c compiled on  with gcc 7
>>
>> *re-built using dpkg-buildpackage including the patch to support DB
>> floating point types:
>> https://opensips.org/pipermail/users/2020-March/042528.html
>>
>> $ lsb_release -d
>> Description:    Ubuntu 18.04.4 LTS
>>
>> $ uname -a
>> Linux TC-521 4.15.0-91-generic #92-Ubuntu SMP Fri Feb 28 11:09:48 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
>>
>> $ free -mw
>>               total        used        free      shared     buffers       cache   available
>> Mem:          48281        1085         337          87        1729       45128       46551
>>
>> $ lscpu
>> Architecture:        x86_64
>> CPU op-mode(s):      32-bit, 64-bit
>> Byte Order:          Little Endian
>> CPU(s):              16
>> On-line CPU(s) list: 0-15
>> Thread(s) per core:  2
>> Core(s) per socket:  4
>> Socket(s):           2
>> NUMA node(s):        2
>> Vendor ID:           GenuineIntel
>> CPU family:          6
>> Model:               44
>> Model name:          Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
>> Stepping:            2
>> CPU MHz:             2527.029
>> BogoMIPS:            4788.05
>> Virtualization:      VT-x
>> L1d cache:           32K
>> L1i cache:           32K
>> L2 cache:            256K
>> L3 cache:            12288K
>> NUMA node0 CPU(s):   0,2,4,6,8,10,12,14
>> NUMA node1 CPU(s):   1,3,5,7,9,11,13,15
>> Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr
>> pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
>> syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
>> rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq
>> dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca
>> sse4_1 sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow
>> vnmi flexpriority ept vpid dtherm ida arat flush_l1d
>>
>> Regards,
>>
>> Calvin Ellison
>> Senior Voice Operations Engineer
>> calvin.ellison at voxox.com
>>
>> _______________________________________________
>> Users mailing list
>> Users at lists.opensips.org
>> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>>