[OpenSIPS-Users] Fine tuning high CPS and MySQL queries
Jon Abrams
ffshoh at gmail.com
Wed Jun 10 21:13:30 EST 2020
I built a similarly functioning platform back in 2015 on similar
hardware (Westmere Xeons, hyperthreading enabled) running bare metal on
CentOS 6. At some point we bumped it up to dual X5670s (a cheap upgrade
nowadays), and it was handling 12000 CPS peaks on 1 server with 3000-5000
CPS sustained for large parts of the day. I don't think you are too far off
in hardware.
This was on version 1.9, so there was no Async. IIRC it was either 32 or 64
children. Async requires the TM module which adds additional overhead and
memory allocation.
The LRN database was stored in MySQL with a very simple table (TN, LRN) to
keep memory usage down so that it could be pinned in memory (server had 48
or 72GB I think). MySQL was set to dump the InnoDB buffer pool to disk on
shutdown and reload it at startup, so that the whole database would be back
in memory after a restart. Doing a full table scan would initially populate
the MySQL cache.
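
To check whether a (TN, LRN) table can plausibly be pinned in memory, a
rough estimate like the one below may help. This is only a sketch: the
column widths, per-row overhead, page fill factor, row count and pool size
are all assumptions, not figures from the platform described above. On
MySQL 5.6 and later, the dump and reload behavior is presumably what the
innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup
options provide.

```python
# Rough sizing sketch; every constant below is an assumption, not a measurement.


def estimate_table_bytes(rows, tn_bytes=8, lrn_bytes=8,
                         row_overhead_bytes=20, page_fill=0.7):
    """Estimate the in-memory size of a two-column (TN, LRN) InnoDB table.

    tn_bytes / lrn_bytes: assumed column widths (8 bytes, e.g. BIGINT).
    row_overhead_bytes:   assumed per-row header and bookkeeping fields.
    page_fill:            assumed fraction of each page holding row data.
    """
    raw = rows * (tn_bytes + lrn_bytes + row_overhead_bytes)
    return int(raw / page_fill)


def fits_in_buffer_pool(rows, buffer_pool_gib, headroom=0.8):
    """Return True if the estimate fits within `headroom` of the pool."""
    pool_bytes = buffer_pool_gib * 1024 ** 3
    return estimate_table_bytes(rows) <= pool_bytes * headroom


if __name__ == "__main__":
    rows = 600_000_000  # hypothetical row count
    gib = estimate_table_bytes(rows) / 1024 ** 3
    print(f"~{gib:.1f} GiB estimated for {rows:,} rows")
    print("fits in a 40 GiB pool:", fits_in_buffer_pool(rows, 40))
```

Measuring the real table and index size on the server is more reliable
than any estimate; the sketch is only useful for a first sanity check.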
Blacklists and other smaller datasets were stored in OpenSIPS using the
userblacklist module. There are better ways to do that in version 2 and
onwards. Bigger lists were stored in memcached. I prefer Redis for this
purpose now.
I would suggest simplifying testing by using a single MySQL server and
bypassing the F5 to eliminate that as a source of connection problems or
additional latencies.
In the OpenSIPS script, eliminate everything but 1 dip, probably just the
LRN dip to start.
Performance test the stripped down scenario with sipp. Based on past
experience, you should be able to hit or come close to your performance
goal with only 1 dip in play.
If you do hit your performance targets, keep adding more dips one by one
until it breaks.
If you can't reach your performance target with this stripped down
scenario, then I'd suggest testing with async and transactions disabled.
I wouldn't think transactions would be a necessity in this scenario. I ran
into CPS problems on that other open source SIP server when using async
under heavy load. The transaction creation was chewing up CPU and memory.
I'm not sure how different the implementation is here.
I start having problems with sipp when I hit a few thousand CPS because it
is single-threaded. You will probably need to run multiple parallel sipp
processes for your load test, if you aren't already.
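
One way to split a target rate across several sipp processes is sketched
below. It only builds the command lines and does not launch anything. The
target address, scenario file name and base port are hypothetical; the
only sipp options used are -sf (scenario file), -r (call rate) and -p
(local port).

```python
import shlex


def build_sipp_commands(target, scenario, total_cps, processes, base_port=5070):
    """Split total_cps as evenly as possible across several sipp processes.

    Returns one argument list per process. `target`, `scenario` and
    `base_port` are placeholders to adapt to the actual test setup.
    """
    if processes < 1:
        raise ValueError("processes must be >= 1")
    base, extra = divmod(total_cps, processes)
    commands = []
    for i in range(processes):
        rate = base + (1 if i < extra else 0)  # spread the remainder
        commands.append([
            "sipp",
            "-sf", scenario,
            "-r", str(rate),
            "-p", str(base_port + i),  # distinct local port per process
            target,
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical target and scenario file.
    for cmd in build_sipp_commands("192.0.2.10:5060", "lrn_dip.xml", 5000, 4):
        print(shlex.join(cmd))
```

Each printed line can be started in its own shell or passed to a process
launcher, which keeps any single sipp instance well below the rate where
it becomes the bottleneck.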
If you are using an OS with systemd journald for logging, that will be a
big bottleneck in and of itself, even with small amounts of logging.
In 1.9, I hacked together a module to create a timestamp string with ms for
logging query latencies for diagnostic purposes. There may be a better out
of the box way to do it now.
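
The idea of millisecond timestamps and latency measurement can be
illustrated outside OpenSIPS as well, for example in a test client. The
sketch below is not the 1.9 module mentioned above and uses no OpenSIPS
API; the function names are made up for illustration.

```python
import time
from datetime import datetime, timezone


def ms_timestamp(now=None):
    """Format a UTC timestamp with millisecond precision."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%d %H:%M:%S.") + f"{now.microsecond // 1000:03d}"


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


if __name__ == "__main__":
    # Stand-in for a database query; replace with a real lookup.
    result, ms = timed_call(sum, range(1000))
    print(f"{ms_timestamp()} query took {ms:.3f} ms")
```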
For children sizing, I would suggest benchmarking with at least 16 children
and then doubling it to compare performance.
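
The doubling approach can be written down as a small loop. In this sketch
measure_cps is a hypothetical callable that runs one load test with the
given number of children and returns the sustained CPS; the 5% gain
threshold and the upper bound of 256 are assumptions.

```python
def find_children_plateau(measure_cps, start=16, max_children=256, min_gain=0.05):
    """Double the child count until throughput stops improving.

    measure_cps: hypothetical callable; given a child count, it runs one
                 load test and returns the sustained CPS achieved.
    min_gain:    assumed threshold; stop when doubling gains less than 5%.
    Returns (best_children, best_cps, results) where results maps each
    tested child count to its measured CPS.
    """
    results = {}
    children = start
    best_children, best_cps = start, measure_cps(start)
    results[start] = best_cps
    while children * 2 <= max_children:
        children *= 2
        cps = measure_cps(children)
        results[children] = cps
        if cps < best_cps * (1 + min_gain):
            break  # no meaningful gain from doubling again
        best_children, best_cps = children, cps
    return best_children, best_cps, results


if __name__ == "__main__":
    # Made-up measurements standing in for real load test runs.
    fake_runs = {16: 2000, 32: 3500, 64: 3600, 128: 3400}
    print(find_children_plateau(fake_runs.get))
```

Stopping at the first plateau keeps the number of load test runs small,
at the cost of missing a later improvement if throughput is not monotonic.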
Watch the logs for out of memory or defragment messages and bump up shared
memory or package memory if necessary. Package memory is probably going to
be your problem, but it doesn't sound like it is a problem yet.
BR,
Jon Abrams
On Wed, Jun 10, 2020 at 3:20 PM Calvin Ellison <calvin.ellison at voxox.com>
wrote:
> We've checked our F5 BigIP configuration and added a second database
> server to the pool. Both DBs have been checked for max connections, open
> files, etc. Memcached has been moved to a dedicated server. Using a SIPp
> scenario for load testing from a separate host, things seem to fall apart
> on OpenSIPS around 3,000 CPS with every CPU core at or near 100% and no
> logs indicating fallback to sync/blocking mode. Both databases barely
> noticed the few hundred connections. Does this seem reasonable for a dual
> CPU server with 8 cores and 16 threads?
>
>
> https://ark.intel.com/content/www/us/en/ark/products/47925/intel-xeon-processor-e5620-12m-cache-2-40-ghz-5-86-gt-s-intel-qpi.html
>
> What is the OpenSIPS opinion on Hyper-Threading?
>
> Is there a way to estimate max CPS based on SPECrate, BogoMIPS, or some
> other metric?
>
> I would love to know if my opensips.cfg has any mistakes, omissions, or
> inefficiencies. Is there a person or group who does sanity checks?
>
> What should I be looking at within OpenSIPS during a load test to identify
> bottlenecks?
>
> I'm still looking for guidance on the things below, especially children
> vs timer_partitions:
>
> Is there an established method for fine-tuning these things?
>> shared memory
>> process memory
>> children
>> db_max_async_connections
>> listen=... use_children
>> modparam("tm", "timer_partitions", ?)
>
>
> What else is worth considering?
>
> Regards,
>
> Calvin Ellison
> Senior Voice Operations Engineer
> calvin.ellison at voxox.com
>
> On Thu, Jun 4, 2020 at 5:18 PM David Villasmil <
> david.villasmil.work at gmail.com> wrote:
> >
> > Maybe you are hitting the max connections? How many connections are
> > there when it starts to show those errors?
> _______________________________________________
> Users mailing list
> Users at lists.opensips.org
> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>