[OpenSIPS-Users] Fine tuning high CPS and mysql queries
venefax at gmail.com
Wed Jun 10 22:59:06 EST 2020
I do 3000+ CPS with OpenSIPS and MySQL over unixODBC with no problems. My
query is for routing only: I read a 280-million-row table (RocksDB) for
every call, so the load is comparable.
It seems to work flawlessly. However, I only use $60,000 Dell servers,
R920s, with 64 physical cores and 120 threads, plus 1.5 TB of RAM. My
OpenSIPS runs under VMware ESX 6.x.
Honestly, I paid an OpenSIPS guru to assemble the application for me.
I designed the logic based on my previous Asterisk switch, which could not
handle the pressure. Basically, all routing is done in MariaDB. OpenSIPS
just asks a question using a stored procedure.
The brain is the database, and OpenSIPS executes instructions. It just
On Wed, Jun 10, 2020 at 5:15 PM Jon Abrams <ffshoh at gmail.com> wrote:
> I built a similarly functioning platform back in 2015 based on similar
> hardware (Westmere Xeons, hyperthreading enabled) running bare metal on
> CentOS 6. At some point we bumped it up to dual X5670s (cheap upgrade
> nowadays), but it was handling 12000 CPS peaks on 1 server with 3000-5000
> CPS sustained for large parts of the day. I don't think you are too far off
> in hardware.
> This was on version 1.9, so there was no Async. IIRC it was either 32 or
> 64 children. Async requires the TM module which adds additional overhead
> and memory allocation.
> The LRN database was stored in mysql with a very simple table (TN, LRN) to
> keep memory usage down so that it could be pinned in memory (server had 48
> or 72GB I think). MySQL was set to dump the innodb buffer cache to disk on
> restart so that the whole database would be back in memory on restart.
> Doing a full table scan would initially populate the MySQL cache.
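As a rough check on whether a two-column (TN, LRN) table can be pinned in memory, the arithmetic can be sketched as below. The row count, the 8-byte integer encoding of each number, and the overhead factor are assumptions for illustration, not figures from this thread. Measure the real table and index size before settling on innodb_buffer_pool_size.

```python
def estimate_table_gib(rows, bytes_per_row=16, overhead_factor=3.0):
    """Rough in-memory size of a (TN, LRN) table in GiB.

    bytes_per_row: two 8-byte integers (assumes TN and LRN are stored as BIGINT).
    overhead_factor: assumed multiplier for row headers, index pages and
    partially filled pages. It is a guess, not a measured value.
    """
    return rows * bytes_per_row * overhead_factor / 2**30


def fits_in_buffer_pool(rows, pool_gib, **kwargs):
    """True if the estimated table size is within the buffer pool size."""
    return estimate_table_gib(rows, **kwargs) <= pool_gib


if __name__ == "__main__":
    rows = 500_000_000  # hypothetical row count
    size = estimate_table_gib(rows)
    print(f"estimated size: {size:.1f} GiB")
    print("fits in a 48 GiB pool:", fits_in_buffer_pool(rows, 48))
```

If the estimate lands close to the pool size, treat it as "does not fit": the overhead factor is the weakest input here.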
> Blacklists and other smaller datasets were stored in OpenSIPS using the
> userblacklist module. There are better ways to do that in version 2 and
> onwards. Bigger lists were stored in memcached. I prefer redis for this
> purpose now.
> I would suggest simplifying testing by using a single MySQL server and
> bypassing the F5 to eliminate that as a source of connection problems or
> additional latencies.
> In the OpenSIPS script, eliminate everything but 1 dip, probably just dip
> the LRN to start.
> Performance test the stripped down scenario with sipp. Based on past
> experience, you should be able to hit or come close to your performance
> goal with only 1 dip in play.
> If you do hit your performance targets, keep adding more dips one by one
> until it breaks.
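The "add dips one by one until it breaks" procedure can be sketched as a small loop. This is only an outline of the test plan: `meets_target` is a hypothetical callback that would run the load test (for example a sipp run against a config with the given dips enabled) and report whether the CPS target was met, and the dip names other than LRN are placeholders.

```python
def find_breaking_dip(dips, meets_target):
    """Enable dips cumulatively, in order.

    Returns (last_good_set, breaking_dip). breaking_dip is None when every
    dip could be enabled without missing the performance target.
    """
    enabled = []
    for dip in dips:
        candidate = enabled + [dip]
        if not meets_target(candidate):
            return enabled, dip
        enabled = candidate
    return enabled, None


if __name__ == "__main__":
    # Stand-in for a real load test: pretend the target is missed
    # once more than two dips are active.
    def fake_load_test(active_dips):
        return len(active_dips) <= 2

    good, culprit = find_breaking_dip(["lrn", "blacklist", "routing"], fake_load_test)
    print("last good set:", good)
    print("first dip that broke the target:", culprit)
```

Starting from a single dip keeps each failure attributable to the one dip that was just added.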
> If you can't reach your performance target with this stripped down
> scenario, then I'd suggest testing without the async and transactions
> enabled. I wouldn't think transactions would be a necessity in this
> scenario. I ran into CPS problems on that other open source SIP server when
> using async under heavy load. The transaction creation was chewing up CPU
> and memory. I'm not sure how different the implementation is here.
> I seem to start having problems with sipp when I hit a few thousand CPS
> due to it being single threaded. You probably will need to run multiple
> parallel sipp processes for your load test, if not already.
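One way to spread a target rate over several sipp processes is sketched below. It only builds the command lines and does not start anything. The target address, the scenario file name and the starting local port are placeholders; `-sf`, `-r` and `-p` are the sipp options for scenario file, call rate and local port.

```python
def split_rate(total_cps, processes):
    """Split total_cps into integer per-process rates that sum to total_cps."""
    base, extra = divmod(total_cps, processes)
    return [base + (1 if i < extra else 0) for i in range(processes)]


def sipp_commands(target, scenario, total_cps, processes, first_port=5061):
    """Build one sipp argument list per process, each on its own local port."""
    commands = []
    for i, rate in enumerate(split_rate(total_cps, processes)):
        commands.append([
            "sipp", target,
            "-sf", scenario,             # scenario file (placeholder name)
            "-r", str(rate),             # calls per second for this process
            "-p", str(first_port + i),   # distinct local port per process
        ])
    return commands


if __name__ == "__main__":
    for cmd in sipp_commands("192.0.2.10:5060", "uac_lrn.xml", 3000, 4):
        print(" ".join(cmd))
```

Each list can be handed to `subprocess.Popen` as-is. Pinning each process to its own core is a further option, but it is not shown here.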
> If using an OS with systemd journald for logging, that will be a big
> bottleneck in and of itself with even small amounts of logging.
> In 1.9, I hacked together a module to create a timestamp string with ms
> for logging query latencies for diagnostic purposes. There may be a better
> out of the box way to do it now.
> For children sizing, I would suggest benchmarking with at least 16
> children and then doubling it to compare performance.
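For a starting point before benchmarking, Little's law gives a lower bound on how many children are busy at once when each one blocks on a database query: concurrency is roughly arrival rate times time spent per call. The sketch below assumes synchronous (non-async) queries and ignores CPU time. The 5 ms latency and the headroom factor are illustrative assumptions, not measurements from this thread.

```python
import math


def children_needed(cps, blocking_ms, headroom=2.0):
    """Estimate worker count for blocking queries via Little's law.

    cps: calls per second arriving at the server.
    blocking_ms: assumed time each child spends blocked per call, in ms.
    headroom: assumed safety multiplier for bursts and latency spikes.
    """
    busy = cps * blocking_ms / 1000.0   # average children blocked at once
    return math.ceil(busy * headroom)


if __name__ == "__main__":
    for latency_ms in (1, 5, 20):
        print(latency_ms, "ms per dip ->", children_needed(3000, latency_ms), "children")
```

The result is only a floor for the benchmark sweep. Doubling from there and comparing measured CPS remains the real test.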
> Watch the logs for out of memory or defragment messages and bump up shared
> memory or package memory if necessary. Package memory is probably going to
> be your problem, but it doesn't sound like it is a problem yet.
> Jon Abrams
> On Wed, Jun 10, 2020 at 3:20 PM Calvin Ellison <calvin.ellison at voxox.com>
> wrote:
>> We've checked our F5 BigIP configuration and added a second database
>> server to the pool. Both DBs have been checked for max connections, open
>> files, etc. Memcached has been moved to a dedicated server. Using a SIPp
>> scenario for load testing from a separate host, things seem to fall apart
>> on OpenSIPS around 3,000 CPS with every CPU core at or near 100% and no
>> logs indicating fallback to sync/blocking mode. Both databases barely
>> noticed the few hundred connections. Does this seem reasonable for a dual
>> CPU server with 8 cores and 16 threads?
>> What is the OpenSIPS opinion on Hyper-Threading?
>> Is there a way to estimate max CPS based on SPECrate, BogoMIPS, or some
>> other metric?
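A back-of-envelope way to read the numbers above: if 16 hardware threads are saturated at about 3,000 CPS, each call is consuming roughly 16/3000 seconds of thread time. The sketch below does that arithmetic and projects a ceiling for a different thread count. It assumes linear scaling and treats a hyperthread as a full core, which overstates real capacity. It is not an OpenSIPS-provided method.

```python
def cpu_ms_per_call(hw_threads, cps, utilization=1.0):
    """Thread-time consumed per call, in milliseconds, at the observed load."""
    return hw_threads * utilization * 1000.0 / cps


def projected_max_cps(hw_threads, ms_per_call):
    """CPS ceiling if every hardware thread is fully busy (linear scaling assumed)."""
    return hw_threads * 1000.0 / ms_per_call


if __name__ == "__main__":
    cost = cpu_ms_per_call(16, 3000)
    print(f"about {cost:.2f} ms of thread time per call")
    print(f"projected ceiling on 32 threads: {projected_max_cps(32, cost):.0f} CPS")
```

A per-call cost of several milliseconds is a hint to look at what the script does per request rather than at the hardware alone.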
>> I would love to know if my opensips.cfg has any mistakes, omissions, or
>> inefficiencies. Is there a person or group who does sanity checks?
>> What should I be looking at within OpenSIPS during a load test to
>> identify bottlenecks?
>> I'm still looking for guidance on the things below, especially children
>> vs timer_partitions:
>> Is there an established method for fine-tuning these things?
>>> shared memory
>>> process memory
>>> listen=... use_children
>>> modparam("tm", "timer_partitions", ?)
>> What else is worth considering?
>> Calvin Ellison
>> Senior Voice Operations Engineer
>> calvin.ellison at voxox.com
>> On Thu, Jun 4, 2020 at 5:18 PM David Villasmil <
>> david.villasmil.work at gmail.com> wrote:
>> > Maybe you are hitting the max connections? How many connections are
>> there when it starts to show those errors?
>> Users mailing list
>> Users at lists.opensips.org