[OpenSIPS-Users] Fine tuning high CPS and mysql queries

David Villasmil david.villasmil.work at gmail.com
Fri Jun 5 00:17:18 EST 2020


Maybe you are hitting the max connections limit on the database? How many
connections are open when those errors start to appear?
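
One way to get that number from the OpenSIPS side is to pull the pool
counters out of the log lines themselves. A rough Python sketch, assuming
each syslog entry is on a single line (the mail client wrapped them in the
quote below) and assuming the "(current: 1 + N)" figure means one sync
connection plus N async ones for that process, which is my reading of the
message and not something I have confirmed. The function name and the sample
list are only for illustration:

```python
import re
from collections import defaultdict

# Matches the db_mysql message that reports the pool size when opening
# one more async connection failed, e.g. "(current: 1 + 8)".
POOL_RE = re.compile(
    r"opensips\[(?P<pid>\d+)\]:\s+INFO:db_mysql:db_mysql_async_raw_query:\s+"
    r"Failed to open new connection \(current: (?P<sync>\d+) \+ (?P<async_>\d+)\)"
)

def pool_sizes_at_failure(lines):
    """Return {pid: largest sync+async count seen when a new connection failed}."""
    worst = defaultdict(int)
    for line in lines:
        m = POOL_RE.search(line)
        if m:
            pid = int(m.group("pid"))
            total = int(m.group("sync")) + int(m.group("async_"))
            worst[pid] = max(worst[pid], total)
    return dict(worst)

# Sample lines copied from the quoted log, joined back onto one line each.
sample = [
    "Jun  4 10:41:48 TC-521 /usr/sbin/opensips[12318]: "
    "INFO:db_mysql:db_mysql_async_raw_query: Failed to open new connection "
    "(current: 1 + 8). Running in sync mode!",
    "Jun  4 10:44:29 TC-521 /usr/sbin/opensips[12342]: "
    "INFO:db_mysql:db_mysql_async_raw_query: Failed to open new connection "
    "(current: 1 + 10). Running in sync mode!",
]
print(pool_sizes_at_failure(sample))
```

Summing those per-process figures across all workers at the time of the
errors, and comparing with what the database reports as connected threads,
should show whether the limit is being reached.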

On Fri, 5 Jun 2020 at 01:06, Calvin Ellison <calvin.ellison at voxox.com>
wrote:

> > A) Is the LRN database located locally on the OpenSIPS box or is it
> > remote?
>
> We are using an F5 BIG-IP to proxy a pool of database servers.
> OpenSIPS is showing two connection-related errors:
>
> Jun  4 10:41:48 TC-521 /usr/sbin/opensips[12318]:
> ERROR:db_mysql:db_mysql_connect: driver error(2013): Lost connection
> to MySQL server at 'reading authorization packet', system error: 110
> Jun  4 10:41:48 TC-521 /usr/sbin/opensips[12318]:
> ERROR:db_mysql:db_mysql_new_connection: initial connect failed
> Jun  4 10:41:48 TC-521 /usr/sbin/opensips[12318]:
> ERROR:core:db_init_async: failed to open new DB connection on
> mysql://XXXX:XXXX@10.0.5.38:0/
> Jun  4 10:41:48 TC-521 /usr/sbin/opensips[12318]:
> INFO:db_mysql:db_mysql_async_raw_query: Failed to open new connection
> (current: 1 + 8). Running in sync mode!
> Jun  4 10:41:48 TC-521 /usr/sbin/opensips[12318]:
> INFO:db_mysql:switch_state_to_disconnected: disconnect event for
> 0x7f8903f16d10
> Jun  4 10:41:48 TC-521 /usr/sbin/opensips[12318]:
> INFO:db_mysql:reset_all_statements: resetting all statements on
> connection: (0x7f8903f16bb0) 0x7f8903f16d10
> Jun  4 10:41:48 TC-521 /usr/sbin/opensips[12318]:
> INFO:db_mysql:connect_with_retry: re-connected successful for
> 0x7f8903f16d10
>
> Jun  4 10:44:29 TC-521 /usr/sbin/opensips[12342]:
> ERROR:db_mysql:db_mysql_connect: driver error(2003): Can't connect to
> MySQL server on '10.0.5.38' (110)
> Jun  4 10:44:29 TC-521 /usr/sbin/opensips[12342]:
> ERROR:db_mysql:db_mysql_new_connection: initial connect failed
> Jun  4 10:44:29 TC-521 /usr/sbin/opensips[12342]:
> ERROR:core:db_init_async: failed to open new DB connection on
> mysql://XXXX:XXXX@10.0.5.38:0/
> Jun  4 10:44:29 TC-521 /usr/sbin/opensips[12342]:
> INFO:db_mysql:db_mysql_async_raw_query: Failed to open new connection
> (current: 1 + 10). Running in sync mode!
> Jun  4 10:44:29 TC-521 /usr/sbin/opensips[12342]:
> INFO:db_mysql:switch_state_to_disconnected: disconnect event for
> 0x7f8903f16d10
> Jun  4 10:44:29 TC-521 /usr/sbin/opensips[12342]:
> INFO:db_mysql:reset_all_statements: resetting all statements on
> connection: (0x7f8903f16bb0) 0x7f8903f16d10
> Jun  4 10:44:29 TC-521 /usr/sbin/opensips[12342]:
> INFO:db_mysql:connect_with_retry: re-connected successful for
> 0x7f8903f16d10
>
> MariaDB is also showing an error from its perspective:
>
> 2020-06-04 23:40:27 64783 [Warning] Aborted connection 64783 to db:
> 'unconnected' user: 'anonymous' host: '8.38.42.13' (Got timeout
> reading communication packets)
>
> > B) Have you tried only doing sync database queries? Async introduces
> > some overhead, and I'm not sure if it causes extra database connections to
> > be created. When using sync there is a connection per child process that
> > stays up.
>
> Using synchronous mode appeared to be causing context switching issues
> under heavy load. We specifically moved to async for this reason and
> that appeared to reduce the CPU load dramatically. From the docs:
>
> "Using the asynchronous, "suspend-resume" logic instead of forking a
> large number of processes in order to scale also has the advantage of
> optimizing system resource usage, increasing its maximal throughput.
> By requiring less processes to complete the same amount of work in the
> same amount of time, process context switching is minimized and
> overall CPU usage is improved. Less processes will also eat up less
> system memory."
>
> I've been tweaking each of the configuration settings I've mentioned,
> but without any clear path forward. Would 3.x provide any solutions?
>
> Is it possible to have too many children or timer partitions, and
> starve OpenSIPS with context switches? Would that cause connection
> issues?
>
> > C) Does the database have enough memory to contain the LRN and DNC
> > datasets fully in memory? The extra latency for the non-cache hits sent to
> > the database may stack up if the database has to hit disk.
>
> DB says query response time is about 0.001s and doesn't show any sign
> of strain. I'm not personally familiar with the TokuDB engine, but I'm
> led to believe the entire dataset is in memory. I have two DBAs
> triple-checking things. It's possible we're hitting a max connections
> or open files limit that's set too low. Sometimes our peak hours
> include spikes as well.
>
> > D) How many child processes are you using now? If you are hitting 100%
> > you may need to increase them.
>
> Only one hits 100% initially, then they topple over after that. This
> seems to be related to the intermittent database connection errors.
> We'll see what raising the max connections and ulimits on the server
> does. I've also backed off on children and increased the async
> connection pool size to result in the same number of total maximum
> connections. Presumably this will reduce context switches and timer
> delays.
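
The trade-off between children and async pool size can be put in numbers. A
back-of-the-envelope Python sketch, assuming each worker holds one sync
connection plus up to a fixed number of async ones (as the "current: 1 + N"
log text suggests) and ignoring any other process or module that opens its
own connections. Every figure below is made up for illustration, and the
80% headroom factor is just a hypothetical safety margin:

```python
def worst_case_connections(children, async_pool_size):
    """Upper bound on DB connections if every worker fills its async pool."""
    return children * (1 + async_pool_size)

def fits(children, async_pool_size, db_max_connections, headroom=0.8):
    """True if the worst case stays under a fraction of the server limit."""
    limit = headroom * db_max_connections
    return worst_case_connections(children, async_pool_size) <= limit

# Hypothetical before/after: fewer children, bigger pool, same ceiling.
before = worst_case_connections(children=32, async_pool_size=8)
after = worst_case_connections(children=16, async_pool_size=17)
print(before, after)

# Check the new layout against a made-up server limit of 300 connections.
print(fits(16, 17, db_max_connections=300))
```

If the worst case sits at or above the server limit, the new connection
attempts during a spike are the ones that would be expected to fail first.
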
>
> > E) Are your memcached processes using heavy CPU? If you are caching
> > multiple lists, I've found it helps to use a unique memcached instance
> > per list.
>
> All of the various SIP dips are the same db stored procedure with many
> fields in the response. Those fields are cached as a CSV string, so
> any cached dip can be used by any other kind of dip. The same call is
> likely to use multiple dips, so we should only hit the DB once per
> call regardless of how many different dips we apply.
>
> > F) Look for memory related log messages. If the memory starts getting
> > exhausted you will see defrag messages. This will chew up available
> > computation cycles.
>
> Both OpenSIPS servers and the database have plenty of free memory. How
> do I know how much shared and process memory to use? I see warnings
> about the reactor size shrinking to a percentage of the process memory
> but have no idea what that implies.
-- 
Regards,

David Villasmil
email: david.villasmil.work at gmail.com
phone: +34669448337