[OpenSIPS-Users] Opensips, avpops and avp_db_query( ), and pgpool2 -- questions, suggestions, issues
Bogdan-Andrei Iancu
bogdan at voice-system.ro
Thu Apr 23 18:17:46 CEST 2009
It is strange, as many people are using the postgres, but is the first
time for such a report ....
Have you manage to get some support from the postgres project about this
(whne not using pgpool) ?
Regards,
Bogdan
Bobby Smith wrote:
> From the pgpool2 forums, and after digesting the log files between postgres, pgpool2, and opensips, I was wondering if this is a possibility? It seems only to happen cyclically every few hours as described below.
>
>
> Also, I'll work on getting you the gdb trace when the child gets stuck on the query again. For now, we have a workable "work around", but in general it's not pretty and involves a quick scripted restart/failover on opensips.
>
>
> Thanks,
>
> Bobby
>
>
> Hi,
>
> From: Adrian Moisey <[EMAIL PROTECTED]>
> Subject: [Pgpool-general] disconnects
> Date: Thu, 03 Jan 2008 10:07:02 +0200
>
> > We have a single postgres server that was under load. To try help with
>
> > the load I stuck pgpool in front of postgres (using connection caching)
> >
> > But every few hours everything seems to go pear shaped. pgpool seems to
> > get "Connection reset by peer" and postgres gets some errors and kills
>
> > all its sessions then starts again.
> >
> > Here are the logs:
> >
> > Jan 3 08:00:29 ips01 pgpool: 2008-01-03 08:00:29 ERROR: pid 8117:
> > pool_read: read failed (Connection reset by peer)
>
> > Jan 3 08:00:29 ips01 pgpool: 2008-01-03 08:00:29 ERROR: pid 8117:
> > ProcessFrontendResponse: failed to read kind from frontend. fronend
> > abnormally exited
>
> It seems that a client program call exit() before doing PQfinish().
>
> Do you use persistent connection (pconnect() in PHP)?
>
> Regards,
> --
> Yoshiyuki Asaba
> [EMAIL PROTECTED
>
>
> On Tue, Apr 21, 2009 at 4:16 AM, Bogdan-Andrei Iancu
> <bogdan at voice-system.ro <mailto:bogdan at voice-system.ro>> wrote:
>
> Hi Bobby,
>
> you mean the current process is stucked in something.. I suspect
> in PQgetResult() trying to read the result. Can you attach with
> gdb to one of this blocked processes and get the backtrace (to see
> where they are blocked) ?
>
> Regards,
> Bogdan
>
> Bobby Smith wrote:
>
> Removing pgpool from the equation and simply doing a direct
> postgres connection, it seems like the issue still occurs. If
> opensips is idle on the avpops db for some time (sounds like
> about 30 minutes), and you try to execute an arbitrary query
> to that connection from the routing script, it just keeps
> passing to the next child without returning data from the
> query until it runs out of children to execute the process,
> and so it just kind of "hangs".
>
> This is with maximum logging enabled. You can see where the
> first process attempts to parse the message. When we get to
> the db lookup, the task immediately goes to the next process
> without actually having the first process stop, and it hangs
> on the "SELECT" statement. I can at this point however
> directly connect to my postgres backend and do the query manually.
>
> This repeats for all the SIP receiving processes, until none
> are left. At this point, the proxy stops responding to any
> messages.
>
> Is there some sort of connection timeout like parameter
> involving avp_ops that I'm missing? It seems like I can
> troubleshoot so far that a) if the proxy is never idle, this
> doesn't seem to happen, and b) if the proxy is idle
> (processing 0 messages), it happens after about 30 minutes and
> requires a restart of opensips to function correctly.
>
>
>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_msg: SIP Request:
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_msg: method: <INVITE>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_msg: uri:
> <sip:+13129576830 at 1.2.3.4:5060;transport=udp>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_msg: version: <SIP/2.0>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: flags=2
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK4dbf.e8b09c62.0>; state=16
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_via: end of header reached, state=5
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: via found, flags=2
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: this is the first via
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:receive_msg: After parse_msg...
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:receive_msg: preparing to run routing scripts...
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Starting
> processing for method INVITE
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: flags=100
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK4dbf.2d03d153.0>; state=16
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_via: end of header reached, state=5
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: via found, flags=100
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: parse_headers: this is the second via
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK506071629460-1207609494153>; state=16
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_via: end of header reached, state=5
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_headers: via found, flags=100
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_to: end of header reached, state=10
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:parse_to: display={},
> ruri={sip:+13129576830 at 4.79.212.229:5060
> <http://sip:+13129576830@4.79.212.229:5060>
> <http://sip:+13129576830@4.79.212.229:5060>}
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:get_hdr_field: <To> [38];
> uri=[sip:+13129576830 at 4.79.212.229:5060
> <http://sip:+13129576830@4.79.212.229:5060>
> <http://sip:+13129576830@4.79.212.229:5060>]
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:get_hdr_field: to body
> [<sip:+13129576830 at 4.79.212.229:5060
> <http://sip:+13129576830@4.79.212.229:5060>
> <http://sip:+13129576830@4.79.212.229:5060>>#015#012]
>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:get_hdr_field: cseq <CSeq>: <1> <INVITE>
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:maxfwd:is_maxfwd_present: value = 67
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Entering
> rewrite-ruri block: ruri-user is +13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: regex
> search turned up true!!
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: request
> user: 13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: this is the
> phone number: 13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Original
> User is +13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Is the avp
> phone_number still set? 13129576830
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:avpops:ops_dbquery_avps: query [SELECT account_id FROM
> customer.phone_numbers WHERE phone_number = '13129576830']
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:db_postgres:db_postgres_submit_query: 0x76e170
> PQsendQuery(SELECT account_id FROM customer.phone_numbers
> WHERE phone_number = '13129576830')
> Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
> DBG:core:db_new_result: allocate 48 bytes for result set at
> 0x776640
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_msg: SIP Request:
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_msg: method: <INVITE>
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_msg: uri:
> <sip:+13129576830 at 1.2.3.4:5060;transport=udp>
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_msg: version: <SIP/2.0>
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: flags=2
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK4dbf.e8b09c62.0>; state=16
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_via: end of header reached, state=5
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: via found, flags=2
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: this is the first via
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:receive_msg: After parse_msg...
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:receive_msg: preparing to run routing scripts...
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: Starting
> processing for method INVITE
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: flags=100
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK4dbf.2d03d153.0>; state=16
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_via: end of header reached, state=5
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: via found, flags=100
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: parse_headers: this is the second via
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_via_param: found param type 232, <branch> =
> <z9hG4bK506071629460-1207609494153>; state=16
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_via: end of header reached, state=5
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_headers: via found, flags=100
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_to: end of header reached, state=10
> Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
> DBG:core:parse_to: display={},
> ruri={sip:+13129576830 at 4.79.212.229:5060
> <http://sip:+13129576830@4.79.212.229:5060>
> <http://sip:+13129576830@4.79.212.229:5060>}
>
>
>
>
>
> On Fri, Apr 17, 2009 at 5:32 PM, Bobby Smith
> <bobby.smith at gmail.com <mailto:bobby.smith at gmail.com>
> <mailto:bobby.smith at gmail.com <mailto:bobby.smith at gmail.com>>>
> wrote:
>
> An example of what happens -- on processing a call right
> before a
> database lookup (the select query below in the previous
> message):
>
> 00:00:00 pgpool: opensips my_database localhost(46362) SELECT
> postgres 445 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46383) SELECT
> postgres 446 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 447 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 448 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 449 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 450 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 451 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46366) SELECT
> postgres 452 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46371) SELECT
> postgres 453 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 454 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46367) SELECT
> postgres 455 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46372) SELECT
> postgres 456 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46393) idle
> postgres 457 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46391) SELECT
> postgres 458 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46377) SELECT
> postgres 459 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46378) SELECT
> postgres 460 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46374) SELECT
> postgres 461 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 462 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46387) SELECT
> postgres 463 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 464 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46380) SELECT
> postgres 465 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 466 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46385) SELECT
> postgres 468 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46389) SELECT
> postgres 469 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46361) SELECT
> postgres 471 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 473 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 474 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 475 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 476 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 477 438 0 09:00 ? 00:00:00 pgpool:
> wait for
> connection request
> postgres 478 438 0 09:00 ? 00:00:00 pgpool:
> opensips
> my_database localhost(46365) SELECT
>
> I have 32 "available" connections, 16 of which are occupied by
> opensips children processes. On this one query, it looks like
> "every" opensips process does the same processing to *all* the
> pgpool connections. This is just for a single INVITE. I don't
> know if the database lookup is failing or not, but it looks
> like
> the next child processes the message until none are left?
>
> It will kind of just float here in this state for some time
> now --
> I've got a heartbeat script that can come back and clean
> this up
> (transparent to the endusers as we're in a failover setup),
> but I
> would like to know how to go about troubleshooting this
> particular
> type of issue.
>
> I know it's not the database dying or pgpool losing it's
> connections, as I have logging enabled on both.
>
> Any suggestions on troubleshooting, or has anyone tried
> this sort
> of configuration before?
>
> Thanks much in advance.
>
>
> On Thu, Apr 16, 2009 at 4:39 PM, Bobby Smith
> <bobby.smith at gmail.com <mailto:bobby.smith at gmail.com>
> <mailto:bobby.smith at gmail.com <mailto:bobby.smith at gmail.com>>>
> wrote:
>
> All,
>
> We're currently (successfully) testing a configuration of
> Opensips as a stateless proxy. I wanted to be able to
> execute
> an arbitrary database query to a database that's NOT the
> opensips database, when I realized the power of the
> avp_db_query( ) function in avpops. The database is a
> backend
> postgres database in an basic failover configuration
> (16 max
> connections cached, primary db, failover to secondary
> db for
> those connections).
>
> Currently, opensips has 16 children processes. In the
> configuration, we have the following:
>
> modparam("avpops", "db_url",
> "postgres://opensips:opensips@localhost:9999/my_database")
> modparam("avpops", "avp_table", "")
>
> A few of the issues I'm concerned about or experiencing
> with
> the operation of it.
>
> a) Why is that second parameter necessary (avp_table)? It
> feels like if I have to set this that it's always going to
> keep open connections with the database, and indeed, it
> is. When I ps -ef, it shows I have 16 pgpool
> connections that are
> idle to my_database. The problem that this is causing is
> that, if for some reason one of the connections has an
> issue
> and dies or times out, this happens in the logs:
>
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:db_postgres:db_postgres_store_result: 0x76e008 -
> invalid
> query, execution aborted
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:db_postgres:db_postgres_store_result: 0x76e008:
> PGRES_FATAL_ERROR
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:db_postgres:db_postgres_store_result: 0x76e008:
> server
> closed the connection unexpectedly#012#011This probably
> means
> the server terminated abnormally#012#011before or while
> processing the request.#012
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:core:db_do_raw_query: error while storing result
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
> ERROR:avpops:db_query_avp: cannot do the query
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]: Database
> Error! No Lookup!
> Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]: Query
> Executed: Account Number is <null>
>
> The query I'm executing (from the config):
> avp_db_query("SELECT value FROM schema.table WHERE value =
> '$avp(s:string_a)'", "$avp(s:string_b)");
>
> When these error messages pop up, pgpool still has active
> database connections to postgres, but not held by
> opensips. I
> can use one of the pooled connections to connect to the
> backend database and execute a query, but for some reason
> opensips cannot.
>
> And, having a restart work (scripted), I occasionally
> see the
> following:
>
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:db_postgres:db_postgres_new_connection: server closed
> the connection unexpectedly#012#011This probably means the
> server terminated abnormally#012#011before or while
> processing
> the request.#012
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:db_postgres:db_postgres_new_connection: cleaning up
> 0x76e090=pkg_free()
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:core:db_do_init: could not add connection to the pool
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:avpops:avpops_db_init: cannot initialize database
> connection
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:core:init_mod_child: failed to initializing module
> avpops, rank -1
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
> ERROR:core:start_timer_processes: init_child failed for
> timer proc
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23083]:
> INFO:core:handle_sigs: child process 23105 exited normally,
> status=255
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23083]:
> INFO:core:handle_sigs: terminating due to SIGCHLD
> Apr 16 03:59:05 serinbound2 /sbin/opensips[23107]:
> INFO:core:sig_usr: signal 15 received
>
> At this point, the application crashes.
>
> Any suggestions or workarounds for this? More
> specifically,
> I'd like opensips to not have to grab the DB connection
> if it
> doesn't need it at that time performing a lookup (as
> i'm not
> really using avp's in the lookup, just to save the
> results),
> also, I'd like to see it not crash completely if it
> loses that
> connection. Instead, just send a error message back
> and allow
> me to insert a statement like:
>
> if(! what i expect the value of the
> saved avp
> from teh database query to be)
> {
> sl_send_reply("500", "Internal
> Server
> Error");
> exit;
> }
>
> Thanks for your help, I know this was fairly detailed but
> without much in terms of logging.
>
>
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Users mailing list
> Users at lists.opensips.org <mailto:Users at lists.opensips.org>
> http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>
>
>
>
More information about the Users
mailing list