[OpenSIPS-Users] Opensips, avpops and avp_db_query( ), and pgpool2 -- questions, suggestions, issues

Bogdan-Andrei Iancu bogdan at voice-system.ro
Thu Apr 23 18:17:46 CEST 2009


It is strange, as many people are using the postgres, but is the first 
time for such a report ....

Have you manage to get some support from the postgres project about this 
(whne not using pgpool) ?

Regards,
Bogdan

Bobby Smith wrote:
> From the pgpool2 forums, and after digesting the log files between postgres, pgpool2, and opensips, I was wondering if this is a possibility?  It seems only to happen cyclically every few hours as described below.
>
>
> Also, I'll work on getting you the gdb trace when the child gets stuck on the query again.  For now, we have a workable "work around", but in general it's not pretty and involves a quick scripted restart/failover on opensips. 
>
>
> Thanks,
>
> Bobby
>
>
> Hi,
>
> From: Adrian Moisey <[EMAIL PROTECTED]>
> Subject: [Pgpool-general] disconnects
> Date: Thu, 03 Jan 2008 10:07:02 +0200
>
> > We have a single postgres server that was under load.  To try help with 
>
> > the load I stuck pgpool in front of postgres (using connection caching)
> > 
> > But every few hours everything seems to go pear shaped.  pgpool seems to 
> > get "Connection reset by peer" and postgres gets some errors and kills 
>
> > all its sessions then starts again.
> > 
> > Here are the logs:
> > 
> > Jan  3 08:00:29 ips01 pgpool: 2008-01-03 08:00:29 ERROR: pid 8117: 
> > pool_read: read failed (Connection reset by peer)
>
> > Jan  3 08:00:29 ips01 pgpool: 2008-01-03 08:00:29 ERROR: pid 8117: 
> > ProcessFrontendResponse: failed to read kind from frontend. fronend 
> > abnormally exited
>
> It seems that a client program call exit() before doing PQfinish().
>
> Do you use persistent connection (pconnect() in PHP)?
>
> Regards,
> --
> Yoshiyuki Asaba
> [EMAIL PROTECTED
>
>
> On Tue, Apr 21, 2009 at 4:16 AM, Bogdan-Andrei Iancu 
> <bogdan at voice-system.ro <mailto:bogdan at voice-system.ro>> wrote:
>
>     Hi Bobby,
>
>     you mean the current process is stucked in something.. I suspect
>     in  PQgetResult() trying to read the result. Can you attach with
>     gdb to one of this blocked processes and get the backtrace (to see
>     where they are blocked) ?
>
>     Regards,
>     Bogdan
>
>     Bobby Smith wrote:
>
>         Removing pgpool from the equation and simply doing a direct
>         postgres connection, it seems like the issue still occurs.  If
>         opensips is idle on the avpops db for some time (sounds like
>         about 30 minutes), and you try to execute an arbitrary query
>         to that connection from the routing script, it just keeps
>         passing to the next child without returning data from the
>         query until it runs out of children to execute the process,
>         and so it just kind of "hangs".
>
>         This is with maximum logging enabled.  You can see where the
>         first process attempts to parse the message.  When we get to
>         the db lookup, the task immediately goes to the next process
>         without actually having the first process stop, and it hangs
>         on the "SELECT" statement.  I can at this point however
>         directly connect to my postgres backend and do the query manually.
>
>         This repeats for all the SIP receiving processes, until none
>         are left.  At this point, the proxy stops responding to any
>         messages.
>
>         Is there some sort of connection timeout like parameter
>         involving avp_ops that I'm missing?  It seems like I can
>         troubleshoot so far that a) if the proxy is never idle, this
>         doesn't seem to happen, and b) if the proxy is idle
>         (processing 0 messages), it happens after about 30 minutes and
>         requires a restart of opensips to function correctly.
>
>
>
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_msg: SIP Request:
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_msg:  method:  <INVITE>
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_msg:  uri:    
>         <sip:+13129576830 at 1.2.3.4:5060;transport=udp>
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_msg:  version: <SIP/2.0>
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_headers: flags=2
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_via_param: found param type 232, <branch> =
>         <z9hG4bK4dbf.e8b09c62.0>; state=16
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_via: end of header reached, state=5
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_headers: via found, flags=2
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_headers: this is the first via
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:receive_msg: After parse_msg...
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:receive_msg: preparing to run routing scripts...
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Starting
>         processing for method INVITE
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_headers: flags=100
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_via_param: found param type 232, <branch> =
>         <z9hG4bK4dbf.2d03d153.0>; state=16
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_via: end of header reached, state=5
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_headers: via found, flags=100
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_headers: parse_headers: this is the second via
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_via_param: found param type 232, <branch> =
>         <z9hG4bK506071629460-1207609494153>; state=16
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_via: end of header reached, state=5
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_headers: via found, flags=100
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_to: end of header reached, state=10
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:parse_to: display={},
>         ruri={sip:+13129576830 at 4.79.212.229:5060
>         <http://sip:+13129576830@4.79.212.229:5060>
>         <http://sip:+13129576830@4.79.212.229:5060>}
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:get_hdr_field: <To> [38];
>         uri=[sip:+13129576830 at 4.79.212.229:5060
>         <http://sip:+13129576830@4.79.212.229:5060>
>         <http://sip:+13129576830@4.79.212.229:5060>]
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:get_hdr_field: to body
>         [<sip:+13129576830 at 4.79.212.229:5060
>         <http://sip:+13129576830@4.79.212.229:5060>
>         <http://sip:+13129576830@4.79.212.229:5060>>#015#012]
>
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:get_hdr_field: cseq <CSeq>: <1> <INVITE>
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:maxfwd:is_maxfwd_present: value = 67
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Entering
>         rewrite-ruri block: ruri-user is +13129576830
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: regex
>         search turned up true!!
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: request
>         user: 13129576830
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: this is the
>         phone number: 13129576830
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Original
>         User is +13129576830
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]: Is the avp
>         phone_number still set? 13129576830
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:avpops:ops_dbquery_avps: query [SELECT account_id FROM
>         customer.phone_numbers WHERE phone_number = '13129576830']
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:db_postgres:db_postgres_submit_query: 0x76e170
>         PQsendQuery(SELECT account_id FROM customer.phone_numbers
>         WHERE phone_number = '13129576830')
>         Apr 20 02:54:30 opensipsibp /sbin/opensips[22832]:
>         DBG:core:db_new_result: allocate 48 bytes for result set at
>         0x776640
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_msg: SIP Request:
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_msg:  method:  <INVITE>
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_msg:  uri:    
>         <sip:+13129576830 at 1.2.3.4:5060;transport=udp>
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_msg:  version: <SIP/2.0>
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_headers: flags=2
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_via_param: found param type 232, <branch> =
>         <z9hG4bK4dbf.e8b09c62.0>; state=16
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_via: end of header reached, state=5
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_headers: via found, flags=2
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_headers: this is the first via
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:receive_msg: After parse_msg...
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:receive_msg: preparing to run routing scripts...
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]: Starting
>         processing for method INVITE
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_headers: flags=100
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_via_param: found param type 232, <branch> =
>         <z9hG4bK4dbf.2d03d153.0>; state=16
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_via: end of header reached, state=5
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_headers: via found, flags=100
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_headers: parse_headers: this is the second via
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_via_param: found param type 232, <branch> =
>         <z9hG4bK506071629460-1207609494153>; state=16
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_via: end of header reached, state=5
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_headers: via found, flags=100
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_to: end of header reached, state=10
>         Apr 20 02:54:31 opensipsibp /sbin/opensips[22833]:
>         DBG:core:parse_to: display={},
>         ruri={sip:+13129576830 at 4.79.212.229:5060
>         <http://sip:+13129576830@4.79.212.229:5060>
>         <http://sip:+13129576830@4.79.212.229:5060>}
>
>
>
>
>
>         On Fri, Apr 17, 2009 at 5:32 PM, Bobby Smith
>         <bobby.smith at gmail.com <mailto:bobby.smith at gmail.com>
>         <mailto:bobby.smith at gmail.com <mailto:bobby.smith at gmail.com>>>
>         wrote:
>
>            An example of what happens -- on processing a call right
>         before a
>            database lookup (the select query below in the previous
>         message):
>
>            00:00:00 pgpool: opensips my_database localhost(46362) SELECT
>            postgres   445   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46383) SELECT
>            postgres   446   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   447   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   448   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   449   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   450   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   451   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46366) SELECT
>            postgres   452   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46371) SELECT
>            postgres   453   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   454   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46367) SELECT
>            postgres   455   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46372) SELECT
>            postgres   456   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46393) idle
>            postgres   457   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46391) SELECT
>            postgres   458   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46377) SELECT
>            postgres   459   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46378) SELECT
>            postgres   460   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46374) SELECT
>            postgres   461   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   462   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46387) SELECT
>            postgres   463   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   464   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46380) SELECT
>            postgres   465   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   466   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46385) SELECT
>            postgres   468   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46389) SELECT
>            postgres   469   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46361) SELECT
>            postgres   471   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   473   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   474   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   475   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   476   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   477   438  0 09:00 ?        00:00:00 pgpool:
>         wait for
>            connection request
>            postgres   478   438  0 09:00 ?        00:00:00 pgpool:
>         opensips
>            my_database localhost(46365) SELECT
>
>            I have 32 "available" connections, 16 of which are occupied by
>            opensips children processes.  On this one query, it looks like
>            "every" opensips process does the same processing to *all* the
>            pgpool connections.  This is just for a single INVITE.  I don't
>            know if the database lookup is failing or not, but it looks
>         like
>            the next child processes the message until none are left?
>
>            It will kind of just float here in this state for some time
>         now --
>            I've got a heartbeat script that can come back and clean
>         this up
>            (transparent to the endusers as we're in a failover setup),
>         but I
>            would like to know how to go about troubleshooting this
>         particular
>            type of issue.
>
>            I know it's not the database dying or pgpool losing it's
>            connections, as I have logging enabled on both.
>
>            Any suggestions on troubleshooting, or has anyone tried
>         this sort
>            of configuration before?
>
>            Thanks much in advance.
>
>
>            On Thu, Apr 16, 2009 at 4:39 PM, Bobby Smith
>            <bobby.smith at gmail.com <mailto:bobby.smith at gmail.com>
>         <mailto:bobby.smith at gmail.com <mailto:bobby.smith at gmail.com>>>
>         wrote:
>
>                All,
>
>                We're currently (successfully) testing a configuration of
>                Opensips as a stateless proxy.  I wanted to be able to
>         execute
>                an arbitrary database query to a database that's NOT the
>                opensips database, when I realized the power of the
>                avp_db_query( ) function in avpops.  The database is a
>         backend
>                postgres database in an basic failover configuration
>         (16 max
>                connections cached, primary db, failover to secondary
>         db for
>                those connections).
>
>                Currently, opensips has 16 children processes.  In the
>                configuration, we have the following:
>
>                modparam("avpops", "db_url",
>                "postgres://opensips:opensips@localhost:9999/my_database")
>                modparam("avpops", "avp_table", "")
>
>                A few of the issues I'm concerned about or experiencing
>         with
>                the operation of it.
>
>                a) Why is that second parameter necessary (avp_table)?  It
>                feels like if I have to set this that it's always going to
>                keep open connections with the database, and indeed, it
>         is.        When I ps -ef, it shows I have 16 pgpool
>         connections that are
>                idle to my_database.  The problem that this is causing is
>                that, if for some reason one of the connections has an
>         issue
>                and dies or times out, this happens in the logs:
>
>                Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>                ERROR:db_postgres:db_postgres_store_result: 0x76e008 -
>         invalid
>                query, execution aborted
>                Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>                ERROR:db_postgres:db_postgres_store_result: 0x76e008:
>                PGRES_FATAL_ERROR
>                Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>                ERROR:db_postgres:db_postgres_store_result: 0x76e008:
>         server
>                closed the connection unexpectedly#012#011This probably
>         means
>                the server terminated abnormally#012#011before or while
>                processing the request.#012
>                Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>                ERROR:core:db_do_raw_query: error while storing result
>                Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]:
>                ERROR:avpops:db_query_avp: cannot do the query
>                Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]: Database
>                Error!  No Lookup!
>                Apr 16 05:46:02 serinbound1 /sbin/opensips[30773]: Query
>                Executed:  Account Number is <null>
>
>                The query I'm executing (from the config):      
>          avp_db_query("SELECT value FROM schema.table WHERE value =
>                '$avp(s:string_a)'", "$avp(s:string_b)");
>
>                When these error messages pop up, pgpool still has active
>                database connections to postgres, but not held by
>         opensips.  I
>                can use one of the pooled connections to connect to the
>                backend database and execute a query, but for some reason
>                opensips cannot.
>
>                And, having a restart work (scripted), I occasionally
>         see the
>                following:
>
>                Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>                ERROR:db_postgres:db_postgres_new_connection: server closed
>                the connection unexpectedly#012#011This probably means the
>                server terminated abnormally#012#011before or while
>         processing
>                the request.#012
>                Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>                ERROR:db_postgres:db_postgres_new_connection: cleaning up
>                0x76e090=pkg_free()
>                Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>                ERROR:core:db_do_init: could not add connection to the pool
>                Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>                ERROR:avpops:avpops_db_init: cannot initialize database
>         connection
>                Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>                ERROR:core:init_mod_child: failed to initializing module
>                avpops, rank -1
>                Apr 16 03:59:05 serinbound2 /sbin/opensips[23105]:
>                ERROR:core:start_timer_processes: init_child failed for
>         timer proc
>                Apr 16 03:59:05 serinbound2 /sbin/opensips[23083]:
>                INFO:core:handle_sigs: child process 23105 exited normally,
>                status=255
>                Apr 16 03:59:05 serinbound2 /sbin/opensips[23083]:
>                INFO:core:handle_sigs: terminating due to SIGCHLD
>                Apr 16 03:59:05 serinbound2 /sbin/opensips[23107]:
>                INFO:core:sig_usr: signal 15 received
>
>                At this point, the application crashes.
>
>                Any suggestions or workarounds for this?  More
>         specifically,
>                I'd like opensips to not have to grab the DB connection
>         if it
>                doesn't need it at that time performing a lookup (as
>         i'm not
>                really using avp's in the lookup, just to save the
>         results),
>                also, I'd like to see it not crash completely if it
>         loses that
>                connection.  Instead, just send a error message back
>         and allow
>                me to insert a statement like:
>
>                               if(! what i expect the value of the
>         saved avp
>                from teh database query to be)
>                                {
>                                        sl_send_reply("500", "Internal
>         Server
>                Error");
>                                        exit;
>                                }
>
>                Thanks for your help, I know this was fairly detailed but
>                without much in terms of logging.
>
>
>
>
>         ------------------------------------------------------------------------
>
>         _______________________________________________
>         Users mailing list
>         Users at lists.opensips.org <mailto:Users at lists.opensips.org>
>         http://lists.opensips.org/cgi-bin/mailman/listinfo/users
>          
>
>
>




More information about the Users mailing list