No subject


Thu Jan 29 11:41:19 CET 2009


status,reason from watchers.. '. However, the driver returns only one
column: '1 columns returned from the query'. Most probably the second
column, reason, is empty (as it usually is), but I expected the driver
to return two columns, the second one holding a null value.
The fix here seems to be to check the number of columns returned, which
avoids the problem with postgres in this place. However, there are other
places where null columns sit in the middle of the queried columns, and I
wonder what happens then.
I will look into the null column cases in presence and try to avoid this
error.
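
A minimal sketch of the column-count check described above. The struct and
function names (sketch_row, sketch_col_or) are made up for illustration and
are not the OpenSIPS result layout; a NULL pointer stands in for SQL NULL.
With libpq directly, the corresponding checks would presumably go through
PQnfields() and PQgetisnull().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one result row; this is not
 * the OpenSIPS db_res_t/db_row_t layout. NULL models SQL NULL. */
struct sketch_row {
    int          ncols;   /* number of columns the driver returned */
    const char **cols;    /* column values as text */
};

/* Return column idx as text, or fallback if the column is missing
 * from the row or holds SQL NULL. */
static const char *sketch_col_or(const struct sketch_row *row, int idx,
                                 const char *fallback)
{
    assert(row != NULL);
    if (idx < 0 || idx >= row->ncols)
        return fallback;           /* fewer columns than requested */
    if (row->cols[idx] == NULL)
        return fallback;           /* column present but NULL */
    return row->cols[idx];
}
```

Reading reason as sketch_col_or(&row, 1, "") then behaves the same whether
the driver returned one column or two, and also covers a null column in the
middle of the queried columns.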

regards,
Anca


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2009-02-12 18:00

Message:

The issue is as follows:


When one thread issues a query and is then context-switched out in favour
of another thread that also issues a query, the second thread executes its
own query but then sees the result of the first one. Simple scheme:

Thread A		Thread B
do_query(A)
			do_query(B)
			fetch_result()
...

The result is that thread B crashes, as the results returned are from the
wrong query. I've seen crashes in pretty much every module that uses the DB
connection.
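
The interleaving can be reproduced deterministically with a toy model. This
is only a sketch under one assumption: that the shared connection hands
results back in the order the queries were sent. The names sketch_conn,
sketch_do_query and sketch_fetch_result are made up for the illustration and
are not the OpenSIPS or libpq API.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_QLEN 8

/* Hypothetical model of one shared DB connection: a small FIFO of
 * query ids whose results have not been read yet. */
struct sketch_conn {
    int      pending[SKETCH_QLEN];
    unsigned head;   /* next result to hand out */
    unsigned tail;   /* next free slot */
};

/* "Send" a query: its result is queued behind any unread ones.
 * Returns 0 on success, -1 if the toy queue is full. */
static int sketch_do_query(struct sketch_conn *c, int query_id)
{
    assert(c != NULL);
    if (c->tail - c->head >= SKETCH_QLEN)
        return -1;
    c->pending[c->tail % SKETCH_QLEN] = query_id;
    c->tail++;
    return 0;
}

/* "Fetch" a result: returns the id of the query it belongs to,
 * or -1 if nothing is pending. */
static int sketch_fetch_result(struct sketch_conn *c)
{
    int id;

    assert(c != NULL);
    if (c->head == c->tail)
        return -1;
    id = c->pending[c->head % SKETCH_QLEN];
    c->head++;
    return id;
}
```

On a zero-initialised connection, sketch_do_query(&c, 1) on behalf of
thread A, followed by sketch_do_query(&c, 2) and sketch_fetch_result(&c) on
behalf of thread B, returns 1: thread B is handed the result of A's query.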

My initial solution was to just add a lock to the db_con_t structure and
track it, but that didn't solve the problem. That in turn led me to find
that new connections are initialized in db/db.c in db_do_init(), where
every new connection is first looked up in the pool. So if you use several
modules with the same database (as I do in my config, where uri_db, usrloc,
auth_db, presence, presence_xml and xcap_client all share the same
database/user/pass), only one entry is created in the db_pool. This in
turn leads to multiple db_con_t structures for one DB connection, which
made the lock pretty much useless.
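
To make the sharing concrete, here is a rough sketch of a pool lookup keyed
by URL. It is not the actual db_do_init() code; sketch_pool_con,
sketch_handle and sketch_init are hypothetical names, and the real pool
compares more than a plain URL string.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_POOL_MAX 8

/* Hypothetical pool entry: one real DB connection, shared by every
 * handle opened with the same URL. */
struct sketch_pool_con {
    const char *url;
    int         refs;   /* number of handles using this connection */
};

/* Hypothetical per-module handle, playing the role of db_con_t. */
struct sketch_handle {
    struct sketch_pool_con *con;
};

static struct sketch_pool_con sketch_pool[SKETCH_POOL_MAX];
static int sketch_pool_len;

/* Open a handle for url. An existing pool entry with the same URL is
 * reused; otherwise a new entry is added. Returns 0 on success, or -1
 * if the toy pool is full. */
static int sketch_init(struct sketch_handle *h, const char *url)
{
    int i;

    assert(h != NULL && url != NULL);
    for (i = 0; i < sketch_pool_len; i++) {
        if (strcmp(sketch_pool[i].url, url) == 0) {
            sketch_pool[i].refs++;
            h->con = &sketch_pool[i];
            return 0;
        }
    }
    if (sketch_pool_len == SKETCH_POOL_MAX)
        return -1;
    sketch_pool[sketch_pool_len].url = url;
    sketch_pool[sketch_pool_len].refs = 1;
    h->con = &sketch_pool[sketch_pool_len++];
    return 0;
}
```

Two handles initialised with the same URL are distinct structures but end up
with the same con pointer, so a lock stored in each handle guards nothing
that the other handle respects.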

As a hack, I changed db_do_init() so that it no longer checks whether a
connection exists in the pool and always opens a new one. This has a pretty
nasty effect on the number of open database connections (my setup opened
about 70), but at least it shouldn't crash this way. My ugly patch is
available at http://tiki.securax.net/mx/db_lock.diff .

A good solution would be internal locking per DB connection, which the
relevant functions could reach through a member of db_func_t. I don't seem
to have a better idea right now...
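
A rough sketch of what per-connection locking could look like, with the lock
stored in the shared connection instead of the per-module handle. All names
are hypothetical, db_func_t is not modelled, and a pthread mutex is used only
to keep the example self-contained; real code would use the project's own
locking primitives.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared connection: the lock lives here, next to the
 * state it protects, not in the per-module handle. */
struct sketch_shared_con {
    pthread_mutex_t lock;
    int             last_query;  /* stands in for the pending result */
};

/* Hypothetical per-module handle pointing at the shared connection. */
struct sketch_db_handle {
    struct sketch_shared_con *con;
};

/* Run a query and read its result as one critical section, so no
 * other user of the same connection can slip in between the two
 * steps. Returns the id of the query the fetched result belongs to. */
static int sketch_query_and_fetch(struct sketch_db_handle *h, int query_id)
{
    int result;

    assert(h != NULL && h->con != NULL);
    pthread_mutex_lock(&h->con->lock);
    h->con->last_query = query_id;   /* stands in for do_query() */
    result = h->con->last_query;     /* stands in for fetch_result() */
    pthread_mutex_unlock(&h->con->lock);
    return result;
}
```

Because every handle that shares the connection takes the same lock, the
query and the fetch can no longer be separated by another handle's query.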

-- 
Regards,
Vasil Kolev
Attractel NV
dCAP #1324, LPIC2

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2009-02-12 15:44

Message:
I have been looking into the problem for some time, and it seems there is
a race condition somewhere in the postgresql driver. I have seen query A
being made, then query B arriving, and then the code that sent B seeing
the results of A and crashing opensips.

I'll try to find the place where the lock should happen, but I'll need to
get a bit more familiar with the types of locking primitives used.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2593088&group_id=232389
