Real support for Prepared Statements #7

lpsmith · 2011-02-10T03:18:23Z

The HDBC API supports "prepared" statements, however the PostgreSQL currently doesn't do anything with this. Instead, the query string will be sent to the database repeatedly, which means that the database will have to repeatedly re-parse the query and re-create a query plan.

jgoerzen · 2011-02-10T20:25:47Z

You are quite right, which is why I have this comment in Statement.hsc:

-- FIXME: we currently do no prepare optimization whatsoever.

I probably don't have time to work on this right away, but I agree it should be fixed.

soenkehahn · 2012-11-19T13:53:45Z

Hi!

I am interested in working on this. I would like to get my work upstream, so I have some questions about the structure of the current code and how to proceed.

Currently, DB.HDBC.Statement.newSth is used to create new Statements. These Statements don't provide statement preparation. Also, newSth is used in DB.HDBC.Connection to implement frunRaw, frun and fgetTables. These are places where statement preparation should not be done. So I thought, the first step towards fixing this ticket would be to untangle the two different cases (that is, using newSth for unprepared and prepared Statements). My fork shows some preliminary work on this: soenkehahn@2d070dc
Currently the fork does not behave differently, but the functions for prepared statements and unprepared statements are split up.

Here are my questions:

Is this generally the right way to do this? Did I misunderstand something?
What do you think about the module renaming/refactoring?
I would think that in the unprepared case we shouldn't use values of type Statement internally. If I understand correctly this type is meant for prepared statements and its very confusing to use it for something else. Any comments on that?
newSth calls addChild (from Database.HDBC.DriverUtils).

a) Do we need to keep using Statements, because this is the way query handles are released?
b) Isn't this a (small) space leak? The ChildList gains a new element every time 'newSth' is executed, doesn't it?
And finally, what does 'newSth' mean? (I can guess most of it, but where does the 'h' come from?)

I hope what I wrote is understandable and I would really appreciate any feedback on this.

Cheers and thanks,
Sönke

zenzike · 2012-12-01T17:42:11Z

Hi Sönke, I'm really glad you're keen to help with this and would welcome your contributions.

I actually suggest that you don't create a new Execution file for this purpose, since I don't think much is gained in terms of untangling executions and prepared statements, and currently the various modules shadow those provided in the main HDBC library: this is a design that's roughly shared across the various other HDBC backends. I think it would be just as good to provide a new function that really does create prepared statements.

soenkehahn · 2012-12-04T13:10:49Z

I am working on this right now, thanks.

soenkehahn · 2012-12-05T14:31:59Z

I have done some patches for this in my fork: https://github.com/soenkehahn/hdbc-postgresql
It is basically working as far as I can tell. There are some caveats and problems though:

postgresql does not support all types of queries to be prepared. For example "SET client_encoding TO utf8;" cannot be prepared. This may break existing code. Especially since quickQuery (and others?) use prepared statements internally.
Sometimes, preparing a statement is less efficient than executing it unprepared (see http://www.postgresql.org/docs/9.1/static/sql-prepare.html , section "Notes"). Right now, users of hdbc-postgresql cannot perform an unprepared query while obtaining the resulting rows, if I understand correctly. This is a limitation of the HDBC-API, though.
Deallocation of prepared statements is not done. There is a new (failing) testcase that demonstrates this. I am not sure how to implement that.
- If we implement that in public_ffinish, we have to do some refactoring as that function is also called in places where we definitely don't want the statement to be deallocated (e.g. fexecute).
Some internal error handling could be better. There are some (one?) FIXMEs in the code for this.
I have not done any benchmarking to verify that this can yield better performance. I hope to do so in the coming days.

I would be happy for any comments on these (or other) issues.

zenzike · 2012-12-05T14:52:00Z

Hm, lots of interesting issues here. Point by point:

I hadn't realised that not all queries could be prepared. Perhaps this is something we should think about changing in the HDBC API, and allow users to specify whether a query should be prepared or not: perhaps having these distinguished at the type level would be useful?
If we allow the programmer to choose which version is being run, then at least this is back in their hands.
Obviously the deallocation problem is an issue that needs to be dealt with. I seem to remember that this was done nicely in the HDBC-ODBC interface.
Yes, the internal error handling needs a bit of a face-lift.
I wrote the beginnings of a benchmarking suite here: https://github.com/zenzike/hdbc-performance perhaps you could build on this?

zenzike · 2012-12-05T17:44:39Z

Having thought about this some more, I think that it's fine for run to always perform unprepared queries, and for quickQuery to always execute prepared queries. You're right that there might be some breakage if people have been using quickQuery for statements that can't be prepared, but I suspect that run or runRaw would (and should!) have been used instead, given that quickQuery has always explicitly been (in theory) a prepared statement. You're right that the interface offers no way of getting results back from a query that isn't prepared, but I don't think this is generally a limitation.

soenkehahn · 2012-12-05T17:55:05Z

I agree. I'm assuming we'll leave it like this then.

I will try to tackle deallocation and internal error handling.

soenkehahn · 2013-01-23T17:25:43Z

Hi!

I've been looking into deallocation of prepared statements today. I thought this should be done by the operation in the 'finish'-field of the statement, but that was apperently a misunderstanding: 'finish' is just for reinitializing a statement after a single execution of the statement.

As far as I can see, there is no way to deallocate a prepared statement using the HDBC-API. Is this correct?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Real support for Prepared Statements #7

Real support for Prepared Statements #7

lpsmith commented Feb 10, 2011

jgoerzen commented Feb 10, 2011

soenkehahn commented Nov 19, 2012

zenzike commented Dec 1, 2012

soenkehahn commented Dec 4, 2012

soenkehahn commented Dec 5, 2012

zenzike commented Dec 5, 2012

zenzike commented Dec 5, 2012

soenkehahn commented Dec 5, 2012

soenkehahn commented Jan 23, 2013

Real support for Prepared Statements #7

Real support for Prepared Statements #7

Comments

lpsmith commented Feb 10, 2011

jgoerzen commented Feb 10, 2011

soenkehahn commented Nov 19, 2012

zenzike commented Dec 1, 2012

soenkehahn commented Dec 4, 2012

soenkehahn commented Dec 5, 2012

zenzike commented Dec 5, 2012

zenzike commented Dec 5, 2012

soenkehahn commented Dec 5, 2012

soenkehahn commented Jan 23, 2013