Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Real support for Prepared Statements #7

Open
lpsmith opened this issue Feb 10, 2011 · 9 comments
Open

Real support for Prepared Statements #7

lpsmith opened this issue Feb 10, 2011 · 9 comments

Comments

@lpsmith
Copy link
Contributor

lpsmith commented Feb 10, 2011

The HDBC API supports "prepared" statements, however the PostgreSQL currently doesn't do anything with this. Instead, the query string will be sent to the database repeatedly, which means that the database will have to repeatedly re-parse the query and re-create a query plan.

@jgoerzen
Copy link
Member

You are quite right, which is why I have this comment in Statement.hsc:

-- FIXME: we currently do no prepare optimization whatsoever.

I probably don't have time to work on this right away, but I agree it should be fixed.

@soenkehahn
Copy link
Contributor

Hi!

I am interested in working on this. I would like to get my work upstream, so I have some questions about the structure of the current code and how to proceed.

Currently, DB.HDBC.Statement.newSth is used to create new Statements. These Statements don't provide statement preparation. Also, newSth is used in DB.HDBC.Connection to implement frunRaw, frun and fgetTables. These are places where statement preparation should not be done. So I thought, the first step towards fixing this ticket would be to untangle the two different cases (that is, using newSth for unprepared and prepared Statements). My fork shows some preliminary work on this: soenkehahn@2d070dc
Currently the fork does not behave differently, but the functions for prepared statements and unprepared statements are split up.

Here are my questions:

  1. Is this generally the right way to do this? Did I misunderstand something?

  2. What do you think about the module renaming/refactoring?

  3. I would think that in the unprepared case we shouldn't use values of type Statement internally. If I understand correctly this type is meant for prepared statements and its very confusing to use it for something else. Any comments on that?

  4. newSth calls addChild (from Database.HDBC.DriverUtils).

    a) Do we need to keep using Statements, because this is the way query handles are released?
    b) Isn't this a (small) space leak? The ChildList gains a new element every time 'newSth' is executed, doesn't it?

  5. And finally, what does 'newSth' mean? (I can guess most of it, but where does the 'h' come from?)

I hope what I wrote is understandable and I would really appreciate any feedback on this.

Cheers and thanks,
Sönke

@zenzike
Copy link
Member

zenzike commented Dec 1, 2012

Hi Sönke, I'm really glad you're keen to help with this and would welcome your contributions.

I actually suggest that you don't create a new Execution file for this purpose, since I don't think much is gained in terms of untangling executions and prepared statements, and currently the various modules shadow those provided in the main HDBC library: this is a design that's roughly shared across the various other HDBC backends. I think it would be just as good to provide a new function that really does create prepared statements.

@soenkehahn
Copy link
Contributor

I am working on this right now, thanks.

@soenkehahn
Copy link
Contributor

I have done some patches for this in my fork: https://github.com/soenkehahn/hdbc-postgresql
It is basically working as far as I can tell. There are some caveats and problems though:

  • postgresql does not support all types of queries to be prepared. For example "SET client_encoding TO utf8;" cannot be prepared. This may break existing code. Especially since quickQuery (and others?) use prepared statements internally.
  • Sometimes, preparing a statement is less efficient than executing it unprepared (see http://www.postgresql.org/docs/9.1/static/sql-prepare.html , section "Notes"). Right now, users of hdbc-postgresql cannot perform an unprepared query while obtaining the resulting rows, if I understand correctly. This is a limitation of the HDBC-API, though.
  • Deallocation of prepared statements is not done. There is a new (failing) testcase that demonstrates this. I am not sure how to implement that.
    • If we implement that in public_ffinish, we have to do some refactoring as that function is also called in places where we definitely don't want the statement to be deallocated (e.g. fexecute).
  • Some internal error handling could be better. There are some (one?) FIXMEs in the code for this.
  • I have not done any benchmarking to verify that this can yield better performance. I hope to do so in the coming days.

I would be happy for any comments on these (or other) issues.

@zenzike
Copy link
Member

zenzike commented Dec 5, 2012

Hm, lots of interesting issues here. Point by point:

  • I hadn't realised that not all queries could be prepared. Perhaps this is something we should think about changing in the HDBC API, and allow users to specify whether a query should be prepared or not: perhaps having these distinguished at the type level would be useful?
  • If we allow the programmer to choose which version is being run, then at least this is back in their hands.
  • Obviously the deallocation problem is an issue that needs to be dealt with. I seem to remember that this was done nicely in the HDBC-ODBC interface.
  • Yes, the internal error handling needs a bit of a face-lift.
  • I wrote the beginnings of a benchmarking suite here: https://github.com/zenzike/hdbc-performance perhaps you could build on this?

@zenzike
Copy link
Member

zenzike commented Dec 5, 2012

Having thought about this some more, I think that it's fine for run to always perform unprepared queries, and for quickQuery to always execute prepared queries. You're right that there might be some breakage if people have been using quickQuery for statements that can't be prepared, but I suspect that run or runRaw would (and should!) have been used instead, given that quickQuery has always explicitly been (in theory) a prepared statement. You're right that the interface offers no way of getting results back from a query that isn't prepared, but I don't think this is generally a limitation.

@soenkehahn
Copy link
Contributor

I agree. I'm assuming we'll leave it like this then.

I will try to tackle deallocation and internal error handling.

@soenkehahn
Copy link
Contributor

Hi!

I've been looking into deallocation of prepared statements today. I thought this should be done by the operation in the 'finish'-field of the statement, but that was apperently a misunderstanding: 'finish' is just for reinitializing a statement after a single execution of the statement.

As far as I can see, there is no way to deallocate a prepared statement using the HDBC-API. Is this correct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants