GeoTools
  1. GeoTools
  2. GEOT-219

Improve JDBCTextFeatureWriter to use Prepared Statements

    Details

    • Type: Improvement Improvement
    • Status: Closed Closed
    • Priority: Minor Minor
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: data
    • Labels:
      None

      Description

      Jody says we'll see speed increases if we use prepared statements, and I imagine he's write. This will require a bit of rearchitecture I imagine, to prepare the statement before's it's used. Oracle is now using the JDBCTextFeatureWriter, as I couldn't get it working with the JDBCFeatureWriter, and couldn't figure out why it wouldn't, just got cryptic errors about not being able to write, something to do with the fid column. This means that this fix will help out both postgis and oracle (but it also means you need to test on both). Jody says he has some code in his oracle/sdo work that he did to use structs with prepared statements. Perhaps he will comment on the exact nature of this.

        Issue Links

          Activity

          Hide
          Andrea Aime added a comment -
          For the record, Postgis is using prepared statements for writes when using the wkb4j libraries. As for a possible speed increase, the following applies:
          * postgres driver does not support prepared statements, it just fakes them, so there is no speed increase in this case (I needed to use them in order to support wkb4j);
          * a database that really supports them will optimize the query just once, and then only execute it at each call. Since usually one more round trip to the database is made, there is a speed increase only if you make enough inserts/updates/deletes, usually 20 to 50 depending on the database and the nature of the query.

          To really get a sizeable speed increase we should use batch of statements, that really cut the round trips to the database (using addbatch/executebatch). This way the communication with the database occurs only on executebatch. I propose the following:
          * you port the prepared statement code from postgres to the other databases;
          * I will follow, back from the holidays, and try to add the batch of statements support with some classes of my own that make statement batching as transparent as possible.

          What do you think?
          Show
          Andrea Aime added a comment - For the record, Postgis is using prepared statements for writes when using the wkb4j libraries. As for a possible speed increase, the following applies: * postgres driver does not support prepared statements, it just fakes them, so there is no speed increase in this case (I needed to use them in order to support wkb4j); * a database that really supports them will optimize the query just once, and then only execute it at each call. Since usually one more round trip to the database is made, there is a speed increase only if you make enough inserts/updates/deletes, usually 20 to 50 depending on the database and the nature of the query. To really get a sizeable speed increase we should use batch of statements, that really cut the round trips to the database (using addbatch/executebatch). This way the communication with the database occurs only on executebatch. I propose the following: * you port the prepared statement code from postgres to the other databases; * I will follow, back from the holidays, and try to add the batch of statements support with some classes of my own that make statement batching as transparent as possible. What do you think?
          Hide
          Chris Holmes added a comment -
          That's too bad postgres doesn't actually support it. Where is the code for the prepared statements in postgis? I couldn't find it (though I wasn't quite sure where to look). I'll check it out.

          This actually isn't an incredibly high priority for me, I just was changing oracle to use the JDBCTextReader, since it actually worked, unlike the JDBCFeatureWriter. Jody wanted me to put a note that it could get faster with prepared statements. Someday I'll work on speed increases for writing, but it may be a while. I think things are good enough now - I'll ask for your help when I start to think about speed - and we can try out the batch queries.
          Show
          Chris Holmes added a comment - That's too bad postgres doesn't actually support it. Where is the code for the prepared statements in postgis? I couldn't find it (though I wasn't quite sure where to look). I'll check it out. This actually isn't an incredibly high priority for me, I just was changing oracle to use the JDBCTextReader, since it actually worked, unlike the JDBCFeatureWriter. Jody wanted me to put a note that it could get faster with prepared statements. Someday I'll work on speed increases for writing, but it may be a while. I think things are good enough now - I'll ask for your help when I start to think about speed - and we can try out the batch queries.
          Hide
          Chris Holmes added a comment -
          This isn't getting fixed soon...
          Show
          Chris Holmes added a comment - This isn't getting fixed soon...
          Hide
          Jody Garnett added a comment -
          This is not a blocker.
          Show
          Jody Garnett added a comment - This is not a blocker.
          Hide
          Jody Garnett added a comment -
          Confirmed that the approach works in Oracle, will revisit and see if the approach can be generalized.
          Show
          Jody Garnett added a comment - Confirmed that the approach works in Oracle, will revisit and see if the approach can be generalized.
          Hide
          Gabriel Roldan added a comment -
          PostGISDataStore is already using prepared statements for inserts and updates.

          The unresolved problem is still dealing with potentially malicious literals in filters. This is because a normal read only query may contain a filter literal with malicious code, I still need to get a good test case for it, but to completely solve the problem the approach would be to turn the Filter2SQL visitor into one that produces a statement usable in a JDBC prepared statement and provides the list of value objects to fill in the prepared statement value holders.

          like in {{SELECT a,b,c from FT where b like ? AND c = ?}} instead of {{SELECT a,b,c from FT where b like 'fake'; drop table FT;...}
          Show
          Gabriel Roldan added a comment - PostGISDataStore is already using prepared statements for inserts and updates. The unresolved problem is still dealing with potentially malicious literals in filters. This is because a normal read only query may contain a filter literal with malicious code, I still need to get a good test case for it, but to completely solve the problem the approach would be to turn the Filter2SQL visitor into one that produces a statement usable in a JDBC prepared statement and provides the list of value objects to fill in the prepared statement value holders. like in {{SELECT a,b,c from FT where b like ? AND c = ?}} instead of {{SELECT a,b,c from FT where b like 'fake'; drop table FT;...}
          Hide
          Gabriel Roldan added a comment -
          closing as won't fix as the jdbc-ng datastores can already use prepared statements for good
          Show
          Gabriel Roldan added a comment - closing as won't fix as the jdbc-ng datastores can already use prepared statements for good

            People

            • Assignee:
              Gabriel Roldan
              Reporter:
              Chris Holmes
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: