GeoTools
  1. GeoTools
  2. GEOT-2235

Postgis NG + large data set + non transactional read -> OOM

    Details

    • Type: Bug Bug
    • Status: Closed Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.5.2
    • Fix Version/s: 2.5.3
    • Component/s: jdbc
    • Labels:
      None

      Description

      I was trying out the postgis NG datastore against the old one using the same old Texas roads database
      when I found out that the NG one was systematically throwing OOM.

      A little investigation shows that PostgreSQL can do partial fetches only if autocommit is false:
      http://www.nabble.com/ResultSet---setFetchSize-fails-to-stop-heap-failures-td17519869.html#a17521489
      http://jdbc.postgresql.org/documentation/83/query.html#query-with-cursor

      Looking into the old jdbc datastores code one can see that auto-commit is not set not even
      for Transaction.AUTO_COMMIT. The upside is that partial fetches do work, the downside
      is that apparently one has to use transactions to ensure changes are properly written?

      Yet, the old postgis tests ensured that operations were working even on auto_commit,
      not sure if it was an unintended side effect or if the transactions are committed on close unless
      explicitly rolled back (I would find this odd thought)...

      I made a few tests by adding setAutoCommit(false) to PostgisDialect.initializeConnection...
      of course test failed. I had to add two things to get tests working again:

      • add an explicit con.commit() in JDBCTestSetup.run()
      • add an explicit con.commit() in PostgisDialect.postCreateTable

      After those two gross hacks all tests are passing again but... ugh...
      On the other side having a datastore that cannot load huge amounts of data
      by streaming them is a major problem...

      Opinions?

        Activity

        Hide
        Justin Deoliveira added a comment -
        Hmmm... ugly indeed. It seems we are screwed either way. In one case (wms / read only case) we take a serious performance penalty. In the other (wfs / writing) we lose the ability to write in auto commit mode.

        What would happen if we disabled write access (throw an exception) when using Transaction.AUTO_COMMIT. I don't think geoserver would be affected since it always uses a proper transaction. If this was the case we could always use Connection.setAutoCommit(false) for postgis-ng. And since we disallow any use of AUTO_COMMIT for writing we don't have to worry about nasty hacks to try and commit the connection.

        The only other alternative i could think of is coming up with a new type of transaction type... say READ_ONLY. It would not be ideal since it would require api changes... but it would give us the chance to achieve good performance for really the only case that it matters.

        Either way sort of nasty. Sorry I don't have a good answer for this one.
        Show
        Justin Deoliveira added a comment - Hmmm... ugly indeed. It seems we are screwed either way. In one case (wms / read only case) we take a serious performance penalty. In the other (wfs / writing) we lose the ability to write in auto commit mode. What would happen if we disabled write access (throw an exception) when using Transaction.AUTO_COMMIT. I don't think geoserver would be affected since it always uses a proper transaction. If this was the case we could always use Connection.setAutoCommit(false) for postgis-ng. And since we disallow any use of AUTO_COMMIT for writing we don't have to worry about nasty hacks to try and commit the connection. The only other alternative i could think of is coming up with a new type of transaction type... say READ_ONLY. It would not be ideal since it would require api changes... but it would give us the chance to achieve good performance for really the only case that it matters. Either way sort of nasty. Sorry I don't have a good answer for this one.
        Hide
        Andrea Aime added a comment -
        Ok, found a solution to this issue. The trick is to set auto commit to false only when reading, that is, in JDBCFeatureStore.getReaderInternal.
        Given that the connection is going to be given to the reader there are only two possibilities:
        - we are in a transaction other than Transaction.AUTO_COMMIT. In that case we don't need to
          do anything
        - we are in a Transaction.AUTO_COMMIT, in that case the reader will own the connection
          and close it when done, so we can safely swith to auto-commit=false

        The patch becomes a simple:
        {code}
        if(getState().getTransaction() == Transaction.AUTO_COMMIT)
                        cx.setAutoCommit(false);
        {code}
        before starting to create the sql. For cleanness, the whole block at line 500 can be rewriteen as:

        {code}
               try {
                    // this allows PostGIS to page the results and respect the fetch size
                    if(getState().getTransaction() == Transaction.AUTO_COMMIT)
                        cx.setAutoCommit(false);
                    
                    SQLDialect dialect = getDataStore().getSQLDialect();
                    if ( dialect instanceof PreparedStatementSQLDialect ) {
                        PreparedStatement ps = getDataStore().selectSQLPS(querySchema, preFilter, query.getSortBy(), cx);
                        reader = new JDBCFeatureReader( ps, cx, this, querySchema, query.getHints() );
                    } else {
                        //build up a statement for the content
                        String sql = getDataStore().selectSQL(querySchema, preFilter, query.getSortBy());
                        getDataStore().getLogger().fine(sql);
            
                        reader = new JDBCFeatureReader( sql, cx, this, querySchema, query.getHints() );
                    }
                } catch (Exception e) {
                    // close the connection
                    getDataStore().closeSafe(cx);
                    // safely rethrow
                    throw (IOException) new IOException().initCause(e);
                }
        {code}

        Tried, works fine, other database tests keep on working. Ok to commit?
        Show
        Andrea Aime added a comment - Ok, found a solution to this issue. The trick is to set auto commit to false only when reading, that is, in JDBCFeatureStore.getReaderInternal. Given that the connection is going to be given to the reader there are only two possibilities: - we are in a transaction other than Transaction.AUTO_COMMIT. In that case we don't need to   do anything - we are in a Transaction.AUTO_COMMIT, in that case the reader will own the connection   and close it when done, so we can safely swith to auto-commit=false The patch becomes a simple: {code} if(getState().getTransaction() == Transaction.AUTO_COMMIT)                 cx.setAutoCommit(false); {code} before starting to create the sql. For cleanness, the whole block at line 500 can be rewriteen as: {code}        try {             // this allows PostGIS to page the results and respect the fetch size             if(getState().getTransaction() == Transaction.AUTO_COMMIT)                 cx.setAutoCommit(false);                          SQLDialect dialect = getDataStore().getSQLDialect();             if ( dialect instanceof PreparedStatementSQLDialect ) {                 PreparedStatement ps = getDataStore().selectSQLPS(querySchema, preFilter, query.getSortBy(), cx);                 reader = new JDBCFeatureReader( ps, cx, this, querySchema, query.getHints() );             } else {                 //build up a statement for the content                 String sql = getDataStore().selectSQL(querySchema, preFilter, query.getSortBy());                 getDataStore().getLogger().fine(sql);                      reader = new JDBCFeatureReader( sql, cx, this, querySchema, query.getHints() );             }         } catch (Exception e) {             // close the connection             getDataStore().closeSafe(cx);             // safely rethrow             throw (IOException) new IOException().initCause(e);         } {code} Tried, works fine, other database tests keep on working. Ok to commit?
        Hide
        Justin Deoliveira added a comment -
        Nice, I love when the morning brings a simple solution :). Patch looks good, commit as you please andrea.
        Show
        Justin Deoliveira added a comment - Nice, I love when the morning brings a simple solution :). Patch looks good, commit as you please andrea.
        Hide
        Andrea Aime added a comment -
        Fixed on 2.5.x and trunk
        Show
        Andrea Aime added a comment - Fixed on 2.5.x and trunk

          People

          • Assignee:
            Andrea Aime
            Reporter:
            Andrea Aime
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: