GeoTools
  1. GeoTools
  2. GEOT-1286

Add Concurrency Support to CRS Authority

    Details

    • Type: Improvement Improvement
    • Status: Closed Closed
    • Priority: Major Major
    • Resolution: Won't Fix
    • Affects Version/s: 2.4-M0, 2.5-M1
    • Fix Version/s: 2.5.9
    • Component/s: referencing
    • Labels:
      None

      Description

      Please see http://docs.codehaus.org/display/GEOTOOLS/Improve+CRSAuthority+Concurrency+Caching+and+Connection+Use for the design under discussion.

      In a high-volume environment, geotools does not perform well. This request is to improve the CRS Authority Factories so we can attack them with many many concurrent users and live to tell the tale. We also need to ensure that desktop and scientific applications do not suffer a significant performance hit as a result of this change. The current implementation has a caching technique, and this must be maintained and shown to work as well.

      For the sake of discussion we will look at a concrete example, FactoryOnOracleSQL/FactoryUsingOracleSQL (located in /unsupported/epsg-oracle and /library/referencing).

      Allowing Multiple Users
      1. Use an ObjectPool of workers (ie FactoryUsingOracleSQL)
      2. Make FactoryUsingOracleSQL Threadsafe
      3. Use Fire and Forget Worker Factory

      Cache Handling:
      4. Minimalize Connection Use Time
      5. Hack Cache to Package Visibility
      6. Worker Factory per Thread
      7. Isolate Cache into a Separate Object and Inject
      8. Seperate out responsibilities so "worker" just provides Definition and does not need the cache

      Connection use:
      9. Use ObjectPool lifecycle methods
      10. Swap out Connection for DataSource
      11. Use ConnectionPolicy Stratagy Object

      The above solutions have been packaged up as four alternatives on this page:
      http://docs.codehaus.org/display/GEOTOOLS/CRS+Authorty+Alternative+Proposals

      Worker Pool
      -Threads: Use ObjectPool to manage multiple workers

      • Cache: Isolate cache into seperate object, and inject into workers so they can perform their own cache check
      • Connection: Use ObjectPool lifecycle methods to close connection, tune ObjectPool within limts of provided DataSource

      Fire and Forget

      • Threads: Create a FactoryUsingOracleSQL as needed in a "Fire and Forget" manner
      • Cache: Workers can keep a back pointer, and call the parent to check the cache
      • Connection : Workers must limit connection use to just the duration of the executeStatement

      Worker DAO Pool

      • Threads : Use ObjectPool to manage multiple workers
      • Cache : Limit cache use to OracleEPSGAuthority, limit the scope of workers to only providing the definition (not object creation)
      • Connection: Use ObjectPool lifecycle methods to close connection, tune ObjectPool within limits of provided DataSource

      Just do it:

      • Threads: Create a Collection of Workers and manage it ourselves, keeping track of the Thread so we recursively use the same worker
      • Cache: Workers can keep a back pointer, and call the parent to check the cache
      • Connection: Choose a connection policy strategy object based on a new Hint

        Issue Links

          Activity

          Hide
          Jody Garnett added a comment -
          The naming of things (everything is a Factory!) has been a barrier to communication - Martin is "letting" us rename the support classes.

          Please consider the names - and comment on them here if you have a better suggestion (but please keep in mind they are not the subject of this proposal - fixing the three problems above must be done).
          Show
          Jody Garnett added a comment - The naming of things (everything is a Factory!) has been a barrier to communication - Martin is "letting" us rename the support classes. Please consider the names - and comment on them here if you have a better suggestion (but please keep in mind they are not the subject of this proposal - fixing the three problems above must be done).
          Show
          Jody Garnett added a comment - Breakout IRC meeting: - http://docs.codehaus.org/display/GEOTOOLS/2007/06/05/Breakout+IRC+-+CRS+Authority+Proposal
          Hide
          Jody Garnett added a comment -
          Martin asked that we make a single implementation of ReferencingObjectCache and use it for both *findPool* and *pool* ...

          This change will work; but we will need to still capture the Level 0 <- Level 1 cache relationship - so ReferencingObjectCache will need a "parent" reference.

          Also to prevent the DirectCRSAuthority implementations from having a bunch of if (cache == null ) code we need a NullReferencingObjectCache.

          Thanks for the feedback martin.
          Show
          Jody Garnett added a comment - Martin asked that we make a single implementation of ReferencingObjectCache and use it for both *findPool* and *pool* ... This change will work; but we will need to still capture the Level 0 <- Level 1 cache relationship - so ReferencingObjectCache will need a "parent" reference. Also to prevent the DirectCRSAuthority implementations from having a bunch of if (cache == null ) code we need a NullReferencingObjectCache. Thanks for the feedback martin.
          Hide
          Jody Garnett added a comment -
          Naming IRC with acuster/jgarnett:

          acuster: right, but if there are three classes that, for geotools, have exactly the same role, merely on different backends,
          acuster: I would expect the alternatives to be together
          acuster: then a user learns what the different clusters do in geotools
          acuster: and picks the relevant backend
          jgarnett: okay here are my thoughts,
          jgarnett: the function of the classes have to be marked with an interface; and the naming rules have that as the last thing .... XXXXXAuthority
          jgarnett: the "who" (ie who is the authority)
          jgarnett: should be the first thing ... EPSG....Authority, or AUTO2.....Authority
          jgarnett: and we can arrange the middle of the name to reflect the fact that there is alternatives
          jgarnett: Threaded/Direct is an implementation detail that does not matter so much to the understanding of purpose
          jgarnett: comments? confusion? who cares ...
          ***acuster trusts you to make a good compromise, he merely wanted to bring up the issue.
          jgarnett: Here is what that would look like - sorted:
          jgarnett: AbstractAuthority
          jgarnett: AUTOAuthority
          jgarnett: AUTO2Authority
          jgarnett: EPSGAnsiThreadedAuthority
          jgarnett: EPSGDirectAuthority
          jgarnett: EPSGHSQLThreadedAuthority
          jgarnett: EPSGHSQLDirectAuthority
          jgarnett: EPSGOracleThreadedAuthority
          jgarnett: EPSGOracleDirectAuthority
          jgarnett: EPSGThreadedAuthority
          jgarnett: Thanks for your trust acuster; just thinking this through on IRC
          Show
          Jody Garnett added a comment - Naming IRC with acuster/jgarnett: acuster: right, but if there are three classes that, for geotools, have exactly the same role, merely on different backends, acuster: I would expect the alternatives to be together acuster: then a user learns what the different clusters do in geotools acuster: and picks the relevant backend jgarnett: okay here are my thoughts, jgarnett: the function of the classes have to be marked with an interface; and the naming rules have that as the last thing .... XXXXXAuthority jgarnett: the "who" (ie who is the authority) jgarnett: should be the first thing ... EPSG....Authority, or AUTO2.....Authority jgarnett: and we can arrange the middle of the name to reflect the fact that there is alternatives jgarnett: Threaded/Direct is an implementation detail that does not matter so much to the understanding of purpose jgarnett: comments? confusion? who cares ... ***acuster trusts you to make a good compromise, he merely wanted to bring up the issue. jgarnett: Here is what that would look like - sorted: jgarnett: AbstractAuthority jgarnett: AUTOAuthority jgarnett: AUTO2Authority jgarnett: EPSGAnsiThreadedAuthority jgarnett: EPSGDirectAuthority jgarnett: EPSGHSQLThreadedAuthority jgarnett: EPSGHSQLDirectAuthority jgarnett: EPSGOracleThreadedAuthority jgarnett: EPSGOracleDirectAuthority jgarnett: EPSGThreadedAuthority jgarnett: Thanks for your trust acuster; just thinking this through on IRC
          Hide
          Martin Desruisseaux added a comment -
          "EPSGAnsiThreadedAuthority" sound like a confusing name to me! We don't know who is the authority (EPSG? Ansi?) We don't know what Ansi is doing in this picture (nothing tell us that it is about Ansi SQL).

          I fully agree with the value of grouping related classes together, but I don't think that we use the alphabetical order for that if it leads to confusing names. Grouping related topic is javadoc tool job. The tool already do that at the package level. It doesn't do that at the classes or methods level yet, but this is planned for a future version. This is JSR-260:

              http://www.jcp.org/en/jsr/detail?id=260
              https://javadoctags.dev.java.net/

          So I would said: don't choose confusing names in order to workaround a temporary javadoc tool limitation.
          Show
          Martin Desruisseaux added a comment - "EPSGAnsiThreadedAuthority" sound like a confusing name to me! We don't know who is the authority (EPSG? Ansi?) We don't know what Ansi is doing in this picture (nothing tell us that it is about Ansi SQL). I fully agree with the value of grouping related classes together, but I don't think that we use the alphabetical order for that if it leads to confusing names. Grouping related topic is javadoc tool job. The tool already do that at the package level. It doesn't do that at the classes or methods level yet, but this is planned for a future version. This is JSR-260:      http://www.jcp.org/en/jsr/detail?id=260      https://javadoctags.dev.java.net/ So I would said: don't choose confusing names in order to workaround a temporary javadoc tool limitation.
          Hide
          Jody Garnett added a comment -
          IDEA: Martin had the idea to use ConcurrentAuthorityFactory (since he pointed out the DirectAuthorityFactory implementations are actually thread safe).

          Martin "threaded" does not indicate "thread safe" - it indicates that internally the ThreadedAuthorityFactory is going to deploy a bunch of threads.
          Show
          Jody Garnett added a comment - IDEA: Martin had the idea to use ConcurrentAuthorityFactory (since he pointed out the DirectAuthorityFactory implementations are actually thread safe). Martin "threaded" does not indicate "thread safe" - it indicates that internally the ThreadedAuthorityFactory is going to deploy a bunch of threads.
          Hide
          Jody Garnett added a comment -
          CORRECTION: Martin has pointed out that BufferedAuthorityFactory was pulling triple duty!
          - it could be used on its own (to buffer a already working a authority factory). The "backingStore" passed into the constructor.
          - it acts as a super class for DefferedAuthorityFactory. The "backingStore" is created as needed, and a Timer used to get rid of it.

          And we have plans to expand it's roll to handle a pool of "backingStores" ...

          So here is the plan for later:
          - an abstract class for "BufferedAuthorityFactory" - ie one that uses a cache, and then talks DirectAuthorityFactory
          - create "BufferedAuthorityDecorator" for one that just wraps an existing authority factory
          - create "ThreadedAuthorityFactory" for one that manages a bunch of workers and a cache.
          Show
          Jody Garnett added a comment - CORRECTION: Martin has pointed out that BufferedAuthorityFactory was pulling triple duty! - it could be used on its own (to buffer a already working a authority factory). The "backingStore" passed into the constructor. - it acts as a super class for DefferedAuthorityFactory. The "backingStore" is created as needed, and a Timer used to get rid of it. And we have plans to expand it's roll to handle a pool of "backingStores" ... So here is the plan for later: - an abstract class for "BufferedAuthorityFactory" - ie one that uses a cache, and then talks DirectAuthorityFactory - create "BufferedAuthorityDecorator" for one that just wraps an existing authority factory - create "ThreadedAuthorityFactory" for one that manages a bunch of workers and a cache.
          Hide
          Jody Garnett added a comment -
          Martin we are stuck on this one; near as I can tell I we left off having trouble with the HsqlDialectEpsgFactory and find Object.
          Show
          Jody Garnett added a comment - Martin we are stuck on this one; near as I can tell I we left off having trouble with the HsqlDialectEpsgFactory and find Object.
          Hide
          Martin Desruisseaux added a comment -
          Yes I'm still commited to look at the code. One additional reason is that the Cache classes (among others) that you wrote in org.geotools.util may very well be useful to me soon.

          The problem is that I'm so late on my schedule... I do not dare anymore to make temptative schedule, but my task stack is approximatively:

            * Finish MosaicImageReader work
            * Switch from JSR-108 to JSR-275
            * SVN cleanup
            * Module renaming (gt2 prefix removal)
            * Concurrent CRS
            * ...
          Show
          Martin Desruisseaux added a comment - Yes I'm still commited to look at the code. One additional reason is that the Cache classes (among others) that you wrote in org.geotools.util may very well be useful to me soon. The problem is that I'm so late on my schedule... I do not dare anymore to make temptative schedule, but my task stack is approximatively:   * Finish MosaicImageReader work   * Switch from JSR-108 to JSR-275   * SVN cleanup   * Module renaming (gt2 prefix removal)   * Concurrent CRS   * ...
          Hide
          Jody Garnett added a comment -
          I will assign this bug to you for now then.
          Show
          Jody Garnett added a comment - I will assign this bug to you for now then.
          Hide
          Martin Desruisseaux added a comment -
          Code review and rewrite done in geotidy.
          Show
          Martin Desruisseaux added a comment - Code review and rewrite done in geotidy.
          Hide
          Jody Garnett added a comment -
          Attach patch isolates database connection/creation so it can be used by all implementations
          Show
          Jody Garnett added a comment - Attach patch isolates database connection/creation so it can be used by all implementations
          Hide
          Jody Garnett added a comment -
          Review geot-1286.patch it allows the mediator/delegate combo to use your new database in a jar. Basically catches us up to the point where we can run the code and see what is working or not.
          Show
          Jody Garnett added a comment - Review geot-1286.patch it allows the mediator/delegate combo to use your new database in a jar. Basically catches us up to the point where we can run the code and see what is working or not.
          Hide
          Jody Garnett added a comment -
          I have with drawn this change proposal and patch as its time has passed due to lack of review while the work was funded.
          Show
          Jody Garnett added a comment - I have with drawn this change proposal and patch as its time has passed due to lack of review while the work was funded.

            People

            • Assignee:
              Jody Garnett
              Reporter:
              Cory Horner
            • Votes:
              1 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: