GeoTools

Provides EPSG data in a java embedded database

Details

  • Type: Wish Wish
  • Status: Closed Closed
  • Priority: Minor Minor
  • Resolution: Fixed
  • Affects Version/s: 2.1.M0
  • Fix Version/s: 2.2.M0
  • Component/s: referencing
  • Labels:
    None

Description

Current CoordinateSystemEPSGFactory implementation connects to a MS-Access database, because it is the only database format provided by EPSG (http://www.epsg.org). Unfortunatly, this dependance to MS-Access force CoordinateSystemEPSGFactory to uses non-standard SQL syntax, since Micro$oft products have their own "standards". We should:

1) Determine what should be the tables structure (for example
table and column names) for MySQL and PostgreeSQL ports.

2) Provide an implementation of the 'adaptSQL(String statement)'
method for standard SQL syntax, like the one used by MySQL and
PostgreeSQL.

3) Provide ESPG data in a java embedded database, in order to make
things more portable. The EPSG data is small and static enough
for that to make sense. Hypersonic and Axion may be considered.

Issue Links

Activity

Hide
Martin Desruisseaux added a comment -
EPSG (http://www.epsg.org/) has released a cross-platform version of their table. Tables are specified in the form of SQL scripts.
Show
Martin Desruisseaux added a comment - EPSG (http://www.epsg.org/) has released a cross-platform version of their table. Tables are specified in the form of SQL scripts.
Hide
Pierrick Brihaye added a comment -
One other possible way is to use an XML file as Thuban does :
http://www.intevation.de/cgi-bin/viewcvs-thuban.cgi/thuban/Resources/Projections/

As 1.4+ JDKs have native XML support, it should be quite easy to load this file, either fully, either partially (on request ?) and to contruct Java objects.

The problem is that the files format is still very dependent from the text files provided with PROJ library and that they have no normative status.
Show
Pierrick Brihaye added a comment - One other possible way is to use an XML file as Thuban does : http://www.intevation.de/cgi-bin/viewcvs-thuban.cgi/thuban/Resources/Projections/ As 1.4+ JDKs have native XML support, it should be quite easy to load this file, either fully, either partially (on request ?) and to contruct Java objects. The problem is that the files format is still very dependent from the text files provided with PROJ library and that they have no normative status.
Hide
Martin Desruisseaux added a comment -
Rueben Schulz fixed CoordinateSystemEPSGFactory for use with latest EPSG database (which doesn't contains a "Codes" table anymore), and also provided an implementation working with databases generated by the EPSG scripts:

    Visit http://www.epsg.org/
    Click on "Download EPSGv6.5 SQL scripts"
    Run the scripts from PostgreSQL (or any other SQL compliant database - not Access)
    Use the CoordinateSystemModifiedEPSGFactory implementation.


This new implementation has not yet been fully tested. It should work at least for PostgreSQL. I have not tested it with Axion, put it should work as well in my understanding.
Show
Martin Desruisseaux added a comment - Rueben Schulz fixed CoordinateSystemEPSGFactory for use with latest EPSG database (which doesn't contains a "Codes" table anymore), and also provided an implementation working with databases generated by the EPSG scripts:     Visit http://www.epsg.org/     Click on "Download EPSGv6.5 SQL scripts"     Run the scripts from PostgreSQL (or any other SQL compliant database - not Access)     Use the CoordinateSystemModifiedEPSGFactory implementation. This new implementation has not yet been fully tested. It should work at least for PostgreSQL. I have not tested it with Axion, put it should work as well in my understanding.
Hide
Anonymous added a comment -
The SQL scripts for Oracle, postgreSQL and MySQL delivered by EPSG on their site are not ready to use (mainly due to the constraints they have added in the DDL script).
Once fixed, the scripts work fine and it is possible to query that database.
Having this as embedded database seems quite heavy, but in the long run useful : more systems, more transformations ...
Why not "deriving" the database into WKT that would be part on the default embedded database ?
I believe that the GT2 API allows saving into WKT that would be the embedded database. This has to be done once for each stable version of the EPSG (as long as there is no incompatibility between the versions).
Show
Anonymous added a comment - The SQL scripts for Oracle, postgreSQL and MySQL delivered by EPSG on their site are not ready to use (mainly due to the constraints they have added in the DDL script). Once fixed, the scripts work fine and it is possible to query that database. Having this as embedded database seems quite heavy, but in the long run useful : more systems, more transformations ... Why not "deriving" the database into WKT that would be part on the default embedded database ? I believe that the GT2 API allows saving into WKT that would be the embedded database. This has to be done once for each stable version of the EPSG (as long as there is no incompatibility between the versions).
Hide
Pierrick Brihaye added a comment -
Show
Pierrick Brihaye added a comment - Hi, See also : http://jira.codehaus.org/secure/ViewIssue.jspa?key=GEOS-147 p.b.
Hide
Martin Desruisseaux added a comment -
It would be possible to use WKT instead of an SQL database (it is right that Geotools has support for WKT), but WKT contains much less informations than the full EPSG database can contains. For example WKT doesn't said anything about the area of validity (most CRS in usage are valid only for some country). WKT doesn't stores remarks or formulas neither. Furthermore, it is not obvious that WKT would save space since this format implies a lot of redundancies (is SQL tables, redundancies are avoided by the use of subtables with relations). WKT is good specifying the CRS used for a particular set of data. I don't think it is well suited for a database.

Andrea Aimes ported the EPSG database into HSQL (Hypersonic database, http://hsqldb.sourceforge.net). The compressed JAR file size is about 800ko if my memory is right. We basically just have to find a place where to put the file. CVS is not well suited for binary files. Subversion would be better, once the move to CodeHaus will be completed.

I don't think that an embeded HSQL database will requires changes to the CoordinateSystemEPSGFactory code. We should just need to set the driver and the database URL once. If a changes is needed, it would probably involve simply the addition of a few lines in 'getDefault()' method in order to discover the JAR file, if present.
Show
Martin Desruisseaux added a comment - It would be possible to use WKT instead of an SQL database (it is right that Geotools has support for WKT), but WKT contains much less informations than the full EPSG database can contains. For example WKT doesn't said anything about the area of validity (most CRS in usage are valid only for some country). WKT doesn't stores remarks or formulas neither. Furthermore, it is not obvious that WKT would save space since this format implies a lot of redundancies (is SQL tables, redundancies are avoided by the use of subtables with relations). WKT is good specifying the CRS used for a particular set of data. I don't think it is well suited for a database. Andrea Aimes ported the EPSG database into HSQL (Hypersonic database, http://hsqldb.sourceforge.net). The compressed JAR file size is about 800ko if my memory is right. We basically just have to find a place where to put the file. CVS is not well suited for binary files. Subversion would be better, once the move to CodeHaus will be completed. I don't think that an embeded HSQL database will requires changes to the CoordinateSystemEPSGFactory code. We should just need to set the driver and the database URL once. If a changes is needed, it would probably involve simply the addition of a few lines in 'getDefault()' method in order to discover the JAR file, if present.
Hide
Pierrick Brihaye added a comment -
Hi,

>It would be possible to use WKT instead of an SQL database (it is >right that Geotools has support for WKT), but WKT contains much less >informations than the full EPSG database can contains.

Right. So... WKT is not *the* solution :-)

>Andrea Aimes ported the EPSG database into HSQL (Hypersonic >database, http://hsqldb.sourceforge.net).

Did he ? That's great news ! I've had some problems with HSQL for 2 reasons :

1) EPSG's SQL is broken (MS Access dumps ?)
2) HSQL doesn't support some EPSG's SQL syntax. See :
http://sourceforge.net/tracker/index.php?func=detail&aid=890537&group_id=23316&atid=378134

So... I assume that Andrea did preprocess EPSG's SQL ?

>The compressed JAR file size is about 800ko if my memory is right. >We basically just have to find a place where to put the file.

About his point : wouldn't it make sense to separate the logical SRS stuff from the concrete SRS stuff, i.e. create an EPSG or whatever module ?

> CVS is not well suited for binary files.

Thank you for providing support to the XML solution :-)

Anyway, I think that a provisionnal full Java solution à la HSQL is a good point !

Cheers,

p.b.
Show
Pierrick Brihaye added a comment - Hi, >It would be possible to use WKT instead of an SQL database (it is >right that Geotools has support for WKT), but WKT contains much less >informations than the full EPSG database can contains. Right. So... WKT is not *the* solution :-) >Andrea Aimes ported the EPSG database into HSQL (Hypersonic >database, http://hsqldb.sourceforge.net). Did he ? That's great news ! I've had some problems with HSQL for 2 reasons : 1) EPSG's SQL is broken (MS Access dumps ?) 2) HSQL doesn't support some EPSG's SQL syntax. See : http://sourceforge.net/tracker/index.php?func=detail&aid=890537&group_id=23316&atid=378134 So... I assume that Andrea did preprocess EPSG's SQL ? >The compressed JAR file size is about 800ko if my memory is right. >We basically just have to find a place where to put the file. About his point : wouldn't it make sense to separate the logical SRS stuff from the concrete SRS stuff, i.e. create an EPSG or whatever module ? > CVS is not well suited for binary files. Thank you for providing support to the XML solution :-) Anyway, I think that a provisionnal full Java solution à la HSQL is a good point ! Cheers, p.b.
Hide
Martin Desruisseaux added a comment -
SQL syntax in CoordinateSystemEPSGFactory target explicitly Access syntax, since Access is the format of the database provided by EPSG.org. However, this factory already provides a hook for modifying the SQL syntax on the fly, and there is already a subclass for standard SQL: ModifiedCoordinateSystemEPSGFactory, or something like this. This factory has already been tested successfully with PostgreSQL and Oracle. My assumption is that it should work with HSQL as well.
Show
Martin Desruisseaux added a comment - SQL syntax in CoordinateSystemEPSGFactory target explicitly Access syntax, since Access is the format of the database provided by EPSG.org. However, this factory already provides a hook for modifying the SQL syntax on the fly, and there is already a subclass for standard SQL: ModifiedCoordinateSystemEPSGFactory, or something like this. This factory has already been tested successfully with PostgreSQL and Oracle. My assumption is that it should work with HSQL as well.
Hide
Pierrick Brihaye added a comment -
Hi,

>SQL syntax in CoordinateSystemEPSGFactory target explicitly Access >syntax, since Access is the format of the database provided by EPSG.org.

Correct.

>However, this factory already provides a hook for modifying the SQL >syntax on the fly, and there is already a subclass for standard SQL: >ModifiedCoordinateSystemEPSGFactory, or something like this.

I wonder whether it wouldn't be a better solution to provide *one* Geotools (?) hosted DB and to dump this DB in a "more" standard SQL that could be used by *many* Geotools users, either as files that would feed an EPSG factory, either as native Java DB files that would be packed in a jar file together will all the necessary stuff (i.e. the DB engine and its driver).

> This factory has already been tested successfully with PostgreSQL and
> Oracle. My assumption is that it should work with HSQL as well.

So... it's just an assumption ? Well... I looks like I am at the same point :-) EPSG guys could be more friendly towards us, could'nt they ?

p.b.
Show
Pierrick Brihaye added a comment - Hi, >SQL syntax in CoordinateSystemEPSGFactory target explicitly Access >syntax, since Access is the format of the database provided by EPSG.org. Correct. >However, this factory already provides a hook for modifying the SQL >syntax on the fly, and there is already a subclass for standard SQL: >ModifiedCoordinateSystemEPSGFactory, or something like this. I wonder whether it wouldn't be a better solution to provide *one* Geotools (?) hosted DB and to dump this DB in a "more" standard SQL that could be used by *many* Geotools users, either as files that would feed an EPSG factory, either as native Java DB files that would be packed in a jar file together will all the necessary stuff (i.e. the DB engine and its driver). > This factory has already been tested successfully with PostgreSQL and > Oracle. My assumption is that it should work with HSQL as well. So... it's just an assumption ? Well... I looks like I am at the same point :-) EPSG guys could be more friendly towards us, could'nt they ? p.b.
Hide
Martin Desruisseaux added a comment -
 I just got the HSQL database from Andrea with the source code he used to test it. I will try it.
Show
Martin Desruisseaux added a comment -  I just got the HSQL database from Andrea with the source code he used to test it. I will try it.
Hide
Jody Garnett added a comment -
CRSService makes use of the ext/epsg module to define this information via a property file (200k).

The embded database would be better.
Show
Jody Garnett added a comment - CRSService makes use of the ext/epsg module to define this information via a property file (200k). The embded database would be better.
Hide
Martin Desruisseaux added a comment -
Since I can hardly do this work in the next few months, unsassigned it from myself in order to provide room for a volunter to pick up this task.
Show
Martin Desruisseaux added a comment - Since I can hardly do this work in the next few months, unsassigned it from myself in order to provide room for a volunter to pick up this task.
Hide
Martin Desruisseaux added a comment -
Didier Richard created a new HSQL database for EPSG 6.6. We tried both the script file (with only SQL statements and the data fully loaded in memory) and the cached tables (created with "CREATE CACHED TABLE" instructions). Strange enough, the binary files created for cached tables are bigger than the SQL scripts in ASCII, even after execution of "SHUTDOWN COMPACT".

Unfortunatly, we are facing a blocking issue with HSQL: there is no way (as far as I know) to bundle the binary cached table in a JAR. We can bundle the SQL scripts, but this is inefficient since it force HSQL to load the full database in memory. A solution may be to expand the database in some subdirectory in user's home directory the first time the HSQL database is used.

An other possibility, maybe preferable, is to explore other database engine. The Derby database from the Apache Software fundation may be a good candidate:

    http://incubator.apache.org/derby/
Show
Martin Desruisseaux added a comment - Didier Richard created a new HSQL database for EPSG 6.6. We tried both the script file (with only SQL statements and the data fully loaded in memory) and the cached tables (created with "CREATE CACHED TABLE" instructions). Strange enough, the binary files created for cached tables are bigger than the SQL scripts in ASCII, even after execution of "SHUTDOWN COMPACT". Unfortunatly, we are facing a blocking issue with HSQL: there is no way (as far as I know) to bundle the binary cached table in a JAR. We can bundle the SQL scripts, but this is inefficient since it force HSQL to load the full database in memory. A solution may be to expand the database in some subdirectory in user's home directory the first time the HSQL database is used. An other possibility, maybe preferable, is to explore other database engine. The Derby database from the Apache Software fundation may be a good candidate:     http://incubator.apache.org/derby/
Hide
Martin Desruisseaux added a comment -
I explored Derby database and came back to HSQL because Derby revealed more difficult to get to work than HSQL (primary/foreigner keys limited to 18 characters; lot of binary files created where HSQL creates just one or two; JAR dependencies 4 times bigger; problematic shutdown procedure in the context of a library; risk of corruption if two JVM use the same database; etc.).

The HSQL plugin is bundle with the EPSG database as a SQL script. The first time that the plugin is run, it execute fully the SQL script and cache the result in the following directory:

    <the temporary directory>/Geotools/Cached databases

Next application launch will reuse this cached database if it still exists, or recreate it if the temporary directory has been cleaned since last execution.

This plugin was more work than I initially though. The need to shutdown the database at JVM shutdown time, to create the database from a modified SQL script, the changes required in order to get HSQL to accepts the SQL script, and some non-obvious issues like slightly different casting of floating point values resulting in exceptions to be throws in SexagesimalConverter, all those issues took time.

Some SQL adjustements were required in order to get the plugin to work. I hope I didn't introducing any error while editing the SQL scripts. However, users are encourages to checks their CRS created by the HSQL plugin against the original EPSG database, or against the CRS created by the epsg-access plugin.
Show
Martin Desruisseaux added a comment - I explored Derby database and came back to HSQL because Derby revealed more difficult to get to work than HSQL (primary/foreigner keys limited to 18 characters; lot of binary files created where HSQL creates just one or two; JAR dependencies 4 times bigger; problematic shutdown procedure in the context of a library; risk of corruption if two JVM use the same database; etc.). The HSQL plugin is bundle with the EPSG database as a SQL script. The first time that the plugin is run, it execute fully the SQL script and cache the result in the following directory:     <the temporary directory>/Geotools/Cached databases Next application launch will reuse this cached database if it still exists, or recreate it if the temporary directory has been cleaned since last execution. This plugin was more work than I initially though. The need to shutdown the database at JVM shutdown time, to create the database from a modified SQL script, the changes required in order to get HSQL to accepts the SQL script, and some non-obvious issues like slightly different casting of floating point values resulting in exceptions to be throws in SexagesimalConverter, all those issues took time. Some SQL adjustements were required in order to get the plugin to work. I hope I didn't introducing any error while editing the SQL scripts. However, users are encourages to checks their CRS created by the HSQL plugin against the original EPSG database, or against the CRS created by the epsg-access plugin.
Hide
Martin Desruisseaux added a comment -
Seems to work properly (no glitch reported). Closing.
Show
Martin Desruisseaux added a comment - Seems to work properly (no glitch reported). Closing.

People

Vote (1)
Watch (0)

Dates

  • Created:
    Updated:
    Resolved: