GeoTools
  1. GeoTools
  2. GEOT-2050

IndexexShapefileDataStore produces inaccurate results with fid filters

    Details

    • Type: Bug Bug
    • Status: Closed Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 2.5-RC0
    • Fix Version/s: 2.5.0
    • Component/s: shapefile plugin
    • Labels:
      None

      Description

      IndexedShapefileDataStore does not handle fid filters correctly.

      Added the below test case which exposes the defect and produces the following failure:
      AssertionFailedError: lacking fids: [statepop.20, statepop.30, statepop.31, statepop.32, statepop.40, statepop.41, statepop.42, statepop.43]. Unexpected ones: [] expected:<49> but was:<41>

          public void testFidFilter() throws Exception {
              URL url;
              {
                  File shpFile = copyShapefiles(STATE_POP);
                  url = shpFile.toURL();
                  String filename = url.getFile();
                  filename = filename.substring(0, filename.lastIndexOf("."));
      
                  // force creating the index
                  File file = new File(filename + ".qix");
      
                  if (file.exists()) {
                      file.delete();
                  }
                  file.deleteOnExit();
              }
              IndexedShapefileDataStore ds = new IndexedShapefileDataStore(url, null, true, true,
                      IndexType.QIX);
              FeatureSource<SimpleFeatureType, SimpleFeature> featureSource = ds.getFeatureSource();
              FeatureCollection<SimpleFeatureType, SimpleFeature> features = featureSource.getFeatures();
              FeatureIterator<SimpleFeature> indexIter = features.features();
      
              Set<String> expectedFids = new HashSet<String>();
              final Filter fidFilter;
              try {
                  FilterFactory2 ff = CommonFactoryFinder.getFilterFactory2(null);
                  Set<FeatureId> fids = new HashSet<FeatureId>();
                  while (indexIter.hasNext()) {
                      SimpleFeature newFeature = indexIter.next();
                      String id = newFeature.getID();
                      expectedFids.add(id);
                      fids.add(ff.featureId(id));
                  }
                  fidFilter = ff.id(fids);
              } finally {
                  indexIter.close();
              }
      
              Set<String> actualFids = new HashSet<String>();
              {
                  features = featureSource.getFeatures(fidFilter);
                  indexIter = features.features();
                  while (indexIter.hasNext()) {
                      SimpleFeature next = indexIter.next();
                      String id = next.getID();
                      actualFids.add(id);
                  }
              }
      
              TreeSet<String> lackingFids = new TreeSet<String>(expectedFids);
              lackingFids.removeAll(actualFids);
              
              TreeSet<String> unexpectedFids = new TreeSet<String>(actualFids);
              unexpectedFids.removeAll(expectedFids);
      
              String lacking = String.valueOf(lackingFids);
              String unexpected = String.valueOf(unexpectedFids);
              String failureMsg = "lacking fids: " + lacking + ". Unexpected ones: " + unexpected;
              assertEquals(failureMsg, expectedFids.size(), actualFids.size());
              assertEquals(failureMsg, expectedFids, actualFids);
          }
      

        Issue Links

          Activity

          Gabriel Roldan made changes -
          Field Original Value New Value
          Link This issue is depended upon by GEOS-2214 [ GEOS-2214 ]
          Gabriel Roldan made changes -
          Status Open [ 1 ] In Progress [ 3 ]
          Hide
          Gabriel Roldan added a comment - - edited
          The issue is with IndexedFidReader.
          The bellow test exposes it. There seems to be an issue with filling the bytebuffer, yet I spent quite a bit of time debugging it and got a bit lost. Yet, changing the buffer size to the record size solves the problem, but I still feel tha's more a workaround than a fix.
          This is the only change in IndexedFidReader:
          {code}
          - buffer = ByteBuffer.allocateDirect(12 * 1024);
          + buffer = ByteBuffer.allocateDirect(IndexedFidWriter.RECORD_SIZE);//allocateDirect(12 * 1024);
          {code}


          Test case for {{IndexedFidReaderTest}} :

          {code}

              public void testFindAllFidsReverseOrder() throws Exception {
                  int expectedCount = 0;
                  Set<String> expectedFids = new TreeSet<String>(Collections.reverseOrder());
                  {
                      IndexedShapefileDataStore ds = new IndexedShapefileDataStore(backshp.toURL(), null,
                              true, true, IndexType.NONE);
                      FeatureSource<SimpleFeatureType, SimpleFeature> featureSource = ds.getFeatureSource();
                      FeatureIterator<SimpleFeature> features = featureSource.getFeatures().features();
                      while (features.hasNext()) {
                          SimpleFeature next = features.next();
                          expectedCount++;
                          expectedFids.add(next.getID());
                      }
                  }

                  assertTrue(expectedCount > 0);
                  assertEquals(expectedCount, reader.getCount());
                  
                  assertFalse("findFid for archsites.5 returned -1",-1 == reader.findFid("archsites.5"));
                  assertFalse("findFid for archsites.25 returned -1",-1 == reader.findFid("archsites.25"));

                  for(String fid : expectedFids){
                      long offset = reader.findFid(fid);
                      assertNotNull(offset);
                      System.out.print(fid + "=" + offset + ", ");
                      assertFalse("findFid for " + fid + " returned -1", -1 == offset);
                  }
              }

          {code}
          Show
          Gabriel Roldan added a comment - - edited The issue is with IndexedFidReader. The bellow test exposes it. There seems to be an issue with filling the bytebuffer, yet I spent quite a bit of time debugging it and got a bit lost. Yet, changing the buffer size to the record size solves the problem, but I still feel tha's more a workaround than a fix. This is the only change in IndexedFidReader: {code} - buffer = ByteBuffer.allocateDirect(12 * 1024); + buffer = ByteBuffer.allocateDirect(IndexedFidWriter.RECORD_SIZE);//allocateDirect(12 * 1024); {code} Test case for {{IndexedFidReaderTest}} : {code}     public void testFindAllFidsReverseOrder() throws Exception {         int expectedCount = 0;         Set<String> expectedFids = new TreeSet<String>(Collections.reverseOrder());         {             IndexedShapefileDataStore ds = new IndexedShapefileDataStore(backshp.toURL(), null,                     true, true, IndexType.NONE);             FeatureSource<SimpleFeatureType, SimpleFeature> featureSource = ds.getFeatureSource();             FeatureIterator<SimpleFeature> features = featureSource.getFeatures().features();             while (features.hasNext()) {                 SimpleFeature next = features.next();                 expectedCount++;                 expectedFids.add(next.getID());             }         }         assertTrue(expectedCount > 0);         assertEquals(expectedCount, reader.getCount());                  assertFalse("findFid for archsites.5 returned -1",-1 == reader.findFid("archsites.5"));         assertFalse("findFid for archsites.25 returned -1",-1 == reader.findFid("archsites.25"));         for(String fid : expectedFids){             long offset = reader.findFid(fid);             assertNotNull(offset);             System.out.print(fid + "=" + offset + ", ");             assertFalse("findFid for " + fid + " returned -1", -1 == offset);         }     } {code}
          Hide
          Gabriel Roldan added a comment -
          Jesse I'm assigning this issue to you to get your attention. If you could please check this out and see if there's a better fix than the one I found that'd be great. Otherwise feel free to just close it.
          Show
          Gabriel Roldan added a comment - Jesse I'm assigning this issue to you to get your attention. If you could please check this out and see if there's a better fix than the one I found that'd be great. Otherwise feel free to just close it.
          Gabriel Roldan made changes -
          Assignee Gabriel Roldán [ groldan ] Jesse Eichar [ jeichar@refractions.net ]
          Hide
          Gabriel Roldan added a comment -
          changing status to resolved while waiting for Jesse's feedback
          Show
          Gabriel Roldan added a comment - changing status to resolved while waiting for Jesse's feedback
          Gabriel Roldan made changes -
          Resolution Fixed [ 1 ]
          Status In Progress [ 3 ] Resolved [ 5 ]
          Hide
          Gabriel Roldan added a comment -
          Closing the issue. Andrea benchmarked the change and it provides a 10 to 20 % speedup in shapefile reading, so that's enough a feedback to consider it fixed
          Show
          Gabriel Roldan added a comment - Closing the issue. Andrea benchmarked the change and it provides a 10 to 20 % speedup in shapefile reading, so that's enough a feedback to consider it fixed
          Gabriel Roldan made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Assignee Jesse Eichar [ jeichar@refractions.net ] Gabriel Roldán [ groldan ]
          Gabriel Roldan made changes -
          Resolution Fixed [ 1 ]
          Status Closed [ 6 ] Reopened [ 4 ]
          Gabriel Roldan made changes -
          Affects Version/s 2.5.0 [ 13698 ]
          Affects Version/s 2.5-RC0 [ 14490 ]
          Fix Version/s 2.5.0 [ 13698 ]
          Fix Version/s 2.5.1 [ 14600 ]
          Gabriel Roldan made changes -
          Resolution Fixed [ 1 ]
          Status Reopened [ 4 ] Closed [ 6 ]
          Andrea Aime made changes -
          Link This issue relates to GEOT-2401 [ GEOT-2401 ]

            People

            • Assignee:
              Gabriel Roldan
              Reporter:
              Gabriel Roldan
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: