GeoServer / GEOS-3379

JVM crashes in a multithreaded load test against an ECW coverage

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 2.0-RC1
    • Fix Version/s: 1.7.7, 2.0-RC2
    • Component/s: WMS
    • Labels: None
    • Number of attachments: 4

      Description

      GeoServer trunk crashes badly and repeatably in a multithreaded test that uses 10 threads to access the usual Bluemarble ECW files.
      With JDK 1.6.0_14 and imageio-ext 1.0.3 not even a .hprof file is dumped; the JVM just crashes, stating:

      terminate called after throwing an instance of 'std::bad_alloc'
      what(): std::bad_alloc

      I've tried running the same with GeoServer 1.7.x and it lasts a little longer, but eventually crashes as well.
      I'm about to attach the GeoServer logs, the JMeter script, and the .hprof generated by 1.7.x crashing.

      I've also run the same benchmark against 1.7.5 to make sure the crash was not due to the rendering rotation patch. It failed the same way, with a hard JVM crash.
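
      For readers without JMeter at hand, the load pattern described above (10 threads issuing requests at the same time) can be approximated with a small Java harness. This is only a minimal sketch and does not reproduce the attached bluemarble_ecw.jmx plan: the class name ConcurrentLoadSketch, its methods, the thread and request counts, and the URL passed on the command line are all assumptions made for illustration.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness: not part of GeoServer or of the attached test plan.
public class ConcurrentLoadSketch {

    /** Calls `request` perThread times from each of `threads` workers; returns the number of successful calls. */
    public static int runLoad(int threads, int perThread, Callable<Boolean> request) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> results = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            results.add(pool.submit(() -> {
                int ok = 0;
                for (int i = 0; i < perThread; i++) {
                    try {
                        if (request.call()) {
                            ok++;
                        }
                    } catch (Exception e) {
                        // A failed request is simply not counted as a success.
                    }
                }
                return ok;
            }));
        }
        pool.shutdown(); // no new tasks; already submitted workers run to completion
        int total = 0;
        for (Future<Integer> f : results) {
            try {
                total += f.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            } catch (ExecutionException e) {
                // Workers catch their own exceptions, so this is not expected.
            }
        }
        return total;
    }

    /** Fetches the URL, drains the response body, and reports whether the status was 200. */
    static boolean fetch(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream()) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // discard the image bytes; only the server-side work matters here
            }
            return conn.getResponseCode() == 200;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("usage: ConcurrentLoadSketch <request-url>");
            return;
        }
        String url = args[0]; // assumed: a WMS GetMap URL for the ECW layer
        int ok = runLoad(10, 20, () -> fetch(url));
        System.out.println(ok + " of 200 requests succeeded");
    }
}
```

      If the server JVM dies mid-run, the remaining requests fail with connection errors and the success count drops, which is roughly how the crash shows up from the client side.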

      1. bluemarble_ecw.jmx
        28 kB
        Andrea Aime
      2. bluemarble.csv
        5 kB
        Andrea Aime
      3. geoserver.log
        40 kB
        Andrea Aime
      4. hs_err_pid32021.log
        54 kB
        Andrea Aime

        Activity

        Andrea Aime added a comment -

        The file that led to this crash is the usual world-topo-bathy-200408-3x86400x43200.ecw, renamed to bluemarble_ecw.ecw to avoid problems with the URLs requesting it.
        Simone, do you have that file? It's around 300MB, if you need it I can upload it somewhere

        Andrea Aime added a comment -

        Simone, it seems the above happens only if the OS is in a "low" memory situation, that is, if the test is started with less than 1GB of free memory, with GS already up and whatnot. It seems the extra GB is needed to allow native memory allocations? The JVM usually crashes at load factor 10. It seems very odd that it's not possible to handle 10 concurrent requests with a few hundred megabytes of memory still available...

        Simone Giannecchini added a comment -

        Can you tell me a bit about the OS and the java options you are using?

        Andrea Aime added a comment -

        Ubuntu 9.04, 32 bit, 2 cores, 3GB memory, no java options, thus it configures itself to run in server mode, with parallel collector, and access to ... half of the machine memory? Not sure.
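
        Rather than guessing what the JVM picked for itself when started without options, the values can be read from inside the running process. The following is a minimal sketch using only standard java.lang and java.lang.management calls; the class name JvmDefaultsSketch and the output format are assumptions made for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: prints the settings the JVM chose for itself.
public class JvmDefaultsSketch {

    /** Summarises architecture, VM flavour, CPU count, and maximum heap of the current JVM. */
    public static String describe() {
        Runtime rt = Runtime.getRuntime();
        long maxHeapMb = rt.maxMemory() / (1024L * 1024L);
        return "os.arch=" + System.getProperty("os.arch")
                + ", vm=" + System.getProperty("java.vm.name")
                + ", processors=" + rt.availableProcessors()
                + ", maxHeapMB=" + maxHeapMb;
    }

    public static void main(String[] args) {
        System.out.println(describe());
        // Collector names reveal which garbage collector is in use.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("collector: " + gc.getName());
        }
    }
}
```

        Note that the heap limit reported here covers only the Java heap. Memory allocated by native libraries comes on top of it, so on a 32 bit machine a larger heap leaves less room for native allocations.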

        Andrea Aime added a comment -

        Mind that the problem can be reproduced only when the machine's memory usage is high, close to a swap storm.

        Andrea Aime added a comment -

        This is apparently related to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6463535
        Sun bug... this is bad, they are notoriously slow at fixing bugs...

        Andrea Aime added a comment -

        This issue is reproducible on other machines and breaks the benchmarking tests we're doing for FOSS4G. Simone and Daniele are working on a patch (thanks a ton, guys).

        Andrea Aime added a comment -

        Tests show that imageio-ext 1.0.4 solves at least the part of the problem found in the imageio-ext component (to actually survive a few benchmark runs one also has to disable JAI and the JAI Image I/O JPEG writer).
        When is imageio-ext 1.0.4 coming out?

        Andrea Aime added a comment -

        Upped the imageio-ext dependency to 1.0.4. Some issues are still there, but they seem to be out of our control.
        Further investigation will be needed in the future.


          People

          • Assignee: Daniele Romagnoli
          • Reporter: Andrea Aime
          • Votes: 0
          • Watchers: 0