JRuby (please use github issues at http://bugs.jruby.org)
JRUBY-1955

Lucene index search causes major memory bloat + slowdown in 1.1RC1, but not 1.1b1

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: JRuby 1.1RC1
    • Fix Version/s: JRuby 1.1RC2
    • Component/s: Java Integration
    • Labels:
      None
    • Environment:
      Mac OS X 10.5.1, Java 1.5.0_13, Lucene 2.2.0
    • Testcase included:
      yes
    • Number of attachments:
      5

      Description

      JRuby script demonstrating the problem attached. Lucene 2.2.0 JAR also attached.

      1. exploder2.rb
        3 kB
        Bill Dortch
      2. lucene_exploder
        2 kB
        Scott Davies
      3. JRuby1.1.3-Memory.jpg
        75 kB
      4. JRuby1.1RC1-Memory.jpg
        79 kB

        Issue Links

          Activity

          Bill Dortch added a comment -

          I'm looking into this as a possible JI (ObjectProxyCache) issue – will assign to myself if that turns out to be the case. Until then, other causes should be considered as well.

          Vladimir Sizikov added a comment -

          This does look like an ObjectProxyCache issue indeed. I've been profiling the provided example, and I see lots of hard references to ObjectProxyCache$SoftEntryRef.

          E.g.:

          Java Local Reference (from java.lang.Thread@0x7a3b5c78) :
          --> org.apache.lucene.index.CompoundFileReader$CSIndexInput@0x788b6340 (56 bytes) (??:)
          --> class org.apache.lucene.index.CompoundFileReader$CSIndexInput (84 bytes) (??:)
          --> org.jruby.util.JRubyClassLoader@0x7a3cd9f8 (63 bytes) (field parent:)
          --> sun.misc.Launcher$AppClassLoader@0x7a3b5a90 (63 bytes) (field classes:)
          --> java.util.Vector@0x7a3ba2d8 (24 bytes) (field elementData:)
          --> [Ljava.lang.Object;@0x7a41c690 (10248 bytes) (Element 171 of [Ljava.lang.Object;@0x7a41c690:)
          --> class org.jruby.internal.runtime.methods.UndefinedMethod (84 bytes) (static field instance:)
          --> org.jruby.internal.runtime.methods.UndefinedMethod@0x7a3d9148 (20 bytes) (field implementationClass:)
          --> org.jruby.MetaClass@0x7a5d4758 (112 bytes) (field runtime:)
          --> org.jruby.Ruby@0x7a3c2c98 (440 bytes) (field javaSupport:)
          --> org.jruby.javasupport.JavaSupport@0x7a3cdd80 (72 bytes) (field objectProxyCache:)
          --> org.jruby.javasupport.JavaSupport$1@0x7a3ceb00 (36 bytes) (field segments:)
          --> [Lorg.jruby.javasupport.util.ObjectProxyCache$Segment;@0x7a3cf618 (72 bytes) (Element 9 of [Lorg.jruby.javasupport.util.ObjectProxyCache$Segment;@0x7a3cf618:)
          --> org.jruby.javasupport.util.ObjectProxyCache$Segment@0x7a3cfd48 (32 bytes) (field entryTable:)
          --> [Lorg.jruby.javasupport.util.ObjectProxyCache$Entry;@0x7e5b6fa0 (8200 bytes) (Element 7 of [Lorg.jruby.javasupport.util.ObjectProxyCache$Entry;@0x7e5b6fa0:)
          --> org.jruby.javasupport.util.ObjectProxyCache$Entry@0x7eeacc88 (24 bytes) (field proxyRef:)
          --> org.jruby.javasupport.util.ObjectProxyCache$SoftEntryRef@0x7eec6990 (36 bytes) 
          

          And here's the most numerous instances list:

          86309 instances of class org.apache.lucene.index.CompoundFileReader$CSIndexInput
          64209 instances of class [B
          63756 instances of class [C
          50785 instances of class java.lang.String
          28901 instances of class org.apache.lucene.index.TermInfo
          28712 instances of class org.jruby.javasupport.util.ObjectProxyCache$SoftEntryRef 
          
          Bill Dortch added a comment -

          After a lot of testing, here's what I've determined:

          • ObjectProxyCache is, indeed, the source of memory "bloat", but this does not have a serious detrimental impact on performance.
          • Changing ObjectProxyCache to use weak rather than soft references does eliminate the bloat, but does not alter run time significantly. (However, doing so does severely limit the effectiveness of the cache.)
          • The "temporary" files created in temp_lucene_text_index do have a significant impact on performance run-over-run if they are not cleared in between, but that impact is the same in 1.1b1 as in 1.1RC1.

          That last point may be key (and was arrived at the hard way – I actually thought I might have a processor going out on my new computer, as the times just got slower and slower.)

          I used a distilled version of the 'exploder' script (attaching in a moment) that eliminates some extraneous noise. Below are results obtained from 250,000 calls (2500 outer loops x 100 inner loops), with temp files cleared between runs. Results are for 1.1b1, RC1 (OPC/soft refs, released version), RC1 (OPC/weak refs), and RC1 (OPC disabled). You'll note that, though OPC does impose some overhead, it isn't terrible, and it still gets through some 1500 create/query/close cycles per second (on my shiny new, and apparently not yet broken, machine; YMMV). The "low avg" is the average of the first 10,000 calls, per 100 calls, while the "high avg" is the average of the last 10,000 calls, per 100 calls. Note that though performance does degrade a bit over time with the soft references, it's still not bad from the 240,000th call to the end:

          1.1b1:   
            250000 calls, time: 151.281, low avg (per 100): 0.063129, high avg (per 100): 0.060619
          
          1.1RC1 (OPC soft refs, released version)
            250000 calls, time: 166.890, low avg (per 100): 0.063119, high avg (per 100): 0.082649
          
          1.1RC1 (OPC weak refs)
            250000 calls, time: 160.281, low avg (per 100): 0.066089, high avg (per 100): 0.063589
          
          1.1RC1 (OPC disabled)
            250000 calls, time: 153.969, low avg (per 100): 0.063589, high avg (per 100): 0.061249
          

          So: at this point, I don't think the ObjectProxyCache is the source of this major slowdown. As noted above, the run times do increase significantly (50-100%) run over run, if the temp files are not cleared in between, but that is true under 1.1b1 as well as 1.1RC1. Scott: is it possible this is actually what you're seeing? I don't know anything about the application or how the files are used, or whether they are supposed to persist. Please give this a try with my version of the script and let us know if I'm measuring the right behavior.

          -Bill
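          [Editor's note: the "low avg" / "high avg" figures above are averages of per-call times over the first and last 10,000 calls, scaled to blocks of 100 calls. A minimal sketch of that computation follows; the class and method names are invented for illustration and are not taken from the actual exploder2 script.]

```java
import java.util.Arrays;

// Sketch of the "low avg" / "high avg" computation, assuming per-call
// durations are recorded in seconds. All names here are hypothetical.
public class CallTimings {
    // Mean time of a block of 100 consecutive calls within durations[from, to)
    static double avgPer100(double[] durations, int from, int to) {
        double sum = 0;
        for (int i = from; i < to; i++) {
            sum += durations[i];
        }
        return sum / (to - from) * 100;
    }

    public static void main(String[] args) {
        int n = 250_000, window = 10_000;
        double[] d = new double[n];
        Arrays.fill(d, 0.0006); // pretend every call took 0.6 ms
        double low  = avgPer100(d, 0, window);     // first 10,000 calls
        double high = avgPer100(d, n - window, n); // last 10,000 calls
        System.out.printf("low avg (per 100): %.5f, high avg (per 100): %.5f%n", low, high);
    }
}
```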

          Bill Dortch added a comment -

          Adding distilled exploder2 script.

          Charles Oliver Nutter added a comment -

          So where do we stand? This is a blocker and a half for RC2, and we'd like to push out RC2 this week.

          Charles Oliver Nutter added a comment -

          Oops, I hadn't updated this page in a day or so...catching up now...

          Charles Oliver Nutter added a comment -

          Ok, caught up.

          I feel your pain Bill...the lack of something between weak and soft is an unfortunate complication. We've had the same discussion relating to other caches.

          It seems that in many places we need a cache that can automatically age entries out over time, using soft references until then so that if they become unreachable they'll go away anyway. Perhaps something that periodically replaces soft entries with weak ones? Something like your vulture thread, but smarter?

          Unfortunately I know of no such existing data structure in the JDK, and I'm not sure I'm up for writing one at the moment. But do you think this would help our situation?

          A couple questions:

          • What are the soft refs referencing? If it's the actual Java objects, wouldn't weak refs be fine, since they'd still only be collected once the original objects go away?
          • If not, can't this be implemented that way? If the original objects go away we should let the proxies go away as soon as possible.

          So perhaps a mix of strategies: if the original object goes away, it should trigger an immediate dereference of the proxy. Until then...soft?

          I think the temp files could be related to this issue, but since the issue also discusses the fact that memory increases to the maximum allowed we should also try to remedy that. Just calling into Java integration a lot should not cause the heap to grow to maximum.
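          [Editor's note: the "periodically demote soft entries to weak" idea described above could look roughly like the following sketch. This is hypothetical code, not JRuby's actual ObjectProxyCache; AgingProxyCache, MAX_AGE, and sweep() are all invented for illustration.]

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical aging cache: fresh entries hold soft references; entries
// that go unused for MAX_AGE sweeps are demoted to weak references, so
// the GC may reclaim them freely.
public class AgingProxyCache<K, V> {
    static final int MAX_AGE = 3; // sweeps before demotion (arbitrary)

    private static final class Entry<V> {
        Reference<V> ref;
        int age;
        Entry(V value) { this.ref = new SoftReference<>(value); }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();

    public void put(K key, V proxy) { entries.put(key, new Entry<>(proxy)); }

    public V get(K key) {
        Entry<V> e = entries.get(key);
        if (e == null) return null;
        e.age = 0; // a cache hit keeps the entry "young" (still soft)
        return e.ref.get();
    }

    // Called periodically, e.g. from a background ("vulture") thread.
    public void sweep() {
        for (Entry<V> e : entries.values()) {
            V v = e.ref.get();
            if (v == null) continue; // already collected by the GC
            if (++e.age >= MAX_AGE && e.ref instanceof SoftReference) {
                e.ref = new WeakReference<>(v); // demote: soft -> weak
            }
        }
    }

    public boolean isWeak(K key) {
        Entry<V> e = entries.get(key);
        return e != null && e.ref instanceof WeakReference;
    }

    public static void main(String[] args) {
        AgingProxyCache<String, Object> cache = new AgingProxyCache<>();
        Object proxy = new Object(); // stand-in for a Ruby-side proxy
        cache.put("key", proxy);
        cache.sweep();
        cache.sweep();
        cache.sweep(); // third idle sweep demotes the entry
        System.out.println("weak now: " + cache.isWeak("key"));
    }
}
```

          Since get() resets the age, hot proxies stay softly reachable while idle ones become weakly reachable and die with their Java objects.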

          Vladimir Sizikov added a comment -

          I'm actually seeing quite noticeable performance impact between JRuby 1.0.3 and JRuby 1.1RC1, with the exploder2.rb script by Bill.
          Also, if you enable GC logging, you'll see that the GC works MUCH harder with JRuby 1.1RC1, with long full GC cycles that occur much more often.

          On my Vista PC, the time difference between 1.0.3 and 1.1RC1 can be up to 100% (JRuby 1.1RC1 is slower), but is typically in the range of 50%.

          Setting -Xmx to a lower limit shows this clearly:

          D:\work\jruby-dev>jruby-1.0.3\bin\jruby -J-Xmx16m exploder2.rb
          starting loops
          50000 calls, time: 55.432, low avg (per 100): 0.11227999999999999, high avg (per 100): 0.11087000000000001
          
          D:\work\jruby-dev>jruby -J-Xmx16m exploder2.rb
          starting loops
          50000 calls, time: 81.622, low avg (per 100): 0.16287000000000001, high avg (per 100): 0.16232
          
          D:\work\jruby-dev>jruby-1.0.3\bin\jruby -J-Xmx10m exploder2.rb
          starting loops
          50000 calls, time: 90.44200000000001, low avg (per 100): 0.17804000000000006, high avg (per 100): 0.18038000000000004
          
          D:\work\jruby-dev>jruby -J-Xmx10m exploder2.rb
          starting loops
          50000 calls, time: 132.912, low avg (per 100): 0.2650699999999999, high avg (per 100): 0.26774
          

          Looking at the memory graphs in JConsole, I see that 1.1RC1 quickly consumes all available 400 MB of heap, while JRuby 1.0.3 stays within 20 MB.
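          [Editor's note: the heap figures above were read off JConsole; the same numbers can be sampled in-process through the standard java.lang.management API. This is a generic illustration, not part of the original test case.]

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Samples current heap usage via the standard MemoryMXBean, the same
// data source JConsole charts.
public class HeapSample {
    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println("heap used (MB): " + heap.getUsed() / (1024 * 1024));
        System.out.println("heap max  (MB): " + heap.getMax() / (1024 * 1024));
    }
}
```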

          Vladimir Sizikov added a comment -

          Pretty pictures for those who are into these things.

          Please note the huge difference in "GC Time" results.

          Also of interest: JRuby 1.1RC1 easily reached 1 GB of heap, while JRuby 1.0.3 stays within 100 MB or so.

          Vladimir Sizikov added a comment -

          Memory graph for JRuby 1.1RC1 running the exploder2.rb script. This run was about 50% slower than JRuby 1.0.3, and GC time shows almost a minute!

          Bill Dortch added a comment -

          Committed switch to OPC weak references in 5606. This limits the effectiveness of the cache; see forthcoming note at JRUBY-199. Leaving this issue open pending verification that switch to weak refs alone solves the problem.

          Vladimir Sizikov added a comment -

          I can confirm that after Bill's latest change, memory consumption is back to normal, and memory is no longer constantly growing.

          Also, exploder2 runs with the latest JRuby are now faster than with JRuby 1.0.3.

          Charles Oliver Nutter added a comment -

          Fixed by Bill by making the proxy cache use weak references. I'll file a separate bug to address the efficacy of the cache.

          Charles Oliver Nutter added a comment -

          Linking to the cache efficacy bug.


            People

            • Assignee:
              Charles Oliver Nutter
              Reporter:
              Scott Davies
             • Votes:
               0
               Watchers:
               2

              Dates

              • Created:
                Updated:
                Resolved: