JRuby

Java List to Ruby Array conversion ~100% slower than primitive array conversion

Details

  • Type: Improvement Improvement
  • Status: Closed Closed
  • Priority: Major Major
  • Resolution: Fixed
  • Affects Version/s: JRuby 1.3.1, JRuby 1.4
  • Fix Version/s: JRuby 1.4
  • Component/s: Performance
  • Labels:
    None
  • Environment:
    MacOSX 10.5.8
    java version "1.6.0_13"
    Java(TM) SE Runtime Environment (build 1.6.0_13-b03-211)
    Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02-83, mixed mode)
  • Testcase included:
    yes
  • Number of attachments :
    1

Description

On both jRuby 1.3.1 and jRuby 1.4 (edge), I found that converting Java Lists to Ruby Array actually performs worse than converting the Java List to a Java Array to a Ruby Array. Performance of the 2-step conversion is up to 50% faster.

Here are the results

$ jruby -v      
jruby 1.4.0dev (ruby 1.8.7p174) (2009-08-23 6f772bf) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_13) [x86_64-java]
$ jruby --server ./bench.rb 
Rehearsal -----------------------------------------------------
        list.to_a   2.996000   0.000000   2.996000 (  2.841000)
list.toArray.to_a   1.821000   0.000000   1.821000 (  1.821000)
       array.to_a   0.772000   0.000000   0.772000 (  0.772000)
-------------------------------------------- total: 5.589000sec

                        user     system      total        real
        list.to_a   1.875000   0.000000   1.875000 (  1.875000)
list.toArray.to_a   0.916000   0.000000   0.916000 (  0.916000)
       array.to_a   0.760000   0.000000   0.760000 (  0.760000)

Knowing this was really helpful in getting my Google App Engine application to stop hitting the request time limit when loading arrays from the Datastore

Activity

Hide
Nick Sieger added a comment -

Thanks for the find and the suggestion. I committed a fix in c915adf, please have a look.

Show
Nick Sieger added a comment - Thanks for the find and the suggestion. I committed a fix in c915adf, please have a look.
Hide
Michael Rykov added a comment -

Yep, that works. I was trying to find the actual algorithm that converts a Collection to a Ruby Array, but couldn't - maybe that part is inefficient for other types of conversions too.

New results:

Rehearsal -----------------------------------------------------
        list.to_a   2.296000   0.000000   2.296000 (  2.157000)
list.toArray.to_a   1.038000   0.000000   1.038000 (  1.038000)
       array.to_a   0.968000   0.000000   0.968000 (  0.968000)
-------------------------------------------- total: 4.302000sec

                        user     system      total        real
        list.to_a   0.909000   0.000000   0.909000 (  0.909000)
list.toArray.to_a   0.943000   0.000000   0.943000 (  0.943000)
       array.to_a   0.729000   0.000000   0.729000 (  0.729000)
Show
Michael Rykov added a comment - Yep, that works. I was trying to find the actual algorithm that converts a Collection to a Ruby Array, but couldn't - maybe that part is inefficient for other types of conversions too. New results:
Rehearsal -----------------------------------------------------
        list.to_a   2.296000   0.000000   2.296000 (  2.157000)
list.toArray.to_a   1.038000   0.000000   1.038000 (  1.038000)
       array.to_a   0.968000   0.000000   0.968000 (  0.968000)
-------------------------------------------- total: 4.302000sec

                        user     system      total        real
        list.to_a   0.909000   0.000000   0.909000 (  0.909000)
list.toArray.to_a   0.943000   0.000000   0.943000 (  0.943000)
       array.to_a   0.729000   0.000000   0.729000 (  0.729000)
Hide
Charles Oliver Nutter added a comment -

I wanted to do a little more investigation to figure out exactly why it's slow. It turns out that we never had a custom to_a impl for Collection, instead falling back on Enumerable's to_a which basically creates an array and then collects elements from eaching. So it was slower for a couple reasons:

  • "each" is implemented in Ruby by walking java.util.Iterator hasNext/next. So there's two JI calls for every loop, not to mention that it's implemented in Ruby code
  • The block passed to each is not exactly optimal, creating an IRubyObject[] to wrap every element argument from each
  • It grows the array from its initial default size, rather than starting with an exactly-sized array

The modified logic ends up simplifying things because it first turns it into a native array, and then the native array's to_a logic is all implemented in Java and fairly optimal (I made some major improvements recently.

This is something to keep in mind since we still fall back on suboptimal default Enumerable implementations for many methods that could be improved...and I will create a story for that.

Show
Charles Oliver Nutter added a comment - I wanted to do a little more investigation to figure out exactly why it's slow. It turns out that we never had a custom to_a impl for Collection, instead falling back on Enumerable's to_a which basically creates an array and then collects elements from eaching. So it was slower for a couple reasons:
  • "each" is implemented in Ruby by walking java.util.Iterator hasNext/next. So there's two JI calls for every loop, not to mention that it's implemented in Ruby code
  • The block passed to each is not exactly optimal, creating an IRubyObject[] to wrap every element argument from each
  • It grows the array from its initial default size, rather than starting with an exactly-sized array
The modified logic ends up simplifying things because it first turns it into a native array, and then the native array's to_a logic is all implemented in Java and fairly optimal (I made some major improvements recently. This is something to keep in mind since we still fall back on suboptimal default Enumerable implementations for many methods that could be improved...and I will create a story for that.

People

Vote (0)
Watch (0)

Dates

  • Created:
    Updated:
    Resolved: