JRuby

Improve FileStat performance using a java-heap allocated structure

Details

  • Type: Improvement
  • Status: Closed
  • Priority: Trivial
  • Resolution: Fixed
  • Affects Version/s: JRuby 1.1RC2
  • Fix Version/s: JRuby 1.1.1
  • Component/s: Extensions
  • Labels:
    None
  • Environment:
    MacOS
  • Patch Submitted:
    Yes
  • Number of attachments :
    3

Description

Changing the way native structures are allocated, read, and written can boost FileStat performance considerably: by about 300% on MacOS 10.4 with Java 1.5.

The attached patch is a cut-down, quick-and-dirty hack of some jffi code that uses heap-allocated structures instead of structures allocated in native memory, for simple cases where that will work OK (e.g. stat(2), fstat(2)).

It's really only a proof of concept and needs cleaning up by someone, but I thought it was worth filing an issue in case someone has time to look at it.
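
To illustrate the idea (this is not the code in the attached patch), here is a minimal sketch of a struct whose storage is a plain byte[] on the Java heap, read through a ByteBuffer at fixed offsets. The class name HeapStatSketch, the two-field layout, the offsets, and the fakeStat stand-in for the native call are all assumptions made for the example; a real stat structure has many more fields and platform-specific offsets.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical heap-backed view of a tiny C struct { int32_t mode; int64_t size; }. */
final class HeapStatSketch {
    // Assumed offsets with natural alignment: the 8-byte field starts at offset 8.
    static final int MODE_OFFSET = 0;
    static final int SIZE_OFFSET = 8;
    static final int STRUCT_SIZE = 16;

    // Storage is an ordinary byte[] on the Java heap, not native (off-heap) memory.
    private final ByteBuffer buf =
            ByteBuffer.wrap(new byte[STRUCT_SIZE]).order(ByteOrder.nativeOrder());

    byte[] backingArray() { return buf.array(); }

    int mode() { return buf.getInt(MODE_OFFSET); }

    long size() { return buf.getLong(SIZE_OFFSET); }

    // Stand-in for the native call: a real binding would pass the array to stat(2).
    static void fakeStat(byte[] out) {
        ByteBuffer b = ByteBuffer.wrap(out).order(ByteOrder.nativeOrder());
        b.putInt(MODE_OFFSET, 0100644);   // octal literal: regular file, rw-r--r--
        b.putLong(SIZE_OFFSET, 4096L);
    }

    public static void main(String[] args) {
        HeapStatSketch st = new HeapStatSketch();
        fakeStat(st.backingArray());
        System.out.println(Integer.toOctalString(st.mode()) + " " + st.size());
    }
}
```

Each getter reads only the bytes of its own field, so nothing is copied field by field between Java objects and native memory around the call.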

  1. HeapStruct+array.patch
    12/Apr/08 12:10 PM
    2 kB
    Wayne Meissner
  2. jruby+HeapStruct.patch
    21/Mar/08 3:24 AM
    17 kB
    Wayne Meissner
  3. unsigned.patch
    14/Apr/08 10:59 PM
    6 kB
    Wayne Meissner

Activity

Charles Oliver Nutter added a comment -

This looks very promising, for this and for any other structs we might need to work with. And we certainly could use perf improvements for stat.

Thomas E Enebo added a comment -

Wow, 300% is great. Can you specify which issues may remain with this patch that may need changing? Also, BaseHeapFileStat is not in the patch.

Wayne Meissner added a comment -

Updated version

Wayne Meissner added a comment -

Not sure what may need changing. It's mainly that it hasn't been tested much at all; it was just a backport of some stuff I was using to bench jffi vs jna.

A few things come to mind:
1) MacOS/ppc member alignment might be broken.
2) SPARC member alignment might be broken.
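
For context on why member alignment can break per platform, here is a minimal sketch of how member offsets are typically computed: each member is placed at the next offset that is a multiple of its alignment. The class AlignSketch and the maxAlign parameter are hypothetical, and the sketch assumes member sizes are powers of two; the thread does not state the actual alignment rules for MacOS/ppc or SPARC.

```java
import java.util.Arrays;

final class AlignSketch {
    /** Rounds offset up to the next multiple of alignment (must be a power of two). */
    static int align(int offset, int alignment) {
        return (offset + alignment - 1) & ~(alignment - 1);
    }

    /** Computes member offsets, aligning each member to min(its size, maxAlign). */
    static int[] offsets(int[] sizes, int maxAlign) {
        int[] result = new int[sizes.length];
        int cursor = 0;
        for (int i = 0; i < sizes.length; i++) {
            int alignment = Math.min(sizes[i], maxAlign);
            cursor = align(cursor, alignment);  // insert padding if needed
            result[i] = cursor;
            cursor += sizes[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] sizes = {4, 8, 2};  // e.g. an int32, an int64, an int16
        System.out.println(Arrays.toString(offsets(sizes, 8)));
        System.out.println(Arrays.toString(offsets(sizes, 4)));
    }
}
```

The same member list yields different offsets when the maximum alignment changes, which is why a layout that works on one ABI can read the wrong bytes on another.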

I'd like to see if anyone else can reproduce the benchmark results too. The delta does not match what I was getting with non-jruby benchmarks.

Stock JRuby (1.1RC2), 10k File.stat(file):
    user     system      total        real
2.444000   0.000000   2.444000 (  2.444000)
1.939000   0.000000   1.939000 (  1.939000)
1.936000   0.000000   1.936000 (  1.936000)
1.929000   0.000000   1.929000 (  1.929000)
1.942000   0.000000   1.942000 (  1.942000)
1.980000   0.000000   1.980000 (  1.980000)
1.947000   0.000000   1.947000 (  1.947000)
1.937000   0.000000   1.937000 (  1.937000)
1.936000   0.000000   1.936000 (  1.937000)
1.937000   0.000000   1.937000 (  1.937000)

JRuby with HeapStruct, 10k File.stat(file):
    user     system      total        real
0.757000   0.000000   0.757000 (  0.757000)
0.464000   0.000000   0.464000 (  0.464000)
0.457000   0.000000   0.457000 (  0.457000)
0.464000   0.000000   0.464000 (  0.464000)
0.455000   0.000000   0.455000 (  0.455000)
0.483000   0.000000   0.483000 (  0.484000)
0.458000   0.000000   0.458000 (  0.458000)
0.458000   0.000000   0.458000 (  0.458000)
0.459000   0.000000   0.459000 (  0.459000)
0.466000   0.000000   0.466000 (  0.466000)
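
As a quick check of the "about 300%" figure, the sketch below averages the totals from the two listings above and takes their ratio. Skipping the first run of each series as warm-up is an assumption made here, not something stated in the comment; the class name SpeedupSketch is hypothetical.

```java
final class SpeedupSketch {
    // Total times (seconds) copied from the two benchmark listings.
    static final double[] STOCK = {
        2.444, 1.939, 1.936, 1.929, 1.942, 1.980, 1.947, 1.937, 1.936, 1.937};
    static final double[] HEAP = {
        0.757, 0.464, 0.457, 0.464, 0.455, 0.483, 0.458, 0.458, 0.459, 0.466};

    /** Mean of xs, ignoring the first skip entries (treated as warm-up). */
    static double mean(double[] xs, int skip) {
        double sum = 0;
        for (int i = skip; i < xs.length; i++) {
            sum += xs[i];
        }
        return sum / (xs.length - skip);
    }

    /** How many times faster the candidate is than the baseline. */
    static double speedup(double[] baseline, double[] candidate, int skip) {
        return mean(baseline, skip) / mean(candidate, skip);
    }

    public static void main(String[] args) {
        double ratio = speedup(STOCK, HEAP, 1);
        System.out.println(String.format("speedup: %.2fx, or %.0f%% faster",
                ratio, (ratio - 1) * 100));
    }
}
```

With the warm-up run excluded, the ratio comes out at roughly 4.2x, which is in line with the reported improvement.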

Wayne Meissner added a comment -

A small patch to do arrays of struct members.

Thomas E Enebo added a comment -

Cool. There are many more array paddings in the Solaris heap struct. I will update it to use this new method.

Wayne Meissner added a comment -

Unsigned fields for HeapStruct.
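
Java has no unsigned integer types, so unsigned C fields are usually read into the next wider signed type and masked. The sketch below shows that general technique on a ByteBuffer; it is not taken from unsigned.patch, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class UnsignedFieldSketch {
    /** Reads an unsigned 8-bit field as an int in the range 0..255. */
    static int u8(ByteBuffer b, int offset) {
        return b.get(offset) & 0xFF;
    }

    /** Reads an unsigned 16-bit field as an int in the range 0..65535. */
    static int u16(ByteBuffer b, int offset) {
        return b.getShort(offset) & 0xFFFF;
    }

    /** Reads an unsigned 32-bit field as a long in the range 0..2^32-1. */
    static long u32(ByteBuffer b, int offset) {
        return b.getInt(offset) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        ByteBuffer b = ByteBuffer.allocate(8).order(ByteOrder.nativeOrder());
        b.putInt(0, -1);                 // all bits set: 0xFFFFFFFF
        b.putShort(4, (short) 0xFFFE);
        b.put(6, (byte) 0x80);
        System.out.println(u32(b, 0) + " " + u16(b, 4) + " " + u8(b, 6));
    }
}
```

Without the mask, a field such as a large inode number or device id would come back negative.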

Wayne Meissner added a comment -

This was implemented in jna-posix 0.5 and went out with JRuby 1.1.1.

Timothy Wall added a comment -

I think the performance has more to do with writing the dozens of fields in the stat structure on each function call rather than whether you use com.sun.jna.Memory or NIO buffers. Note that Structure.write is called for every structure argument prior to a function call, and Structure.read called afterward. This makes it easy to use structures without having to always remember to do a write/read.

Since the stat structures already wrap each field with setters and getters, you could probably get similar performance by overriding Structure.write() to do nothing (to avoid the automatic write) and calling Structure.writeField() after each field update.

In some situations, an annotation might be useful to avoid the automatic write or read. For specific performance tuning, it probably comes down to making native reads/writes for only those fields you know are changing.

Maybe the Structure instance itself could have a flag to indicate whether auto-read/write behavior is desired.
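
The cost argument above can be made concrete with a small model. The sketch below is not JNA code and does not use JNA's API: SyncSketch, its autoWrite flag, and the int[] standing in for native memory are all hypothetical. It only counts how many field copies happen when every field is written before each call versus when a single field is written at update time.

```java
final class SyncSketch {
    static final int FIELD_COUNT = 12;            // assumed size, for illustration

    final int[] fields = new int[FIELD_COUNT];      // Java-side field values
    final int[] nativeCopy = new int[FIELD_COUNT];  // stand-in for native memory
    boolean autoWrite = true;                       // hypothetical per-instance flag
    int nativeWrites = 0;                           // number of field copies performed

    /** Copies a single field to the native side. */
    void writeField(int index) {
        nativeCopy[index] = fields[index];
        nativeWrites++;
    }

    /** Copies every field to the native side. */
    void write() {
        for (int i = 0; i < FIELD_COUNT; i++) {
            writeField(i);
        }
    }

    /** Setter: with auto-write off, only the changed field is copied. */
    void set(int index, int value) {
        fields[index] = value;
        if (!autoWrite) {
            writeField(index);
        }
    }

    /** Models a function call taking this struct as an argument. */
    void call() {
        if (autoWrite) {
            write();  // automatic full write before the call
        }
        // the native function would run here
    }

    public static void main(String[] args) {
        SyncSketch auto = new SyncSketch();
        SyncSketch manual = new SyncSketch();
        manual.autoWrite = false;
        for (int i = 0; i < 1000; i++) {
            auto.set(0, i);
            auto.call();
            manual.set(0, i);
            manual.call();
        }
        System.out.println(auto.nativeWrites + " vs " + manual.nativeWrites);
    }
}
```

In this model, both variants leave the native side with the same final value for the updated field; the difference is only in how many copies were made to get there.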

