Key: JRUBY-2048
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Charles Oliver Nutter
Reporter: Charles Oliver Nutter
Votes: 0
Watchers: 1

JRuby

Call site caches may not be thread safe

Created: 29/Jan/08 01:01 PM   Updated: 06/Oct/08 10:14 PM
Component/s: Compiler
Affects Version/s: None
Fix Version/s: JRuby 1.1.5

Time Tracking:
Not Specified


Description
The call site cache we currently have, InlineCachedCallSite, may not be thread-safe. In general, it will not break in any catastrophic way. However, one failure case is possible: two threads call against the same site at the same time with the same type, a third thread changes that type while those calls are in progress, and the last thread to update the call site overwrites it with old information.

Or a simpler case that springs to mind: while a thread is in the call site, the type in question changes; because the call site has not yet registered itself with the type it does not get flushed out by the change, and so we end up with a call site caching the old method forever (or until the next rebinding of that name).
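The simpler case can be replayed deterministically on a single thread by splitting a call into its two steps. This is only a sketch of the interleaving described above, not JRuby's actual code: `PushType`, `PushSite`, and the split into `lookup` and `store` are hypothetical names chosen for illustration, and the real InlineCachedCallSite may be structured differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaleCacheReplay {
    /** Hypothetical type that pushes flushes to the call sites registered with it. */
    static final class PushType {
        final Map<String, String> methods = new HashMap<>();
        final List<PushSite> registered = new ArrayList<>();

        void define(String name, String body) {
            methods.put(name, body);
            // Push-style invalidation: only sites that have already registered get flushed.
            for (PushSite site : registered) {
                site.flush();
            }
            registered.clear();
        }
    }

    /** Hypothetical call site whose call is split into two steps so the race can be replayed. */
    static final class PushSite {
        String cached;

        void flush() { cached = null; }

        // Step 1 of a call: look the method up on the type.
        String lookup(PushType type, String name) { return type.methods.get(name); }

        // Step 2 of a call: cache the result and register with the type for future flushes.
        void store(PushType type, String method) {
            cached = method;
            type.registered.add(this);
        }
    }

    /** Replays the interleaving; returns what the site ends up caching. */
    static String replay() {
        PushType type = new PushType();
        type.define("greet", "old");
        PushSite site = new PushSite();

        String found = site.lookup(type, "greet"); // "thread A" is inside the call site
        type.define("greet", "new");               // "thread B" changes the type; nothing is registered yet
        site.store(type, found);                   // "thread A" caches the old method and registers too late
        return site.cached;
    }

    public static void main(String[] args) {
        System.out.println("cached after race: " + replay());
    }
}
```

In this replay the site keeps the old method until `greet` is rebound again, which matches the "until the next rebinding of that name" caveat.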

This may indicate a key flaw in the push logic for changes, requiring either synchronization as part of cache updating or requiring more guards in the call sites themselves (for example, guards to flush the call site if a class serial number has changed since the last cache).
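The second option, a guard on a class serial number, might look roughly like the following. It is a minimal sketch under stated assumptions: `SketchClass`, `Entry`, and `GuardedCallSite` are hypothetical stand-ins, methods are represented as plain strings, and nothing here is taken from JRuby's source. The site reads the serial before the lookup, so a change that lands mid-call leaves an entry whose serial no longer matches and is discarded on the next call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class SerialGuardSketch {
    /** Hypothetical stand-in for a Ruby class: a method table plus a serial number. */
    static final class SketchClass {
        private final Map<String, String> methods = new HashMap<>();
        private final AtomicInteger serial = new AtomicInteger();

        synchronized void define(String name, String body) {
            methods.put(name, body);
            serial.incrementAndGet(); // every change to the class bumps its serial
        }

        synchronized String lookup(String name) { return methods.get(name); }

        int serial() { return serial.get(); }
    }

    /** Immutable entry, so a reader never sees a method paired with the wrong serial. */
    static final class Entry {
        final SketchClass type;
        final int serial;
        final String method;

        Entry(SketchClass type, int serial, String method) {
            this.type = type;
            this.serial = serial;
            this.method = method;
        }
    }

    static final class GuardedCallSite {
        private final String name;
        private volatile Entry cache;

        GuardedCallSite(String name) { this.name = name; }

        String call(SketchClass type) {
            Entry entry = cache;
            // Guard: the cached entry is valid only if the class serial is unchanged.
            if (entry != null && entry.type == type && entry.serial == type.serial()) {
                return entry.method;
            }
            int seen = type.serial();          // read the serial BEFORE the lookup
            String method = type.lookup(name); // slow path
            cache = new Entry(type, seen, method);
            return method;
        }
    }

    public static void main(String[] args) {
        SketchClass type = new SketchClass();
        type.define("greet", "old");
        GuardedCallSite site = new GuardedCallSite("greet");
        System.out.println(site.call(type));
        type.define("greet", "new");
        System.out.println(site.call(type));
    }
}
```

With a guard like this the call site pulls the invalidation itself, so it does not matter whether it had registered with the type before the change happened.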



Comments
Charles Oliver Nutter - 15/Feb/08 12:57 PM
Punting issues from 1.1 RC2 to 1.1 final.

Charles Oliver Nutter - 17/Mar/08 12:02 PM
We've still never seen a case of this in the wild, and since the impact is pretty low and pretty hard to cause, I'm punting to post 1.1. We'll look at a bigger cleanup of call site logic post 1.1.

Charles Oliver Nutter - 01/May/08 07:43 PM
A fix for a similar problem was done recently, where multiple threads calling against the same site would step on each other if the site was polymorphic. But this issue remains.

Charles Oliver Nutter - 23/May/08 04:21 AM
Since the only likely fix for this is to move from our current cache-flushing mechanism to one involving class serial numbers, and Tom's first attempt at that proved to be too invasive and complicated for 1.1.2, punting this to 1.1+. If anyone sees an issue that appears to be related to this potential thread unsafety, we will bump the priority up to blocker for the next release.

Charles Oliver Nutter - 06/Oct/08 10:14 PM
This is now fixed. I put in place two things to solve this:
  • A generation/serial number on every class instance, which propagates changes down when any change occurs
  • Call sites now use that generation/serial number to invalidate the caches

There was a slight performance degradation for monomorphic (cacheable) calls, but ultimately a large improvement for polymorphic calls and other lookup cases like respond_to?.
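The two parts of the fix can be sketched together as follows. This is an illustration only, assuming a simple single-inheritance hierarchy guarded by one lock; `SketchClass`, `CallSite`, and `HIERARCHY_LOCK` are hypothetical names, and the actual JRuby 1.1.5 implementation is not reproduced here. A change to a class bumps its generation and the generation of every descendant, and a call site treats its cache as valid only while the receiver class's generation is unchanged.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GenerationSketch {
    // One coarse lock for all hierarchy changes (an assumption made to keep the sketch short).
    private static final Object HIERARCHY_LOCK = new Object();

    static final class SketchClass {
        private final SketchClass superclass;
        private final List<SketchClass> subclasses = new ArrayList<>();
        private final Map<String, String> methods = new HashMap<>();
        private volatile int generation;

        SketchClass(SketchClass superclass) {
            this.superclass = superclass;
            if (superclass != null) {
                synchronized (HIERARCHY_LOCK) { superclass.subclasses.add(this); }
            }
        }

        void define(String name, String body) {
            synchronized (HIERARCHY_LOCK) {
                methods.put(name, body);
                bumpGeneration(); // propagate the change down the hierarchy
            }
        }

        private void bumpGeneration() {
            generation++; // only written while holding HIERARCHY_LOCK
            for (SketchClass sub : subclasses) {
                sub.bumpGeneration();
            }
        }

        String lookup(String name) {
            synchronized (HIERARCHY_LOCK) {
                for (SketchClass c = this; c != null; c = c.superclass) {
                    String method = c.methods.get(name);
                    if (method != null) return method;
                }
                return null;
            }
        }

        int generation() { return generation; }
    }

    static final class CallSite {
        private static final class Entry {
            final SketchClass type;
            final int generation;
            final String method;

            Entry(SketchClass type, int generation, String method) {
                this.type = type;
                this.generation = generation;
                this.method = method;
            }
        }

        private final String name;
        private volatile Entry cache;

        CallSite(String name) { this.name = name; }

        String call(SketchClass type) {
            Entry entry = cache;
            if (entry != null && entry.type == type && entry.generation == type.generation()) {
                return entry.method; // fast path: generation unchanged
            }
            int seen = type.generation(); // read before the lookup so a racing change is detected later
            String method = type.lookup(name);
            cache = new Entry(type, seen, method);
            return method;
        }
    }
}
```

A usage example under the same assumptions: create a base class and a subclass, define `to_s` on the base, and call through a `CallSite` with the subclass as receiver. Redefining `to_s` on the base bumps the subclass's generation, so the next call through the site misses the cache and finds the new method.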