History | Log In     View a printable version of the current page.  
Issue Details (XML | Word | Printable)

Key: LINGO-37
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Unassigned
Reporter: Gray
Votes: 0
Watchers: 2
Operations

If you were logged in you would be able to see more operations.
Lingo

Race condition with ResultJoinStrategy causes premature result returns

Created: 13/Oct/06 12:20 PM   Updated: 01/Nov/06 06:25 PM
Component/s: None
Affects Version/s: 1.2.1
Fix Version/s: 1.3

Time Tracking:
Not Specified

File Attachments: 1. Text File lingo.patch (5 kb)
2. Text File lingo.patch (5 kb)
3. File ResultJoinHandler.java.diff (2 kb)
4. Text File ResultJoinHandler.java.patch (1 kb)

Issue Links:
Related
 


 Description  « Hide
There is a subtle race condition in JMSClientInterceptor:

ResultJoinHandler handler = createResultJoinHandler(methodInvocation, metadata);
// send request to topic
requestor.request(destination, requestMessage, handler, getMultipleResponseTimeout());
// race to here if the response comes back fast
RemoteInvocationResult result = handler.waitForResult();
return recreateRemoteInvocationResult(result);

due to a bug in the ResultJoinHandler methods:

public RemoteInvocationResult waitForResult() {
while (true) {
synchronized (lock) {
if (result != null) { return result; }

If the onMessage result could have been already called before we call waitForResult and so result could be != null but the unblockCallerThread could still return false.

I've attached a patch which fixes the problem. Few of the unit tests work for me because of outside dependencies so I'm not sure how to best test this issue.



 All   Comments   Work Log   Change History      Sort Order: Ascending order - Click to sort in descending order
Gray - 13/Oct/06 12:26 PM
I suspect that my patch may fix LINGO-33 as well.

Gray - 13/Oct/06 03:18 PM
Whoops. We may need to synchronize around the boolean. Not sure synchronizing on the lock is enough.

Robert Newson - 26/Oct/06 08:50 PM
I don't think the synchronization code in the class is being tested. If you comment out the notifyAll line, the test suite still passes. That's wrong, right?

Gray - 26/Oct/06 09:08 PM
Better version of the patch which reenables the timeout even if we didn't get all of the results. Also added a log message to track when this occurs.

Gray - 26/Oct/06 09:28 PM
Just uploaded a better patch which fixed some issues with the previous one – especially that it invalidated the timeout.

Gray - 27/Oct/06 12:30 AM
Yet another patch with a better solution. This fixes the original race condition but also adds an unblockAfterTimeout() method to the ResultJoinStrategy interface so the strategy can define the timeout and when to exit with no or partial results better.

james strachan - 27/Oct/06 11:06 AM
Patch applied - many thanks!

Gray - 27/Oct/06 11:39 AM
Okay. So this fixes a small problem with the previous patch. Can someone please remove all of the other patches. A couple are just wrong.

james strachan - 27/Oct/06 12:20 PM
Gray - its getting confusing with 4 patches on the go Could you create another patch off the current SVN trunk so I know which parts are changing after applying the previous patch?

james strachan - 27/Oct/06 12:20 PM
BTW as seen as 1.3 has been released, we should probably reopen another JIRA for turther patches

Gray - 01/Nov/06 06:25 PM
Sorry for the multiple patches. I would have deleted the old ones if I could. If it was going right into 1.3 then I would have reviewed it better.