Details
-
Type:
Bug
-
Status:
Resolved
-
Priority:
Critical
-
Resolution: Fixed
-
Affects Version/s: JRuby 1.7.0.pre1
-
Fix Version/s: JRuby 1.7.1
-
Component/s: Core Classes/Modules
-
Labels:None
-
Environment:Microsoft Windows Vista, but these bugs seem to be also in JRuby 1.6.6, and I suspect they may also be in earlier versions of JRuby.
-
Number of attachments :3
Description
This deals with two possible "critical" issues in RubyFixnum.java,
and also suggests a possible improvement to related code which is working.
(As with my submission in JRUBY-6612, my apologies if I'm raising something
which you are already aware of, or if I'm looking at the wrong code,
or making another error which invalidates this bug report.
And I'd be happy to (try to!) answer any questions.)
I think there are some (critical?) bugs in
- private IRubyObject idivLong: RubyFixnum.java line 566;
- private IRubyObject divmodFixnum: RubyFixnum.java line 635;
The problems are very similar to those I raised inJRUBY-6612for: - public IRubyObject op_mul: RubyFixnum.java line 491.
Using jirb from JRUBY 1.6.6
v = 256
v *= v #=> 65536
v *= v * v * (v / -2) #=> -9223372036854775808 == Long.MIN_VALUE
v / -1 #=> -9223372036854775808
v.div(-1) #=> -9223372036854775808
v.divmod(-1) #=> [-9223372036854775808, 0]
- This next line is also a problem in JRuby 1.6.6, but following
JRUBY-6612has, I think, been fixed in JRuby 1.7.0.preview1.
-1 * v #=> -9223372036854775808 # wrong
v * -1 #=> 9223372036854775808 # correct
Using jirb from JRUBY 1.7.0.preview1 gives the same results
except that "-1 * v" now gives the correct 9223372036854775808.
The problem seems to be that some long overflows aren't being detected.
I suggest patches in the attached file which:
(a) make what I think are appropriate checks for overflows;
(b) change the code to avoid using both "/" and "%", to speed things up.
I've put this as critical because when I raised a similar problem
for "public IRubyObject op_mul" (see JRUBY-6612) I called it "major"
and it was upgraded to "critical".
-
- jruby-fixnum-bug.txt
- 14/Jul/12 10:52 PM
- 11 kB
- Colin Bartlett
-
- JRUBY-smalltest-jruby-qqq-JRuby-AWKWARD-ERROR-MESSAGE.png
- 43 kB
- 30/Oct/12 8:48 AM
-
- smalltest-jruby-qqq-JRuby-AWKWARD-ERROR-MESSAGE.png
- 43 kB
- 30/Oct/12 8:27 AM
Activity
I've had a go (as at 2012-07-22_Sunday_t_20:50 UTC) at forking, cloning, changing, pushing, then pulling, but I managed to fail somewhere between cloning and pulling, it seems at the pushing the changes back stage. So I'll put a copy of the whole of my three changes to RubyFixnum.java here:
https://gist.github.com/3161035
If that doesn't work, I can try emailing with an attachment!
If it helps, I could try to write some tests for how I think the JRuby Fixnum class code should be working for these bugs, but I'll need a link to the existing test(s) for the JRuby Fixnum class so I can copy/steal what's there and adapt it.
Confession time: about two years ago I reported a bug in passing integer arguments between JRuby and Java, which you (Charles) fixed, and asked me if I could set up some tests, with some links to some example tests. I said I'd try, but completely failed to understand how to apply the linked examples to the bug in question, and left it rather too long to reply saying I'd failed without acute embarrassment. So even more embarrassment by confessing and apologising now. But if I don't think I can apply any links to (preferably fairly relevant) example tests then on this occasion I will say so very quickly!
Another try at forking, cloning, changing, pushing, then pulling at about 2012-07-23_Monday_04:00_UTC might have worked! But I'd be very cautious if I was anyone using it in case I didn't actually do what I was trying to do. (Or didn't quite patch in what I intended to patch in.)
I've combined the "pull" for this with my suggested patch for another possible long integer overflow bug in Fixnum#step, reported as JRUBY-6790.
Mentioned on the other bug that I commented on the pull request. Make sure your local copy (with patches in place) can build and pass tests with "ant test". Otherwise, they look ok so far.
The examples given here are still an issue, though we've patched a few other places (with Colin's help).
Since these types of integer overflows are different but also related, I've been wondering whether it might be an idea to have a "bug" report for each one separately, but also one place where they are all looked at together.
For example this problem in JRuby:
n = 2**64 # creates a JRuby Bignum
rng = 0 .. n *
rng.step
works as expected in 1.9 but raises an exception in ruby 1.8.7.
The problem is in RangeError.java in
private IRubyObject stepCommon(ThreadContext context, IRubyObject step, Block block)
final Ruby runtime = context.getRuntime();
long unit = RubyNumeric.num2long(step);
I haven't checked to see if this is already the subject of a bug report, but will do so later today.
Since this raises an exception if "step" is a Bignum it's nowhere near as much of a potential problem as other integer overflow issues, and a patch would be something like the code in
private IRubyObject stepCommon19(ThreadContext context, IRubyObject step, Block block)
But there is, I think, a possibility of consolidating some of these methods with JRUBY Java 64bit integer overflow issues, and bearing in mind Charles Nutter's comment (I can't find where alas) about trying to keep bits of similar code in the same place, I was thinking of suggesting of moving all - or at any rate most - of the Java code for Fixnum "each" and "step" methods into RubyFixnum.java. That means:
RubyRange.java
private void fixnumEach(ThreadContext context, Ruby runtime, Block block)
private void fixnumStep(ThreadContext context, Ruby runtime, long unit, Block block)
RubyNumeric.java
private static void fixnumStep(ThreadContext context, Ruby runtime, long value, long end, long diff, Block block)
RubyInteger.java
private static void fixnumUpto(ThreadContext context, long from, long to, Block block)
private static void fixnumDownto(ThreadContext context, long from, long to, Block block)
RubyInteger.java: maybe leave this in RubyInteger.java as the suggested patch avoids referring to Long.MAX_VALUE
public IRubyObject succ(ThreadContext context)
Any thoughts?
Colin: Still out there?
We've incorporated the fixes you felt good about, but there's obviously still issues. Perhaps you can update us on where you stand patching them?
You are also correct in thinking it might be easier to track these with separate bugs for separate issues. This one kinda grew into a bigger ball of crap than we expected, so it's hard to know when we can resolve it.
Following my comment that "these types of integer overflows are different but also related" I've been working on looking at the RubyFixnum.java code (and where Java Long integers are used in RubyInteger.java, RubyNumeric.java, RubyRange.java) to find, categorise - and try to fix - the issues.
I'm producing a summary (in a single text file) which has the issues I've found with suggestions for patching these (even where the patch has already been applied), which I'd like you and others to have a look at: I think I now have a reasonable knowledge of how these methods work "locally" - all the issues I've found recently have been from me looking at the Java code, not from tests not working as I'd expect: the tests showing the problems have been written after finding the issues - but I have very little feel for how JRuby works "globally" so my suggested patches might work for the tests I can concoct for them, but might cause problems in other parts of JRuby. (From my limited knowledge of how Java works I think that actually my suggested patches won't cause problems in other parts of JRuby, but I think I would be very unwise to assert that as a fact.)
The idea is that we can use that single file summary to then (where necessary) set up separate bug reports.
I hope to finish that in the next two or three days. (If I don't manage that I'll post a progress update here.)
I think there are basically two types of issues:
1. Issues which aren't Fixnum loops: these are, I think, mostly separate, although they are obviously related in that the underlying issue is not detecting integer overflow.
2. Integer loops: Fixnum#upto, Fixnum#downto, Fixnum#step, Range#each (begin, end both Fixnum), Range#step (begin, end, step value all Fixnums.
What I'm suggesting for these is that there be these static methods in RubyFixnum.java:
2.1. fixnumStep: used by Fixnum#step, Range#step;
2.2. fixnumDownto: used by Fixnum#downto;
2.3. fixnumUpto: used by Fixnum#upto, Range#each;
There is also Fixnum#times which as written doesn't have any integer overflow issues, but which could maybe also use a fixnumUpto method.
I'm writing this from memory, so apologies for any errors in the above or in this next statement: some of these looping methods "yield" the loop values even if there isn't an associated block, others only yield the loop values if there is an associated block which has a "block parameter": if there isn't a "block parameter" they set up a "nil" value and yield that. I assume this is for reasons of speed by avoiding creating a new Fixnum object when you don't need to. (Is that correct?)
One reason for having just the three underlying fixnum loop constructs is to be consistent about whether or not to make that distinction: I think it makes sense for Fixnum#times, but for all the others I think the most likely use case is that there will be a "block parameter"?
I did also wonder whether for Fixnum#step and Range#step it might be useful to test for whether the step value is 1, and if it is to call fixnumUpto instead of fixnumStep, but some very rough benchmarking suggested that for smallish loops the overhead of making a check for that was more than the time saved by fixnumUpto with a "i++" increment, which does seem to be significantly faster for very big loops.
But I'm distrustful of benchmarks (particularly those made by me), and I'd prefer an opinion by someone who has an understanding of the Java runtime engines and can say when what might be a reasonable way to speed something up is - or is not - reasonably likely to do so.
Colin: Checking on your progress, since it has been more than two or three days ![]()
Answers to questions...
- Your plan of attack sounds fine; utility methods for step, downto, and upto would all be appropriate.
- Re: yielding nil when there's no argument to the block: yes, this is done for perf to avoid creating the Fixnum object.
- Re: making this consistent: I agree, it should be consistent. Obviously the rollover cases are less likely in the #times case, but for #upto and #downto they could straddle the rollover size.
- Re: perf and checking for simple cases: Perhaps worthwhile, but not a high priority. In most cases, the Fixnum construction will outweigh any minor perf tweaks to the logic. We should keep it in mind, though, and once we have correctness solved we can explore perf improvements.
Looking forward to patches from you! I'm going to bump this to 1.7.1 for now, since we're very close to 1.7.0 final and nobody has reported rollover issues but you.
I've put a fuller reply on SkyDrive because I think it might be a bit long for a comment here.
>> Colin: Checking on your progress, since it has been more than two or three days
My apologies: I was too optimistic about the two or three days, and then kept on being optimistic it would only be another day or two: sorry.
I finished the single text file summary this weekend, and checked it: that doesn't mean I haven't made any errors in it, but it does mean that I have made a big effort to avoid errors. (Admittedly not quite the same thing.)
I've posted it here, along with a "please-read-first-jruby-iterator-issues.txt" file:
https://skydrive.live.com/#cid=01F6D7100ADB0423
(Let me know via Twitter @colinb8 if the link doesn't work or you can't get at that file.)
I suggest downloading that summary, but it's longer than I intended - I was trying to be comprehensive about the problems. So you might want to just look at the "contents" section for now, and (for now) rely on what I say in the
"please-read-first-jruby-iterator-issues.txt" file.
I'll put my patches on that skydrive public folder: I'll clean them up and send you a Twitter message when I've done that.
Feel free to email me directly at colinwb@yahoo.co.uk: it might be a good idea to tweet me at Twitter@colinb8 the first time (if you do that) just in case Yahoo is insensitive enough to put an email in the Spam folder.
example of error message I'm sometimes - and sometimes not - getting
1. My patches for JRuby Fixnum iteration Java long integer overflow problems seem to be working: they pass the tests I've devised to show these problems.
2. BUT: I've tried two types of patching, both on preview1 source (I haven't been able to successfully compile preview2):
2.a. patch all the existing methods in RubyInteger.java, RubyNumeric.java, RubyRange.java;
2.b. set up new generic public static methods in RubyFixnum.java, and call those from the relevant methods inRubyInteger.java, RubyNumeric.java, RubyRange.java.
2.1. Doing 2.b I've added large numbers of "ordinary" Fixnum iterations, which shouldn't fail.
2.2. Sometimes they don't fail, but sometimes they do, giving this type of error:
...java.lang.NoClassDefFoundError: org/jruby/runtime/ThreadContext...
2.3. When the failure has occurred, I've immediately rerun the same Ruby program with the same JRuby executable jar that I compiled with my patches, and sometimes it fails at the same point, sometimes at a rather different point.
2.4. I'll try to narrow this problem down: I'm concerned that I can't seem to properly compile JRuby java code.
2.5. But in the meantime I think the safest way to patch is using type 2.a patches. BUT I haven't yet tried to get the types of errors I'm getting with type 2.b patches: I'll have a go at that.
3. Trying to find out what's happening here, I found something else which might affect the patches: it looks as though in my compiled jar files "block.getBody().getArgumentType()" might be returning integer 2 even if there is no block or if the block just has one variable argument. I've out the following type of "debugging" information code in RubyArray.java and RubyInteger.java
in places where the test "if (block.getBody().getArgumentType() == BlockBody.ZERO_ARGS)" is made,
and limited testing seems to show that my comment is true for my system,
compiling both preview2 and also jruby.1.6.7.2
System.out.println("at RubyXXX.java:999:argType=" + //+
block.getBody().getArgumentType() + " ZERO_ARGS=" + BlockBody.ZERO_ARGS); //+
if (block.getBody().getArgumentType() == BlockBody.ZERO_ARGS) {
then calling, for example:
Array.new(2)
Array.new(2)
{ |v| v }Charles - when you have a moment, could you try the same sort of thing to see if it's just a problem with my setup?
Thanks.
Additional fixes:
commit ee963e52200617634eb11d9ffbf984b956f7fb21
Author: Charles Oliver Nutter <headius@headius.com>
Date: Thu Nov 8 13:44:42 2012 -0600
Fix JRUBY-6612, JRUBY-6777, JRUBY-6778, JRUBY-6779, JRUBY-6790
[JRUBY-6612] some problems with JRuby seeming to not detect Java Long arithmetic overflows
[JRUBY-6777] RubyFixnum.java - two methods fail to detect some long integer overflows
[JRUBY-6778] Possible long integer overflow bug in Integer#succ in RubyInteger.java
[JRUBY-6779] Strange behaviour of some Integer Ranges with Range#each - maybe an integer overflow problem?
[JRUBY-6790] Possible long integer overflow in fixnumStep in RubyNumeric.java
Patches by Colin Bartlett. Thank you!
:100644 100644 93f83c8... 109b856... M src/org/jruby/RubyInteger.java
:100644 100644 b631a40... 7ecf098... M src/org/jruby/RubyNumeric.java
:100644 100644 3ed0b55... d15c2ed... M src/org/jruby/RubyRange.java
Very thorough report, thank you! I quickly scanned your patches and they look good. I'll see about integrating and testing them tomorrow. FWIW, if you are able to do them as a pull request on Github, it would make my job of patching and testing easier.
In any case, thanks for finding and fixing these items. Interesting the things that escape being reported for so long