
Key: RVM-352
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Blocker
Assignee: Ian Rogers
Reporter: Ian Rogers
Votes: 0
Watchers: 0
RVM

Running many iterations of _200_check fails floating point remainder test

Created: 28/Nov/07 08:05 AM   Updated: 26/Apr/08 12:43 PM
Component/s: Compiler: Optimizing, Instruction Architecture: Intel
Affects Version/s: None
Fix Version/s: 2.9.3

Time Tracking:
Not Specified

Issue Links:
Related
 


 Description
SPEC JVM 98's _200_check fails when run with -X:aos:initial_compiler=opt -X:irc:O2 -X:aos:enable_recompilation=false. The output looks like:

checkRemainders: long Failed: -10.5 % -7.0 = -3.5 (should be: -3.5

******************************************
remainders failed
******************************************
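
For reference, a minimal sketch of what Java's `%` operator is expected to return for the operands in the failing check. The class and method names are hypothetical and are not taken from the _200_check source; the point is only that the remainder takes the sign of the dividend, so -3.5 is the correct answer here.

```java
public class DoubleRemainderSketch {
    // Hypothetical helper: plain Java floating-point remainder.
    static double rem(double dividend, double divisor) {
        return dividend % divisor;
    }

    public static void main(String[] args) {
        // The case from the failing output: both operands negative.
        System.out.println(rem(-10.5, -7.0)); // -3.5
        // Cases with one positive operand, which the report says pass.
        System.out.println(rem(10.5, -7.0));  // 3.5
        System.out.println(rem(-10.5, 7.0));  // -3.5
    }
}
```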



 Comments
Ian Rogers added a comment - 28/Nov/07 09:43 AM
It also fails with O0. The failing HIR is:

-13 LABEL0 Frequency: 1.0
-2 EG ir_prologue l0sa(Lspec/benchmarks/_200_check/PepTest;,x,d), l2psd(D,d), l3psd(D,d), l4psd(D,d) =
0 G yieldpoint_prologue
4 double_rem t6sd(D) = l2psd(D,d), l3psd(D,d)
6 double_ifcmp t8sv(GUARD) = l4psd(D,d), t6sd(D), ==F, LABEL2, Probability: 0.5
-1 bbend BB0 (ENTRY)
9 LABEL1 Frequency: 0.5

which is odd, as it leaves no scope for clobbering the x87 values between the fprem and the compare. Also odd is that the test passes and then fails in phases, and that it fails only when the test values are -10.5 % -7.0 == -3.5; it passes when one of the arguments is positive. The IR itself looks sane. It's possible this is an OPT_SaveVolatile quirk.


Ian Rogers added a comment - 29/Nov/07 03:05 AM
Having stared at the IR for quite a while, I can't see what's bad. It looks like something unrelated is clobbering a register, but I don't know why this happens only when the input values are negative, or why it occurs in phases.

Ian Rogers added a comment - 14/Apr/08 05:13 AM
This bug affects both SSE and x87. Interestingly, I'm seeing this error:

checkRemainders: long Failed: 10 % 8 = 2 (should be: 3
Failed: -10 % 8 = -2 (should be: -3

and from the code the test should only be checking the remainder for 7. As the code looks good I'm wondering whether the stack logic is broken.


David Grove added a comment - 24/Apr/08 05:05 PM
The following command line on a prototype-opt image is sufficient to recreate the failure with r14159.

rvm -X:aos:enable_recompilation=false org.jikesrvm.tools.oth.OptTestHarness -oc:O0 -oc:inline=false -method spec.benchmarks._200_check.PepTest checkRemD - -er SpecApplication main - -s100 -m10 -M10 -a _200_check.

The LIR looks right to me, so the bug must be somewhere in the IA32 code gen and/or reg allocation.
It's honestly acting like the ucomisd isn't doing exactly what we think it should, but I guess it's also possible I didn't trace the value through all the stack/XMM/x87 shuffles quite right and the bug is in there somewhere instead.


Ian Rogers added a comment - 24/Apr/08 06:28 PM
My feeling is that the code is right, but we're missing something from the bigger picture. It could be that ucomisd and its x87 equivalent aren't doing what we expect, but why would this be time dependent? What I see is that we pass the test initially, then after a few iterations we fail for a while, then we pass again for a while. I thought this could be SSE state being clobbered, but that wouldn't explain the bug also affecting the x87 code. I'm suspicious of the threading mechanism, but iirc there are no yield points between the compare and branch. Given the failure is in such a small region of code it should be within our brain power to fix, but currently I'm somewhat stumped.

Ian Rogers added a comment - 25/Apr/08 08:47 AM
Tracking through the code, there is a definite bug with long literals in the JTOC: we're marking slots as literals that shouldn't be. One example is the slot holding the current number of allocated threads, which we mark as a literal; it holds the value 7 during an oth compilation but holds 8 when run. This causes the result "10 % 8 = 2 (should be: 3", as the test should have been 10 % 7.
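
The arithmetic behind that output can be checked with a small sketch. The `slot` field below is a hypothetical stand-in for a table entry that held 7 when the code was compiled and 8 when it ran; it does not model the real JTOC or its literal lookup.

```java
public class StaleLiteralSketch {
    // Hypothetical stand-in for a slot wrongly treated as the long literal 7.
    static long slot = 7;

    // Remainder whose divisor is loaded from the slot at run time.
    static long remBySlot(long dividend) {
        return dividend % slot;
    }

    public static void main(String[] args) {
        long expectedPos = remBySlot(10);   // 3 while the slot still holds 7
        long expectedNeg = remBySlot(-10);  // -3 while the slot still holds 7

        slot = 8; // the slot changes after the "literal" was recorded

        System.out.println("10 % 8 = " + remBySlot(10) + " (should be: " + expectedPos);
        System.out.println("-10 % 8 = " + remBySlot(-10) + " (should be: " + expectedNeg);
    }
}
```

With the slot at 8 this yields 2 and -2 against expected values of 3 and -3, matching the failures reported in the earlier comment.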

Ian Rogers added a comment - 25/Apr/08 11:37 AM
So using the test program below:

public class test {
    static double x = -10.5;
    static double y = -7;

    static void test(double x, double y) {
        if (x % y != -3.5) { System.out.println("fail"); }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) { test(x, y); }
    }
}

we fail for the last 3 iterations. If I copy the value of the ftag register from the last 3 iterations across to the 1st iteration, we fail immediately. The problem is that the fld is placing NaN on the stack.


David Grove added a comment - 25/Apr/08 01:00 PM
Ian tracked it down to the fact that in the opt codegen for FPREM we push 2 values but only pop one. Thus after 7 fprems the next fld gets a stack overflow exception.

We're optimistic that putting in an ffree to balance the fp stack operations in the implementation of FPREM will take care of the problem. I'm testing a fix...
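
The leak can be illustrated with a toy model of the 8-slot x87 register stack. This is a simplified sketch, not the RVM code generator: the class, its methods, and the assumed push/push/fprem/pop sequence are hypothetical, and a load into an occupied slot is modelled as storing NaN (the masked-exception behaviour).

```java
public class X87StackSketch {
    private final double[] regs = new double[8];
    private final boolean[] inUse = new boolean[8]; // stands in for the tag word
    private int top = 0;

    // Push: decrement TOP; loading into an occupied slot overflows and yields NaN.
    void fld(double v) {
        top = (top + 7) % 8;
        if (inUse[top]) {
            regs[top] = Double.NaN;
        } else {
            regs[top] = v;
            inUse[top] = true;
        }
    }

    // ST(0) = ST(0) rem ST(1); no push or pop.
    void fprem() {
        regs[top] = regs[top] % regs[(top + 1) % 8];
    }

    // Pop ST(0) and return it.
    double fstp() {
        double v = regs[top];
        inUse[top] = false;
        top = (top + 1) % 8;
        return v;
    }

    // Mark ST(i) empty without moving TOP.
    void ffree(int i) {
        inUse[(top + i) % 8] = false;
    }

    // Two pushes, one pop; the optional ffree releases the leftover divisor.
    double remainder(double a, double b, boolean balance) {
        fld(b);
        fld(a);
        fprem();
        double r = fstp();
        if (balance) ffree(0);
        return r;
    }

    // 1-based index of the first wrong result, or -1 if all iterations pass.
    static int firstFailure(boolean balance, int iterations) {
        X87StackSketch fpu = new X87StackSketch();
        for (int i = 1; i <= iterations; i++) {
            if (fpu.remainder(-10.5, -7.0, balance) != -3.5) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstFailure(false, 10)); // 8
        System.out.println(firstFailure(true, 10));  // -1
    }
}
```

In this model the unbalanced sequence leaks one slot per remainder, so iterations 1 to 7 pass and iteration 8 onwards fail, which lines up with the last 3 of 10 iterations failing in the test program; adding the ffree keeps every iteration passing.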


David Grove added a comment - 25/Apr/08 01:51 PM
Fixed via a combination of r14160 and r14161.