Key: RVM-362
Type: Improvement
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Ian Rogers
Votes: 0
Watchers: 0

Sort accumulating operands on to LHS of commutative operations

Created: 23/Jan/08 11:44 AM   Updated: 11/Apr/08 09:46 AM
Component/s: Compiler: Optimizing, Compiler: Optimizing: Instruction Scheduler, Instruction Architecture: Intel
Affects Version/s: None
Fix Version/s: 1000

Time Tracking:
Not Specified


Description
For all SSE rules we play the following game:

r: FLOAT_ADD(r, r)
p.child1.isREGISTERNode() ? 11 : 13
EMIT_INSTRUCTION
SSE2_COP(IA32_ADDSS, P(p), Binary.getResult(P(p)), Binary.getVal1(P(p)), Binary.getVal2(P(p)));

r: FLOAT_ADD(r, r)
p.child2.isREGISTERNode() ? 11 : 13
EMIT_INSTRUCTION
SSE2_COP(IA32_ADDSS, P(p), Binary.getResult(P(p)), Binary.getVal2(P(p)), Binary.getVal1(P(p)));

That is, we make the rule cheaper when the child in the LHS operand position is truly a register node. I believe the rationale is to have a register that accumulates the result. We don't play the same game for integer operations, for example:

czr: INT_ADD(r, riv)
13
EMIT_INSTRUCTION
EMIT_Commutative(IA32_ADD, P(p), Binary.getResult(P(p)), Binary.getVal1(P(p)), Binary.getVal2(P(p)));

which, if recoded using the SSE-style scheme, would be:

czr: INT_ADD(r, riv)
p.child1.isREGISTERNode() ? 11 : 13
EMIT_INSTRUCTION
EMIT_Commutative(IA32_ADD, P(p), Binary.getResult(P(p)), Binary.getVal1(P(p)), Binary.getVal2(P(p)));

czr: INT_ADD(r, riv)
p.child2.isREGISTERNode() ? 11 : 13
EMIT_INSTRUCTION
EMIT_Commutative(IA32_ADD, P(p), Binary.getResult(P(p)), Binary.getVal2(P(p)), Binary.getVal1(P(p)));
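
To make the effect of the paired dynamic costs concrete, here is a minimal Java sketch of how the two cost expressions pick an operand order. It is not Jikes RVM code: the Node class, its factory methods, and prefersSwap are hypothetical stand-ins, and the tie-breaking (keep the original order when both costs are equal) is an assumption of the sketch.

```java
class DynamicCostDemo {
    /** Hypothetical tree node: either a register leaf or an operator with two children. */
    static final class Node {
        final boolean register;
        final Node child1;
        final Node child2;

        private Node(boolean register, Node child1, Node child2) {
            this.register = register;
            this.child1 = child1;
            this.child2 = child2;
        }

        static Node reg() { return new Node(true, null, null); }
        static Node op(Node a, Node b) { return new Node(false, a, b); }

        boolean isRegisterNode() { return register; }
    }

    /** Cost of the rule that emits (val1, val2): cheaper if child1 is a register leaf. */
    static int costKeepOrder(Node p) {
        return p.child1.isRegisterNode() ? 11 : 13;
    }

    /** Cost of the rule that emits (val2, val1): cheaper if child2 is a register leaf. */
    static int costSwapped(Node p) {
        return p.child2.isRegisterNode() ? 11 : 13;
    }

    /** Assumption of this sketch: the swapped rule wins only when strictly cheaper. */
    static boolean prefersSwap(Node p) {
        return costSwapped(p) < costKeepOrder(p);
    }
}
```

With a computed subtree as child1 and a register leaf as child2, the swapped rule costs 11 against 13, so the register ends up in the accumulating position.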

It strikes me that making these cost methods dynamic isn't a good thing. Instead, maybe we could make a pass over the IR that ensures true register operands occur on the LHS of commutative operations; this would halve the number of rules and remove a runtime cost in BURS. If this doesn't make sense, then we should consider adding the SSE-style isREGISTERNode dynamic costs to integer operations.
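
A rough sketch of what such a pass might look like, in Java. This is illustrative only and does not use the real Jikes RVM IR classes: Node, isCommutative, and sortOperands are hypothetical names, and the list of commutative opcodes is deliberately limited to the two mentioned in this issue.

```java
class CommutativeOperandSorter {
    /** Hypothetical tree node: a "REGISTER" leaf, or an operator with two children. */
    static final class Node {
        final String op;
        Node child1;
        Node child2;

        Node(String op, Node child1, Node child2) {
            this.op = op;
            this.child1 = child1;
            this.child2 = child2;
        }

        static Node reg() { return new Node("REGISTER", null, null); }

        boolean isRegisterNode() { return op.equals("REGISTER"); }
    }

    /** Assumption: only the opcodes discussed in this issue are treated as commutative. */
    static boolean isCommutative(String op) {
        return op.equals("INT_ADD") || op.equals("FLOAT_ADD");
    }

    /**
     * Walks the tree bottom-up and swaps the children of a commutative node when
     * only the second child is a register leaf. Returns the number of swaps made.
     */
    static int sortOperands(Node n) {
        if (n == null || n.isRegisterNode()) {
            return 0;
        }
        int swaps = sortOperands(n.child1) + sortOperands(n.child2);
        if (isCommutative(n.op) && !n.child1.isRegisterNode() && n.child2.isRegisterNode()) {
            Node tmp = n.child1;
            n.child1 = n.child2;
            n.child2 = tmp;
            swaps++;
        }
        return swaps;
    }
}
```

After a pass like this, a single static-cost rule per commutative operation would see the register leaf as child1 whenever one exists, which is the case the paired dynamic-cost rules were selecting for.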



Comments
David Grove added a comment - 23/Jan/08 12:57 PM
It's likely that this isn't serving a useful purpose anymore. In the old 387 rules, we tried to make it cheaper to accumulate into a register because that let us generate FP stack operations in BURS. Unless the SSE instructions have some similar asymmetry, I think we don't need to have the dynamic rules here.

Caveat: all from memory, so don't assume I'm right.


Ian Rogers added a comment - 23/Jan/08 05:19 PM
Thanks Dave. I think Daniel had some performance figures that supported the dynamic way of doing things for SSE. Removing the dynamic cost is a fairly small change; we can then see the effect on the nightly regressions and reintroduce the dynamic cost if it is useful.