Details
-
Type:
Improvement
-
Status:
Open
-
Priority:
Major
-
Resolution: Unresolved
-
Affects Version/s: None
-
Fix Version/s: 3.2
-
Component/s: Compiler: Optimizing
-
Labels: None
-
Description
We don't allow copy propagation of physical registers; however, in the case of the processor register (PR) this appears overly cautious. In the current generational putfield write barrier we get:
t1 = PR // define t1 to PR
if ....
return // likely
else
... = t1 // unlikely - use of t1
As t1 is live in the slow path, its definition must occur at the head of the method, so we copy PR redundantly and the copy is only used if we get into the write barrier slow path. We should just copy propagate PR and save the allocation of t1.
Similarly, the test at the head of the putfield write barrier is often:
...
t2 = l1 + constant_offset
if t2 < constant_start_of_nursery
...
which can be folded in BURS to:
if l1 < constant_start_of_nursery - constant_offset
This is already performed in expression folding, but is currently disabled.
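As a rough illustration of the fold described above, the following Java sketch compares the unfolded test (add, then compare) with the folded one (compare against a precomputed constant). The class name, the constant values, and the helper names are all hypothetical and are not taken from the compiler; the sketch also assumes the addition cannot overflow, which a real folding rule would need to guarantee.

```java
public final class CompareFolding {
    // Hypothetical values, chosen only for illustration.
    public static final long CONSTANT_OFFSET = 16L;
    public static final long START_OF_NURSERY = 0x4000_0000L;

    // The folded bound is computed once, instead of one add per barrier test.
    public static final long FOLDED_BOUND = START_OF_NURSERY - CONSTANT_OFFSET;

    /** Original shape: t2 = l1 + constant_offset; if t2 < constant_start_of_nursery. */
    public static boolean unfolded(long l1) {
        long t2 = l1 + CONSTANT_OFFSET;
        return t2 < START_OF_NURSERY;
    }

    /** Folded shape: if l1 < constant_start_of_nursery - constant_offset. */
    public static boolean folded(long l1) {
        return l1 < FOLDED_BOUND;
    }

    public static void main(String[] args) {
        // Sample values around the boundary; none of them can overflow the add.
        long[] samples = {0L, FOLDED_BOUND - 1, FOLDED_BOUND, FOLDED_BOUND + 1, START_OF_NURSERY};
        for (long l1 : samples) {
            if (unfolded(l1) != folded(l1)) {
                throw new AssertionError("mismatch at l1 = " + l1);
            }
        }
        System.out.println("folded and unfolded tests agree on all samples");
    }
}
```

The folded form removes the temporary t2 and the add from the fast path; the two forms are only equivalent while l1 + constant_offset stays within range.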
Copy prop of physical registers is disabled because you can't copy prop the value of PR across a potential thread switch point.
We "discovered" this the hard way in 2000, when a low-probability bug cost about 2 person-months to find. If a thread switch actually gets taken, the value of PR changes and has to be re-read; otherwise the thread ends up using the "old" value of PR, and you get two threads accessing thread-local data structures hanging off the PR. This is a really nasty low-probability bug to find.
I think it would be much safer not to try to optimize these, and instead to write the barrier code so that PR is explicitly referenced each time it is needed (avoiding the assignment of PR to t1).
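To make the hazard concrete, here is a small single-threaded Java simulation, a sketch only. The `Processor` class, the static `pr` field standing in for the processor register, and `threadSwitch()` are all hypothetical names invented for this illustration; the real register and thread switch mechanism are not modelled.

```java
public final class StalePrSketch {
    /** Hypothetical stand-in for the per-processor data hanging off PR. */
    public static final class Processor {
        public final String name;
        public Processor(String name) { this.name = name; }
    }

    public static final Processor P0 = new Processor("P0");
    public static final Processor P1 = new Processor("P1");

    // Simulated processor register (an assumption for this sketch).
    public static Processor pr = P0;

    /** Simulated thread switch: the thread resumes on a different processor. */
    public static void threadSwitch() {
        pr = P1;
    }

    /** Hazardous shape: PR is copied to t1 first and used after the switch point. */
    public static Processor cachedCopy() {
        pr = P0;            // start on P0
        Processor t1 = pr;  // t1 = PR
        threadSwitch();     // potential thread switch point is taken
        return t1;          // stale: still refers to P0
    }

    /** Safer shape: PR is referenced explicitly at the point of use. */
    public static Processor explicitRead() {
        pr = P0;            // start on P0
        threadSwitch();     // potential thread switch point is taken
        return pr;          // re-read: refers to P1
    }

    public static void main(String[] args) {
        System.out.println("cached copy sees   " + cachedCopy().name);
        System.out.println("explicit read sees " + explicitRead().name);
    }
}
```

In the cached version the temporary keeps pointing at the processor the thread started on, which is the "old" PR situation described above; reading PR at each use avoids holding a value across the switch point.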