
Key: RVM-366
Type: Improvement
Status: Open
Priority: Minor
Assignee: Unassigned
Reporter: Peter Donald
Votes: 0
Watchers: 0
RVM

Support a yield point implementation that uses a polling page

Created: 02/Feb/08 10:59 PM   Updated: 11/Apr/08 09:46 AM
Component/s: Runtime: Threads and Concurrency
Affects Version/s: None
Fix Version/s: 1000

Time Tracking:
Not Specified


Description
The implementation of yield points can create a "polling" page. Every yield point instruction is expanded into a memory read from the page. When all the threads need to be suspended for GC, the page is unmapped. Eventually every thread page faults trying to read from the unmapped page, and the fault handler suspends the thread.

This may be of use in the green thread implementation but will be a relatively high-performance solution for native threads.
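
A minimal sketch of the polling-page idea in C on a POSIX system, to make the mechanism concrete. It is not Jikes RVM code: every name (polling_page_init, request_yield, yieldpoint, yieldpoints_trapped) is hypothetical. It also swaps in mprotect(PROT_NONE) where the description says "unmapped", and the handler merely counts the trap and makes the page readable again, where a real VM would block the thread until the GC is done.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names throughout; none of this comes from Jikes RVM. */
static volatile char *polling_page;         /* page read by every yield point */
static size_t polling_page_size;
static volatile sig_atomic_t trapped;       /* number of yield points that faulted */

/* Fault handler: stands in for "suspend this thread until GC completes". */
static void on_fault(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    char *addr = (char *)info->si_addr;
    char *base = (char *)polling_page;
    if (addr < base || addr >= base + polling_page_size)
        _exit(1);                           /* a genuine crash, not a yield point */
    trapped++;
    /* Simplification: re-arm the page here so the faulting load can retry.
       A real VM would block first and let the GC thread restore the page. */
    mprotect(base, polling_page_size, PROT_READ);
}

/* Map one readable page and install the fault handler. Returns 0 on success. */
int polling_page_init(void) {
    polling_page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, polling_page_size, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    polling_page = p;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &sa, NULL) != 0) return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0) return -1;  /* some systems report SIGBUS */
    return 0;
}

/* Ask every thread to stop at its next yield point by revoking read access. */
int request_yield(void) {
    return mprotect((void *)polling_page, polling_page_size, PROT_NONE);
}

/* The yield point itself: a single load whose result is discarded. */
static inline void yieldpoint(void) {
    (void)*polling_page;
}

int yieldpoints_trapped(void) {
    return (int)trapped;
}
```

After polling_page_init(), calls to yieldpoint() cost one load and never trap. After request_yield(), the next yieldpoint() faults once and then continues, because the handler in this sketch restores read access.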



Comments
Peter Donald added a comment - 02/Feb/08 11:00 PM
From Ian:

so to get an idea of how we deal with the same situation look at:

http://rvm.codehaus.org/docs/api/org/jikesrvm/compilers/opt/ia32/OPT_FinalMIRExpansion-source.html#line.537

so rather than an opcode with one reg/mem operand we do compare memory
and constant then a branch. The important case (I believe) will be in
loop back edges. Using the trick saves you one operation per loop, it
also avoids having too many branches within 16 bytes of each other
(Intel's branch predictor can only handle 2 iirc). On Intel we don't set
to take a yield point with a timer tick, so CBS and GC are the two
places to modify the takeYieldpoint value in VM_Processor. It seems
reasonable that we could refactor the code in green threads to map/unmap
a page rather than use takeYieldpoint, but we'd probably want both
implementations for flexibility's sake. One final thing to note is that
the tweak could use a non-temporal load on Intel to avoid any cache
pollution problems.
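
For contrast with the polling page, the compare-and-branch form Ian describes can be sketched in C roughly as follows. This is an illustration under assumptions, not the code at the link above: struct processor and yieldpoint_slow_path are hypothetical, only the field name takeYieldpoint comes from the comment, and treating any nonzero value as "yield now" and clearing it afterwards are assumed for the sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for per-processor state; only the field name
   takeYieldpoint is taken from the discussion. */
struct processor {
    volatile int takeYieldpoint;  /* assumed: nonzero means "yield at the next yield point" */
    long yields_taken;            /* bookkeeping for this sketch only */
};

/* Slow path: where a thread switch, sample, or GC handshake would happen. */
static void yieldpoint_slow_path(struct processor *p) {
    p->yields_taken++;
    p->takeYieldpoint = 0;        /* assumed: the request is cleared once handled */
}

/* A loop with a yield point on its back edge: one compare of memory
   against a constant plus one conditional branch per iteration. */
long sum_with_yieldpoints(struct processor *p, const int *xs, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += xs[i];
        if (p->takeYieldpoint != 0)
            yieldpoint_slow_path(p);
    }
    return sum;
}
```

With a polling page, the compare and branch on the back edge would shrink to a single load, which is the one-operation-per-loop saving mentioned above.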


David Grove added a comment - 03/Feb/08 05:22 AM
With the adaptive system, we take yieldpoints more frequently than I suspect other JVMs do, to support online profiling (i.e., not just to initiate a GC cycle). It's an interesting idea, but I'd be a little surprised if this was actually going to be a performance win for us. Nothing wrong with prototyping it if someone is interested; I'm just observing that one should first measure how frequently we take yieldpoints and then do a back-of-the-envelope calculation of trap costs vs. saving an L1-cache-hitting load and a conditional branch at every yieldpoint.
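
The back-of-the-envelope calculation could be framed along these lines. This is only a sketch of the arithmetic: the function and parameter names are hypothetical, and the cycle counts have to come from measurement, since none are given in this issue. The model assumes that a taken yieldpoint costs the same slow path in both schemes, so the only extra cost of the polling page is the trap itself.

```c
/* All parameters are measured inputs, in cycles, except taken_fraction.
   check_cycles:   cost of the current compare-and-branch yieldpoint when not taken
   poll_cycles:    cost of the polling-page load when it does not fault
   trap_cycles:    extra cost of entering the slow path via a page fault
   taken_fraction: fraction of executed yieldpoints that are actually taken */

/* Average cycles saved per executed yieldpoint; positive means the
   polling page wins under this simple model. */
double polling_page_net_saving(double check_cycles, double poll_cycles,
                               double trap_cycles, double taken_fraction) {
    return (check_cycles - poll_cycles) - taken_fraction * trap_cycles;
}

/* Taken fraction at which the two schemes cost the same. Above this
   fraction the trap overhead outweighs the per-yieldpoint saving. */
double break_even_taken_fraction(double check_cycles, double poll_cycles,
                                 double trap_cycles) {
    return (check_cycles - poll_cycles) / trap_cycles;
}
```

If the measured fraction of yieldpoints taken under the adaptive system is above the break-even value, the polling page would be a net loss under these assumptions.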