Ian Rogers added a comment - 16/Mar/08 04:36 PM
We wish to complete native threading support for the SoC 2008.
This comment was added to the GSoC project descriptions page by Steve Blackburn:
In addition to a pure 1:1 native threading implementation which eliminates much of the complexity in the current Jikes RVM system relating to blocking system calls, etc., for some Jikes RVM researchers it might continue to be useful to multiplex M "lightweight" threads above N native threads so as to permit additional flexibility in thread implementations. Although, to some approximation, this represents the current state of affairs in Jikes RVM, the current implementation conflates design decisions that were made when the original M:N threading system was implemented. While the new pure 1:1 mapping of Java threads will be the base implementation, and will dictate the fundamental design decisions of the new threading system (e.g., permitting Java threads to block on I/O), perhaps there is a way to support user-level scheduling for extremely lightweight threads. The idea would be to rebuild user-level scheduling on a foundation that makes OS-level scheduling of native threads the default. Thus, while we address
Lightweight threading is provided in java.util.concurrent's executors framework. I don't think there's a need for the RVM to replicate this functionality. Green threading is still required for projects such as JNode. Could we describe where/how the thread system is "conflated"?
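As a rough illustration of the executors point above, the following sketch multiplexes M short tasks over N pool threads using only the standard java.util.concurrent API. The class and method names (MOverNDemo, runTasks) are made up for this example and are not part of Jikes RVM. On a VM with 1:1 threading each pool thread would be backed by one native thread, so this shows application-level multiplexing only, not VM-level green threading.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: MOverNDemo and runTasks are hypothetical names, not Jikes RVM code.
class MOverNDemo {
    // Runs m short tasks on a fixed pool of n threads; returns how many tasks completed.
    static int runTasks(int m, int n) {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < m; i++) {
            // Each task is a plain Runnable queued on the pool, not a thread of its own.
            pool.execute(() -> completed.incrementAndGet());
        }
        pool.shutdown(); // accept no new tasks, let the queued ones finish
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runTasks(1000, 4) + " tasks ran on 4 pool threads");
    }
}
```

Note that the tasks here run to completion and cannot be preempted or have their scheduling policy swapped out by the VM, which is part of what the green threading argument in this thread is about.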
I think the ability for an application to participate in scheduling decisions is going to be a significant driver in the "concurrency revolution" and thus it would be a loss to remove the ability to do this from the RVM. Most applications will still run on a general purpose OS and be subject to its MRQ-style scheduler, which is already showing how inadequate it is for high concurrency apps. There has been some interesting work done on scheduling policies that incorporate application specific knowledge and can thus dramatically increase performance. Cohort Scheduling [1] uses application specific data to increase locality when scheduling. SRQ scheduling [2] uses application specific data (request characteristics), while Capriccio [3] implements a resource aware scheduler to increase throughput by avoiding conflicts between resource management and scheduling policies.
I would expect that this trend is going to increase as hardware previously associated with parallel programming (+ cooperative scheduling policies) becomes the platform of choice for normal servers (traditionally competitive scheduling policies). Just as a taste, one of the Sun guys recently proposed that by 2018 servers could have 128 cores with 1024 hardware threads per core (and many seemed to suggest that this was a very conservative estimate). So while the RVM is not suited to this sort of research atm, it would only take a relatively short amount of time (4-6 months?) to get it on the bandwagon and I would hate to see us add impediments to this (unless they were a high cost to other parts of the RVM).
[1] http://portal.acm.org/citation.cfm?doid=384197.384222
I can't make much sense of this discussion.
1. The comments I posted to the SoC page were done on behalf of the mentor, Tony Hosking (who didn't have edit privileges). Ian took the comments off the SoC page and pasted them here. As I understand it, the rationale behind Tony's comments was all about how to manage a SoC project, particularly in the context of multiple students wanting to work on it. In a nutshell, AFAIK, Tony was suggesting two complementary projects: a) get 1:1 threading engineered, tested and working, b) generalize threading, a continuation of last year's project. In the text above, he was suggesting that N:M be layered over a fast 1:1 implementation. As it happens there are two highly motivated students who align with a) and b) respectively. To me the idea of having someone working solely on the task of getting 1:1 threading working is very sensible, especially if they have the skills and motivation for that task. Likewise, if someone is particularly interested and motivated in the (non-trivial) abstraction tasks of a generalized threading component, that is great. Of course once we had a working proof of concept 1:1 implementation, then the task would be to harmonize it with the abstractions, and I think the two projects would inform each other. Nonetheless, I think it is pragmatic and good development style to let the enthusiast rip ahead with a prototype of 1:1 threading, and having completed that to draw breath and see how it fits into the modular threading system. So, I'm scratching my head a little in trying to understand the ensuing commentary. It seems a bit of a storm in a teacup. Tony's suggestion seemed straightforward and pragmatic and focused on the specifics of getting students productively involved in SoC.
2. Many of us don't need any motivation to understand the desire for various threading models. Most of us are well aware of the increasing pressures of parallelism. Some of us have worked directly on this (I implemented cactus stacks on Jikes RVM with people from MITRE some years ago; they successfully got their air traffic simulation system scaling on Jikes RVM far better than any production VM; we could handle millions of threads. I supervised the Moxie project which included a port to the L4 microkernel---the motivation there was entirely about avoiding duplication of services, most notably scheduling and memory management). So specifically, I don't understand what Peter was alluding to with his comment: "I would hate to see us add impediments to this". What impediments did you have in mind? I know Tony is traveling to the UK right now, so it may be a day or so before he reads this discussion....
It is quite possible I am missing something and it was more of a drive-by comment. Now that there is some context it seems a lot more sensible. It would be VERY useful to have a robust 1-to-1 threading implementation, and once the kinks have been worked out it would even make sense to make it the default thread model, as that is what is present in most commercial JVMs.
However I still see a lot of value in the green threads implementation - mostly because it allows you to play with different scheduling policies and makes a bunch of other things easier. So I really think that the 1-to-1 and M-to-N should be peers. Currently there is a bunch of code in the RVM that could be stripped out if we only cared about 1-to-1 thread implementations, and doing so would simplify things. However I hope that does not occur. The whole "we'll build a M-to-N on top of the 1-to-1" seems very unlikely to me, but if someone wants to try it then more power to them. As to what I consider an impediment - that would be removing some of the functionality already present to support M-to-N. I also think trying to base a M-to-N off a 1-to-1 would have a similar effect, but I'm willing to withhold judgement. Anyhoo - I am no longer using any of that code so I can't really commit to helping, so I should just STFU.
Anyhoo, to save myself from making entirely useless comments: in answer to Ian's question "Could we describe where/how the thread system is conflated?" I would say the most obvious thing (that Steve has already pointed out) is "stack management". There is nothing stopping us having multiple different stack management policies in either native or green thread implementations.
That's just one thing I would consider a "conflation", but I could also say similar things about the way we manage synchronisation primitives. I would have to look at the code again but I remember thinking that there is some TLC needed there.
Hi, just to explain that I saw Tony's comments (sorry to have attributed them to Steve) on the wiki and thought that this kind of discussion was best held on the issue tracker. The wiki isn't read by everyone, and my objective with the project descriptions was that they should give a brief explanation of themselves and then point to the tracker for a meatier debate by the RVM community. Everyone who cares should be watching the issue tracker; I don't imagine everyone has time to keep checking the wiki (at least this has been the result of past discussion on the subject).
I don't agree with the need for 2 projects. A native threading implementation by design just calls through to the OS to handle threading issues. It is considerably simpler than a green threading implementation that manages all its own data structures. A problem in our current model is that we have a VM_Processor and VM_Thread distinction that is possibly heavier than we'd want in a finely tuned native threading implementation. I'd suggest a first instance native implementation just lives with this and implements the native system calls and handles the problem that currently the number of VM_Processors is limited. We can then work on refining the abstraction to be as good as possible for all. I see finding this best fit as a significant challenge. Another problem is that the current memory manager interface and OSR code directly manipulate bits of the green thread code; this should be abstracted into the threading interface. Btw: I hope it's not been a secret that I want the native threading done this way; it is described in Rahul's dissertation [1] and the native stub that is present in the code base is supposed to allude to this. I've also discussed this with Tony by e-mail.
Steve, I've had a number of conversations with lots of different people about the desire to have a cactus stack implementation for the RVM. I imagine your code has bit-rotted. Is there any chance you can post this code onto the issue/patch tracker so that it may serve as an example for other people wishing to undertake cactus stack (and similar) work for the RVM?
One last point is on the use of the word conflated: what I hope is that when people see a problem they can describe it in such a way that others can really appreciate it - give an example, rather than leave it to guesswork. I think Peter has done a good job of interpreting what the word probably meant in the context of the thread system comments.
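To make the native versus green distinction above concrete, here is a minimal sketch of what an implementation-agnostic threading interface could look like. Everything in it is hypothetical: ThreadBackend, NativeBackend, GreenBackend and BackendDemo are names invented for this example and do not correspond to the Jikes RVM threading API, VM_Processor, or VM_Thread. The native back end simply hands each body to java.lang.Thread, while the green-style back end owns its own run queue, which is the extra data structure management referred to above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical interface for illustration; not the Jikes RVM threading API.
interface ThreadBackend {
    void spawn(Runnable body); // make body eligible to run
    void joinAll();            // return once every spawned body has finished
}

// 1:1 style: each spawn becomes a java.lang.Thread scheduled by the OS.
final class NativeBackend implements ThreadBackend {
    private final List<Thread> threads = new ArrayList<>();

    public void spawn(Runnable body) {
        Thread t = new Thread(body);
        threads.add(t);
        t.start();
    }

    public void joinAll() {
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        threads.clear();
    }
}

// Green style (greatly simplified): the back end keeps its own run queue and
// runs each body to completion on the caller's thread, with no preemption.
final class GreenBackend implements ThreadBackend {
    private final Queue<Runnable> runQueue = new ArrayDeque<>();

    public void spawn(Runnable body) {
        runQueue.add(body);
    }

    public void joinAll() {
        Runnable next;
        while ((next = runQueue.poll()) != null) {
            next.run();
        }
    }
}

class BackendDemo {
    // Client code sees only ThreadBackend, so either model can be plugged in.
    static int runOn(ThreadBackend backend, int n) {
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < n; i++) {
            backend.spawn(done::incrementAndGet);
        }
        backend.joinAll();
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runOn(new NativeBackend(), 50));
        System.out.println(runOn(new GreenBackend(), 50));
    }
}
```

A real VM interface would also have to cover the points raised in this thread that the sketch ignores, such as blocking system calls, stack management, and the hooks used by the memory manager and OSR.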
If everything is out in the open then when a change is made everyone's design goals can be taken into consideration. If they cannot be taken into consideration then the person making the change can say on the tracker why they did it differently. If it comes to it, JIRA has a voting mechanism where people can give their feedback on the different choices that can be made.
[1] ftp://ftp.cs.man.ac.uk/pub/apt/theses/RahulMehta_MSc.pdf
IMHO, there are two distinct things going on here.
1. The design challenge of producing a modular threading system. Ian is particularly involved in this.
IMHO these immediate objectives are absolutely not in conflict, and both are meaningful. The first is a design and software engineering challenge. My experience of this is that it is generally iterative, and always informed by implementation experience. My experience has also been that these changes have typically been implementation-driven, by which I mean someone has built something which breaks abstractions and causes a major re-think (either the implementation is re-thought or the abstractions are re-thought, or often, both). The second is a straightforward engineering challenge. We don't have 1:1 working now. Many of us have ideas of what's in the way, but a lot of the issues are a matter of guesswork and conjecture until we have a bare-bones system up and running which we can measure. So does an extra indirection cost us lots? Build it and measure it. Are our data structures inappropriate? Probably, but experiment with some alternatives in a prototype before making big decisions. etc. etc. So I would hope that we can all see sense in having both objectives carry on harmoniously, in parallel with each other, and that each might be informed by the other. Remember again that the context in which this current discussion started was the SoC projects, which are by necessity short-to-medium term objectives and therefore relatively modest in scope. So those projects are not talking about (much less, defining) the grand future of Jikes RVM's threading model, but they are (should be) projects some good students could productively engage in which can take us to where we want to be a little way down the track.
Just a couple of points. In Rahul's work we'd made progress on the 1:1 threading system using the modular infrastructure.
I'm not sure we want a 1:1 thread fork. I'd rather work within the infrastructure we have and not make another branch that everyone will fail to get time to merge into the trunk, unless we can get some clear undertakings from the people involved (SoC mentor and student) that they will do the necessary leg work to get the system into the trunk - which will mean integrating with the current threading system and not just destroying it.
I don't think you need be concerned, Ian. In this particular SoC instance all players have vested long term interests in the outcomes. No one should be feeling afraid that anyone will either destroy another's work or drop their own contributions on the floor.
One tactically sensible model for getting to a solid 1:1 threading implementation is via a prototype within which discussions of performance and modularity can be concretely evaluated. Whether or not that code per se ever makes its way into the head is immaterial so long as it helps us move quickly and pragmatically toward a working implementation that is in the head. IMHO it is an important tactical/emotional leap to be able to develop code and not be precious about it, and to be prepared to see it ultimately as a platform upon which something better emerges. I think this is the model Tony et al are proposing for the 1:1 SoC work---a solid working prototype.
I really still don't get it. My belief is that the prototype system is merely a waste of time. If we want to see native threading implementations there are countless other prototypes we could look at (JamVM, Cacao, my own old Dynamite VM). The particular problem we have with the RVM is that we need to adapt the threading interface to be implementation agnostic. In large part this work has been done by moving all thread queuing code into the green thread package. A lot of things remain, OSR and the memory manager interface among them, but I fail to see how these things can be accomplished faster by starting a new threading API. I fail to see why this new API would be any more solid than the one that exists. Take our very large success at passing JSR166 TCK tests as proof of this. It seems a simply tortuous thing to do to even contemplate having another threading API.
What I want is to see the assertions that there will be some inherent advantages to the new API backed up by some real examples. It's easy to criticize what exists - the code is there, as well as part of a write-up. Without any specific example of what will be achieved by reinventing the wheel I see little point in supporting a course of action to do it.
In my experience, prototyping is invaluable. Every collector I have implemented has been done that way. I've done a rough-as-guts proof of concept and then rebuilt from scratch. Some of the collectors in MMTk now have been rebuilt from the ground up multiple times over, each time improving abstractions and engineering choices. The truth is that my intuition is too often a poor guide. Implementation and empirical evaluation is the only way I've found of making well founded engineering decisions. But this is beside the point. Tony et al are offering to create a prototype of 1:1 threading. No one loses. Nothing is destroyed.
Yes, having an implementation agnostic interface is an excellent objective and no one is saying otherwise. The 1:1 threading work is independent and complementary to this objective. It is not intended to speed up this objective and indeed does not address that objective at all. These are essentially independent goals, which is the reason for suggesting two separate projects. The 1:1 project is not an attack or critique of what you've done. It is not addressing the issue of API; it is a proof-of-concept prototype, pure and simple. Whether or not the current API is a good one is something we can all discuss separately, but that is an entirely different matter to the one at hand: identifying SoC projects. To reiterate, I see two SoC projects right now, which are well matched to two enthusiastic and capable students:
1. Modular threading. This project is a design project and a software engineering project and follows on from Rahul's MSc work. Ian is the obvious mentor, and as I understand it there is a good student in the offing.
2. 1:1 threading. This project is an implementation project and will simply develop a rough-cut implementation of 1:1 threading, identifying the Jikes RVM specific problems that stand in the way, and allow some implementation alternatives to be empirically evaluated.
I really don't see any reason why you should have concerns, Ian. The 1:1 project is not an attack on your API---the API simply doesn't figure in the project at all. Nonetheless outcomes should help us all develop a great modular implementation (by providing a platform for empirical evaluation of alternatives and tradeoffs, e.g. "Does having an indirection here slow down green threads or 1:1?" or "What is the actual cost of this data structure?", etc.). We have two motivated and experienced students. I hope we all feel happy to let them rip ahead and do some exciting work on these two non-conflicting but complementary projects.
I view the 1-1 prototype project as an important step in understanding exactly where m-n threading assumptions have leaked out into the rest of the JVM. I know there are "intrusions" in the adaptive system, in MMTk, and in the JNI and OSR implementations. There may be others that I don't know about. The whole point of the prototyping effort is to identify exactly where code has to be changed to make it work well with native threads. We have to have a complete solution, not just native threads that work as long as you don't need the adaptive system to perform well, or only if you disable OSR.
I think it's critical for us to understand how to make all of those subsystems work well in a 1-1 threading implementation, so we can understand what new abstractions need to be added to the threading API to get modular threading that supports both native threading and more esoteric models well.
This issue is now Filip Pizlo's GSoC project. Filip needs a JIRA account so that this issue can be assigned to him.
Native threading is committed in r15395. I'm keeping this issue open until the sub-tasks are closed.