Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
- Affects Version/s: 6.0.1
- Fix Version/s: 6.1.0rc2
- Component/s: None
- Labels: None
- Number of attachments: 3
Description
We are using Jetty in embedded mode in Hadoop (http://lucene.apache.org/hadoop/). Hadoop is a framework for parallel computation on large, distributed data sets, and it runs on clusters of hundreds of nodes. We need Jetty mainly to serve JSP pages (for the UI) and static files (each node collects and aggregates the outputs of computations from the other nodes). So we run Jetty on every node (around 350 nodes), and every node requests outputs from all other nodes, which is basically all-to-all HTTP communication. Any one node might serve on the order of 10 GB of static data.
Until a few days ago we were using Jetty 5.1.4, and it worked fine for our requirements. We recently moved to Jetty 6.0.1 (stable) and ran into strange problems with static file serving. Jetty becomes extremely slow after serving a few hundred files and never recovers from this state. A thread dump shows all the Jetty worker threads reading from a socket as part of parsing HTTP requests:
"btpool0-2" prio=1 tid=0x081bbd70 nid=0x147d runnable [0x4dbfa000..0x4dbfb130]
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at org.mortbay.io.ByteArrayBuffer.readFrom(ByteArrayBuffer.java:168)
at org.mortbay.io.bio.StreamEndPoint.fill(StreamEndPoint.java:98)
at org.mortbay.jetty.bio.SocketConnector$Connection.fill(SocketConnector.java:181)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:263)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:193)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:339)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:208)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:475)
All clients start timing out when connecting to the HTTP ports, and everything becomes extremely slow. The weird thing is that no such problem happens with Jetty6 although the http client code is exactly the same and so on. I am also attaching the code that we use to start a Jetty server in embedded mode (the methods there are invoked from another class).
- jetty-6.1-SNAPSHOT.jar
- 21/Nov/06 6:13 AM
- 445 kB
- Greg Wilkins
-
- StatusHttpServer.java
- 17/Nov/06 5:23 AM
- 8 kB
- Devaraj Das
-
- thread-dump-with-selectorconnector
- 17/Nov/06 6:43 AM
- 64 kB
- Devaraj Das
Activity
That thread dump shows the normal place where threads "sleep" when you use the blocking
connector (which you are).
However, we have had several reports of static file serving not behaving as it should, plus
it was rather ugly code anyway.
I have done a moderate refactor of the static resource code, and it is now checked into
both the 6.0 and 6.1 branches. I will do a 6.1 release today, and if there are no reports of problems,
I will release 6.0 early next week.
Of course this may not actually fix your issue, but it is a good first step.
Here is a thread dump. In the constructor of the StatusHttpServer class (that I attached earlier), I am using SelectChannelConnector; these are the first few lines of the constructor:
webServer = new org.mortbay.jetty.Server();
//use selectchannelconnector
Connector connector=new SelectChannelConnector();
connector.setPort(port);
webServer.setConnectors(new Connector[]{connector});
Performance-wise I don't see any difference between this and the bio connector.
Also, this is the client code (for the nio impl of Jetty's connector). Is this the right way to get the response? From the stack trace that appears below, it looks like there is some problem to do with closing of connections.
try {
URLConnection connection = path.openConnection();
InputStream input = connection.getInputStream();
try {
int totalLen = connection.getContentLength();
int readLen;
try {
byte[] buffer = new byte[64 * 1024];
int len = input.read(buffer);
while (totalBytes < totalLen) {
if(len > 0)
}
len = input.read(buffer);
}
} finally
} finally
{ input.close(); }
I am getting a lot of exceptions of the form:
2006-11-17 04:49:37,013 ERROR org.mortbay.log: /mapOutput:
java.lang.IllegalStateException: Committed
at org.mortbay.jetty.Response.resetBuffer(Response.java:853)
at org.mortbay.jetty.Response.reset(Response.java:832)
at org.mortbay.jetty.Response.sendError(Response.java:220)
at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:1447)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:689)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:445)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:356)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:226)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:627)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:149)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:141)
at org.mortbay.jetty.Server.handle(Server.java:269)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:430)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:687)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:492)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:199)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:339)
at org.mortbay.jetty.nio.HttpChannelEndPoint.run(HttpChannelEndPoint.java:270)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:475)
Caused by: java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:125)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:294)
at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:163)
at org.mortbay.jetty.nio.HttpChannelEndPoint.flush(HttpChannelEndPoint.java:150)
at org.mortbay.jetty.HttpGenerator.flushBuffers(HttpGenerator.java:789)
... 21 more
Firstly, I can see nothing wrong with your client code. However, I don't think it is necessary for you to
close the input streams yourself - I would leave that to the HTTP client code.
Could you try the 6.1.0pre1 release that I have just made? It contains an improved
resource cache and many improvements in the communication layer.
While it does not address your problems directly, it may fix them, and if not, it is best to resolve
your issues on the most up-to-date code base.
cheers
I tried Hadoop with the 6.1.0pre1 release, but unfortunately the problems (the exceptions that I reported earlier) are still there with this version. By the way, here is the servlet code that is used for serving these static files:
public static class MapOutputServlet extends HttpServlet {
public void doGet(HttpServletRequest request,
HttpServletResponse response
) throws ServletException, IOException {
String mapId = request.getParameter("map");
String reduceId = request.getParameter("reduce");
if (mapId == null || reduceId == null)
ServletContext context = getServletContext();
int reduce = Integer.parseInt(reduceId);
byte[] buffer = new byte[64*1024];
OutputStream outStream = response.getOutputStream();
JobConf conf = (JobConf) context.getAttribute("conf");
FileSystem fileSys =
(FileSystem) context.getAttribute("local.file.system");
Path filename = conf.getLocalPath(mapId+"/part-"+reduce+".out");
response.setContentLength((int) fileSys.getLength(filename));
InputStream inStream = null;
// true iff IOException was caused by attempt to access input
boolean isInputException = true;
try {
inStream = fileSys.open(filename);
try {
int len = inStream.read(buffer);
while (len > 0) {
try {
outStream.write(buffer, 0, len);
} catch (IOException ie) {
isInputException = false;
throw ie;
}
len = inStream.read(buffer);
}
} finally {
inStream.close();
}
} catch (IOException ie) {
TaskTracker tracker =
(TaskTracker) context.getAttribute("task.tracker");
Log log = (Log) context.getAttribute("log");
String errorMsg = ("getMapOutput(" + mapId + "," + reduceId +
") failed :\n"+
StringUtils.stringifyException(ie));
log.warn(errorMsg);
if (isInputException)
response.sendError(HttpServletResponse.SC_GONE, errorMsg);
throw ie;
}
outStream.close();
}
}
Please let me know whether this looks fine.
OK - this explains a lot.
The exception is being thrown because your code tried to call response.sendError() when the
original exception was that the output had been closed.
This is normal.
What may or may not be normal is that the stream is being closed in the first place.
Sometimes this is normal too, but looking at your client code, I don't expect it to happen.
How often does this exception occur? On all requests? Some requests? Every so often?
On all requests from some clients?
Is it fatal? That is, are you seeing indications on your clients that the complete content was not received?
Why are you writing your own static content? It would be much better to use the default servlet, with its cache and shortcuts for writing
data fast. At the very least, I can give you some code that will let you send file-mapped buffers rather than do all that IO yourself.
Interested?
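On the "Committed" exception discussed above: the servlet API offers HttpServletResponse.isCommitted(), which can be checked before calling sendError(). The sketch below shows that guard in isolation. So that it compiles without a servlet container, it uses a hypothetical MiniResponse interface and a toy FakeResponse in place of the real HttpServletResponse; only isCommitted() and sendError() correspond to real servlet methods, and every other name is an assumption made for illustration.

```java
final class CommittedGuardDemo {
    /** Hypothetical stand-in for the two HttpServletResponse methods used here. */
    interface MiniResponse {
        boolean isCommitted();
        void sendError(int status, String message);
    }

    /**
     * Sends an error only while the status line and headers can still change.
     * Returns true if the error was sent, false if the response was already committed.
     */
    static boolean sendErrorIfPossible(MiniResponse response, int status, String message) {
        if (response.isCommitted()) {
            return false; // too late: part of the response is already on the wire
        }
        response.sendError(status, message);
        return true;
    }

    /** Toy response that mimics a container: sendError() after commit throws. */
    static final class FakeResponse implements MiniResponse {
        boolean committed;
        int status = 200;

        public boolean isCommitted() {
            return committed;
        }

        public void sendError(int status, String message) {
            if (committed) {
                throw new IllegalStateException("Committed");
            }
            this.status = status;
            committed = true;
        }
    }

    public static void main(String[] args) {
        FakeResponse fresh = new FakeResponse();
        System.out.println(sendErrorIfPossible(fresh, 410, "gone"));

        FakeResponse alreadyStreaming = new FakeResponse();
        alreadyStreaming.committed = true;
        System.out.println(sendErrorIfPossible(alreadyStreaming, 410, "gone"));
    }
}
```

With a guard like this, a failure in the middle of streaming a file would be logged and rethrown without the secondary IllegalStateException hiding the original cause.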
To efficiently write a file, try the following:
File file = ..... // get your file from somewhere
NIOBuffer buffer = new NIOBuffer(file);
((HttpConnection.Output)response.getOutputStream()).sendContent(buffer);
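The NIOBuffer and sendContent() calls above are Jetty 6 internals. As a rough standard-library analogue (a swapped-in technique, not necessarily what Jetty does internally), FileChannel.transferTo() can copy a file to an output without a hand-written read/write loop over a byte array. The class and method names below (FileSender, send) are made up for this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class FileSender {
    /** Copies the whole file to out and returns the number of bytes sent. */
    static long send(Path file, OutputStream out) throws IOException {
        try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
            // The wrapper channel is deliberately not closed: the caller owns out.
            WritableByteChannel target = Channels.newChannel(out);
            long size = in.size();
            long sent = 0;
            while (sent < size) {
                long n = in.transferTo(sent, size - sent, target);
                if (n <= 0) {
                    break; // nothing more could be transferred, e.g. the file shrank
                }
                sent += n;
            }
            return sent;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sender-demo", ".txt");
        Files.write(tmp, "hello from a file\r\n".getBytes("ISO-8859-1"));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long sent = send(tmp, out);
        System.out.println("sent " + sent + " bytes");
        Files.delete(tmp);
    }
}
```

In a servlet, out would be the response output stream, and the content length would be set from the file size before the first byte is written.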
The exception occurs pretty frequently. I don't think it has anything to do with specific clients. I am assuming that the code you just posted is what I should use for writing the static content as the response. Let me see if it solves the problems.
I tried what you suggested (the NIOBuffer approach). With a regular socket connector (bio), things became a bit faster, but even then the data transfer rate was not as good as it was with Jetty 5.1.4.
However, with the server using SelectChannelConnector, strange things happen. The client always gets more data than the actual file size!! I also get the exceptions that I pasted below. I don't know why...
By the way, caching does not matter much in our use case (assuming that Jetty's cache optimizes serving files that are requested more than once). Each static file needs to be served just once to one client, and the same file will not be requested by any other client. Did I understand the notion of caching correctly?
Here is the exception trace that I see once in a while:
144113 [btpool0-1 - Acceptor0 SelectChannelConnector @ 0.0.0.0:7030] WARN org.mortbay.log - EXCEPTION
org.mortbay.jetty.EofException
at org.mortbay.io.nio.SelectChannelEndPoint.close(SelectChannelEndPoint.java:371)
at org.mortbay.jetty.nio.SelectChannelConnector$ConnectorEndPoint.close(SelectChannelConnector.java:262)
at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:327)
at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:73)
at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:120)
at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:492)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:475)
Caused by: java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:590)
at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
at org.mortbay.io.nio.ChannelEndPoint.close(ChannelEndPoint.java:98)
at org.mortbay.io.nio.SelectChannelEndPoint.close(SelectChannelEndPoint.java:367)
... 6 more
144115 [btpool0-1 - Acceptor0 SelectChannelConnector @ 0.0.0.0:7030] ERROR org.mortbay.log - EXCEPTION
java.nio.channels.CancelledKeyException
at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:64)
at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:386)
at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:73)
at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:120)
at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:492)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:475)
> The client always gets more data than the actual file size!!
Just to clarify, I explicitly set the content length of the content before I send the content using NIOBuffer. The client sees the correct content length but always ends up getting more data (in the order of 25k bytes in my setup where we serve files of sizes ~130k). The client code looks like (this same client works fine with the socketconnector server):
int totalLen = connection.getContentLength();
int len = input.read(buffer);
while (len > 0)
At the end, it happens that: totalBytes > totalLen.
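One way to observe this symptom on the client side is to read exactly Content-Length bytes and then check whether the stream still has data. The following is a minimal sketch under the assumption that the raw response body is available as a plain InputStream; BoundedBodyReader and its methods are hypothetical names, not part of the Hadoop client.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class BoundedBodyReader {
    /**
     * Reads up to expectedLen bytes into sink (which must hold at least expectedLen bytes).
     * Returns the number of bytes actually read, which is smaller if the peer closed early.
     */
    static int readBody(InputStream input, int expectedLen, byte[] sink) throws IOException {
        int total = 0;
        while (total < expectedLen) {
            int len = input.read(sink, total, expectedLen - total);
            if (len < 0) {
                break; // end of stream before the announced length
            }
            total += len;
        }
        return total;
    }

    /**
     * Returns true if at least one more byte follows the body.
     * On a live socket this call blocks until data arrives, the peer closes, or a timeout fires.
     */
    static boolean hasTrailingBytes(InputStream input) throws IOException {
        return input.read() != -1;
    }

    public static void main(String[] args) throws IOException {
        // Simulate a server that announced 100 bytes but sent 150.
        InputStream wire = new ByteArrayInputStream(new byte[150]);
        int got = readBody(wire, 100, new byte[100]);
        System.out.println("read " + got + ", trailing bytes: " + hasTrailingBytes(wire));
    }
}
```

Bounding the loop by the announced length, rather than reading until read() returns a non-positive value, keeps surplus bytes out of the saved output and makes the mismatch easy to log.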
Oh my lordy lordy..... found a horrid bug with NIOBuffers!
It was not clearing the buffer content, so subsequent GETs would send the content twice, three times, etc.
It was only saved by browsers getting sick of it and closing the connection.
Because you have your own client - you don't have this saving grace.
I will shortly attach a fixed jar to this issue and would really appreciate it if you could test it!
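The thread does not show the actual Jetty fix. As a generic illustration of this class of bug (using java.nio.ByteBuffer from the standard library, not Jetty's NIOBuffer), the sketch below reuses one buffer for several responses and shows how skipping clear() makes each response carry the previous content as well. ReusedBufferDemo and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ReusedBufferDemo {
    // One buffer shared by all responses, as a pooled or cached buffer would be.
    private final ByteBuffer shared = ByteBuffer.allocate(1024);

    /** Buggy: appends after whatever earlier responses left in the buffer. */
    byte[] respondWithoutClear(String body) {
        shared.put(body.getBytes(StandardCharsets.US_ASCII));
        return snapshot();
    }

    /** Fixed: resets position and limit before writing the new content. */
    byte[] respondWithClear(String body) {
        shared.clear();
        shared.put(body.getBytes(StandardCharsets.US_ASCII));
        return snapshot();
    }

    /** Copies everything written so far without disturbing the shared buffer's position. */
    private byte[] snapshot() {
        ByteBuffer view = shared.duplicate();
        view.flip();
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        ReusedBufferDemo demo = new ReusedBufferDemo();
        System.out.println(demo.respondWithoutClear("abc").length); // 3
        System.out.println(demo.respondWithoutClear("abc").length); // 6: old content sent again
        System.out.println(demo.respondWithClear("abc").length);    // 3
    }
}
```

A client that trusts Content-Length and keeps reading until the stream ends would see exactly the kind of surplus bytes reported earlier in this issue.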
I have attached the jar. Please try it.
Also, here is the code that I am using to serve files:
package org.mortbay.jetty.example;
import java.io.File;
public class RandomFileHandler extends AbstractHandler
{
public void handle(String target, HttpServletRequest request, HttpServletResponse response, int dispatch) throws IOException, ServletException
{
System.err.println("RandomFileHandler");
Request base_request=(request instanceof Request)?(Request)request:HttpConnection.getCurrentConnection().getRequest();
base_request.setHandled(true);
String s=request.getParameter("size");
int size=s==null?1024:Integer.parseInt(s);
File file=File.createTempFile("test",".txt");
FileOutputStream out=new FileOutputStream(file);
// file.deleteOnExit();
long len=size-2;
byte[] data="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ\r\n".getBytes("ISO_8859_1");
while (len>0)
{
int n=(int)Math.min(len,data.length);
out.write(data,0,n);
len-=n;
}
out.write("\r\n".getBytes());
out.close();
System.err.println("Created file "+file+" of size "+file.length());
if (file.length()!=size)
throw new IllegalStateException("WRONG SIZE? "+file.length());
response.setContentType("text/plain");
// response.setContentLength(size);
response.setStatus(HttpServletResponse.SC_OK);
NIOBuffer buffer=new NIOBuffer(file);
((HttpConnection.Output)response.getOutputStream()).sendContent(buffer);
System.err.println("sent content "+file);
}
public static void main(String[] args) throws Exception
{
Server server=new Server();
Connector connector=new SelectChannelConnector();
connector.setPort(8080);
server.setConnectors(new Connector[]{connector});
Handler handler=new RandomFileHandler();
server.setHandler(handler);
server.start();
server.join();
}
}
Sorry, the new bug-fixed Jetty still seems to have some bugs that show up in our environment. Most likely the clients get bad data from the Jetty servers, and this leads to a lot of failures in the Hadoop framework later on. I have not had the time to analyze it yet, though.
The short term decision of the Hadoop team is to roll back to the earlier version - Jetty5.1.4.
From your comment, does that mean that the static content fix worked, but that there were other issues?
Or is there still trouble with the static content?
Thanks for helping with these issues, and sorry to cause you extra development cycles.
Yeah the fix worked from the point of view that I stopped getting the exceptions. So I am assuming that the content gets delivered. But there are other problems later on (which I have not yet debugged) and these problems don't appear if we run Hadoop with Jetty5.1.4.
I think the specific issue was solved.
Please open new JIRAs for any additional issues.
>The weird thing is that no such problem happens with Jetty6 although
>the http client code is exactly the same and so on.
Ooops. I meant "no such problem with Jetty5.1.4".