Details
- Type: Improvement
- Status: Closed
- Priority: Minor
- Resolution: Fixed
- Affects Version/s: 1.3.2
- Fix Version/s: 1.3.3
- Labels: None
- Environment: N/A
- Number of attachments :
Description
This results in an unacceptably long parse time for large
HTML files, because the file seems to be read one byte at a time,
incurring a native call on each read.
On my machine, disk utilization is very high as a result.
I discovered this because my checkstyle-report.html
is 40 MB in size. The workaround is simple (do not include the
linkcheck report in the site generation), but everyone
would benefit from faster parsing of big HTML files, so I
submitted this issue.
Index: src/main/org/apache/maven/linkcheck/FileToCheck.java
===================================================================
RCS file: /home/cvspublic/maven-plugins/linkcheck/src/main/org/apache/maven/linkcheck/FileToCheck.java,v
retrieving revision 1.17
diff -u -r1.17 FileToCheck.java
--- src/main/org/apache/maven/linkcheck/FileToCheck.java 1 Aug 2004 22:23:33 -0000 1.17
+++ src/main/org/apache/maven/linkcheck/FileToCheck.java 17 Aug 2004 13:26:32 -0000
@@ -17,6 +17,7 @@
 * ====================================================================
*/
+import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
@@ -146,13 +147,13 @@
{
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PrintWriter errOut = new PrintWriter(baos);
- FileInputStream in = new FileInputStream(fileToCheck);
+ BufferedInputStream bin = new BufferedInputStream(new FileInputStream(fileToCheck));
try
{
Tidy tidy = getTidy();
tidy.setErrout(errOut);
LOG.debug("Processing:" + fileToCheck);
- org.w3c.dom.Document domDocument = tidy.parseDOM(in, null);
+ org.w3c.dom.Document domDocument = tidy.parseDOM(bin, null);
// now read a dom4j document from
// JTidy's W3C DOM object
@@ -165,7 +166,7 @@
}
finally
{
- close(in);
+ close(bin);
close(baos);
}
}
patch file.
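The effect of the patch can be illustrated with a small standalone sketch. It is not part of the patch or of the linkcheck plugin: the class name BufferedReadSketch and the helper CountingInputStream are hypothetical, and an in-memory ByteArrayInputStream stands in for the FileInputStream so the example runs without a large file. It assumes the parser pulls bytes with single-byte read() calls, as the description suspects, and counts how many calls reach the underlying stream with and without a BufferedInputStream in between.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedReadSketch {

    /** Hypothetical helper: counts every read call that reaches the wrapped stream. */
    static final class CountingInputStream extends InputStream {
        private final InputStream delegate;
        long calls = 0;

        CountingInputStream(InputStream delegate) {
            this.delegate = delegate;
        }

        @Override
        public int read() throws IOException {
            calls++;
            return delegate.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return delegate.read(b, off, len);
        }
    }

    /** Consumes the stream one byte at a time and returns the number of bytes read. */
    static long drainByteByByte(InputStream in) throws IOException {
        long n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000]; // stand-in for a large HTML report

        // Unbuffered: every single-byte read goes straight to the source.
        CountingInputStream raw = new CountingInputStream(new ByteArrayInputStream(data));
        drainByteByByte(raw);

        // Buffered: the source is only touched when the internal buffer needs refilling.
        CountingInputStream wrapped = new CountingInputStream(new ByteArrayInputStream(data));
        try (InputStream bin = new BufferedInputStream(wrapped)) {
            drainByteByByte(bin);
        }

        System.out.println("calls reaching the source, unbuffered: " + raw.calls);
        System.out.println("calls reaching the source, buffered:   " + wrapped.calls);
    }
}
```

With a real FileInputStream, each call counted here would correspond to a native read, which is why wrapping the stream should reduce the overhead reported above. The actual gain depends on the buffer size and on how the parser reads its input.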