Maven 1.x LinkCheck Plugin
MPLINKCHECK-15

[PATCH] FileToCheck does not use BufferedInputStream

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.3.2
    • Fix Version/s: 1.3.3
    • Labels: None
    • Environment: N/A
    • Number of attachments: 1

      Description

      Because FileToCheck reads the file through an unbuffered
      FileInputStream, parsing large HTML files takes unacceptably
      long: the file seems to be read one byte at a time, incurring
      a native call for each read. On my machine this also causes
      very high disk utilization. I discovered the problem because
      my checkstyle-report.html is 40 MB. The workaround is simple
      (do not include the linkcheck report in the site generation),
      but everyone would benefit from faster parsing of large HTML
      files, so I submitted this issue.

      Index: src/main/org/apache/maven/linkcheck/FileToCheck.java
      ===================================================================
      RCS file: /home/cvspublic/maven-plugins/linkcheck/src/main/org/apache/maven/linkcheck/FileToCheck.java,v
      retrieving revision 1.17
      diff -u -r1.17 FileToCheck.java
      --- src/main/org/apache/maven/linkcheck/FileToCheck.java 1 Aug 2004 22:23:33 -0000 1.17
      +++ src/main/org/apache/maven/linkcheck/FileToCheck.java 17 Aug 2004 13:26:32 -0000
      @@ -17,6 +17,7 @@
        * ====================================================================
        */
       
      +import java.io.BufferedInputStream;
       import java.io.ByteArrayOutputStream;
       import java.io.File;
       import java.io.FileInputStream;
      @@ -146,13 +147,13 @@
           {
               ByteArrayOutputStream baos = new ByteArrayOutputStream();
               PrintWriter errOut = new PrintWriter(baos);
      -        FileInputStream in = new FileInputStream(fileToCheck);
      +        BufferedInputStream bin = new BufferedInputStream(new FileInputStream(fileToCheck));
               try
               {
                   Tidy tidy = getTidy();
                   tidy.setErrout(errOut);
                   LOG.debug("Processing:" + fileToCheck);
      -            org.w3c.dom.Document domDocument = tidy.parseDOM(in, null);
      +            org.w3c.dom.Document domDocument = tidy.parseDOM(bin, null);
       
                   // now read a dom4j document from
                   // JTidy's W3C DOM object
      @@ -165,7 +166,7 @@
               }
               finally
               {
      -            close(in);
      +            close(bin);
                   close(baos);
               }
           }
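
      The effect of the patch can be illustrated with a small,
      self-contained sketch. It is not part of the plugin: the class
      BufferedReadDemo, the helper CountingInputStream and the method
      countUnderlyingReads are hypothetical names introduced here, and
      an in-memory ByteArrayInputStream stands in for the
      FileInputStream so the example runs without a file. The sketch
      assumes the parser consumes its input one byte at a time, as the
      description suspects, and counts how many read calls actually
      reach the underlying source with and without a
      BufferedInputStream in between.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedReadDemo {

    /** Hypothetical helper: counts every read call that reaches the wrapped stream. */
    static final class CountingInputStream extends FilterInputStream {
        long calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /**
     * Consumes the data one byte at a time and returns how many read
     * calls hit the underlying (counted) source.
     */
    public static long countUnderlyingReads(byte[] data, boolean buffered) throws IOException {
        CountingInputStream source = new CountingInputStream(new ByteArrayInputStream(data));
        InputStream in = buffered ? new BufferedInputStream(source) : source;
        try {
            while (in.read() != -1) {
                // one byte per call, as a byte-oriented parser might do
            }
        } finally {
            // closing the outermost stream also closes the wrapped one
            in.close();
        }
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        // 1 MiB of zero bytes stands in for a large HTML report
        byte[] data = new byte[1 << 20];
        System.out.println("unbuffered reads: " + countUnderlyingReads(data, false));
        System.out.println("buffered reads:   " + countUnderlyingReads(data, true));
    }
}
```

      Without the buffer, every byte requested by the consumer turns
      into one call on the source; with it, the source is only asked
      for a block of bytes each time the buffer runs empty. With a
      real FileInputStream each of those calls is a native read, which
      is the overhead the patch is meant to remove.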

      1. patch.txt
        2 kB
        Stephane Mikaty

        Activity

        Stephane Mikaty added a comment - patch file.
        Carlos Sanchez added a comment - Fixed. Thanks

          People

          • Assignee: Carlos Sanchez
          • Reporter: Stephane Mikaty
          • Votes: 0
          • Watchers: 2

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              • Original Estimate: 10 minutes
              • Remaining Estimate: 10 minutes
              • Time Spent: Not Specified