Details
Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 1.3.4
Fix Version/s: 1.4
Labels: None
Description
At the moment, the linkcheck plugin uses JTidy and XPath to retrieve all links. IMHO regexps would work much faster and better than the JTidy/XPath combination.
The following regexp would be a replacement for the XPath expressions:
<(?>link|a|img|script)[^>]*?(?>href|src)\s*?=\s*?[\"'](.*?)[\"'][^>]*?
All tests pass with this regexp. In the ws-jaxme project, running maven-linkcheck-plugin:clearcache maven-linkcheck-plugin:report-real gives these results:
with jtidy/xpath: Total time: 2 minutes 43 seconds
with regexps: Total time: 1 minutes 10 seconds
I am sure some regexp guru can improve the performance of this.
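To illustrate the regexp approach, here is a minimal sketch of link extraction with java.util.regex. The pattern is the one proposed above, written with lazy `*?` quantifiers and Java string escaping. The class name LinkExtractor, the method extractLinks, and the sample HTML are hypothetical and not taken from the plugin source; skipping mailto links is an assumption that follows the opinion stated in this issue, not confirmed plugin behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the linkcheck plugin.
class LinkExtractor {
    // Proposed pattern; group 1 captures the value of the href or src attribute.
    private static final Pattern LINK_PATTERN = Pattern.compile(
        "<(?>link|a|img|script)[^>]*?(?>href|src)\\s*?=\\s*?[\"'](.*?)[\"'][^>]*?",
        Pattern.CASE_INSENSITIVE);

    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK_PATTERN.matcher(html);
        while (matcher.find()) {
            String link = matcher.group(1);
            // Assumption: mailto links are not checkable, so they are skipped.
            if (link.regionMatches(true, 0, "mailto:", 0, 7)) {
                continue;
            }
            links.add(link);
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"index.html\">Home</a> <img src='logo.png'/> "
            + "<a href=\"mailto:dev@example.org\">Mail</a>";
        System.out.println(extractLinks(html));
    }
}
```

Compiling the pattern once into a static field avoids recompiling it for every file, which matters when a report scans many pages.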
I have a question, though. Are mailto links supposed to count as checkable? IMO no.
PS: Also, IMO the createDocument method from LinkCheck should be wrapped in a try/finally block.
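The body of createDocument is not shown in this issue, so the following is only a sketch of the try/finally suggestion. It assumes the method reads from a stream that must be closed even when parsing fails. TryFinallySketch, TrackingStream, and parse are hypothetical stand-ins, not plugin code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not the actual LinkCheck implementation.
class TryFinallySketch {
    // Stand-in stream that records whether close() was called.
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(String content) {
            super(content.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Stand-in for the real parsing step; fails on empty input.
    static String parse(InputStream in) throws IOException {
        if (in.read() == -1) {
            throw new IOException("empty document");
        }
        return "parsed";
    }

    // The finally block closes the stream whether parse() returns or throws.
    static String createDocument(InputStream in) throws IOException {
        try {
            return parse(in);
        } finally {
            in.close();
        }
    }
}
```

On Java 7 and later, a try-with-resources statement gives the same guarantee with less code.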
Issue Links
- is related to MPLINKCHECK-20: StackOverflowError
-
Looks good. I'll take a closer look when I have more time.