Sonar Plugins / SONARPLUGINS-1596

Remove the need for PHPCPD

Details

  • Type: Improvement
  • Status: Closed
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: PHP-0.6
  • Fix Version/s: PHP-1.0
  • Component/s: PHP
  • Labels: None
  • Number of attachments: 0

Description

PHPCPD suffers from limitations (for instance, it cannot ignore 'use' directives), and the plugin's current properties (min-lines=3 and min-tokens=5) produce too many false positives.

To prepare for a move to the Sonar CPD engine (which will offer an extension point for other languages as of Sonar 2.14), we decided to use the Java PMD-CPD tool for duplication detection. This removes the need for the PHPCPD tool, which will improve the results while making the PHP plugin easier to install (one step fewer).

Activity

Fabrice Bellingard added a comment - 30/Dec/11 7:10 AM

After spending quite some time testing several values for the two parameters on the Symfony framework, I got the best results with "min-lines=4" and "min-tokens=15".

For information:

  • "min-lines=3" gave too many false positives on "namespace" and "use" directives ("4" seems to exclude most of them, but the real problem is on the PHPCPD side, which should not take those directives into consideration)
  • "min-tokens=5" was too low and detected chunks of code like:
    private function addFormSection(ArrayNodeDefinition $rootNode)
    {
      $rootNode
             ->children()
                    ->arrayNode('form')
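
To illustrate why raising the two thresholds filters out "namespace"/"use" headers, here is a minimal Python sketch. It is not PHPCPD's or PMD-CPD's actual logic: the tokenizer, the `passes_thresholds` helper, and the rule that a duplicated chunk must satisfy both thresholds are assumptions made purely for illustration.

```python
import re

# Hypothetical, simplified tokenizer (an assumption, not PHPCPD's real one):
# identifiers/variables, the '->' operator, and any other single symbol.
TOKEN_RE = re.compile(r"\$?\w+|->|\S")


def passes_thresholds(chunk, min_lines, min_tokens):
    """Return True if a duplicated chunk is large enough to be reported.

    Assumes a chunk must reach BOTH the line and the token threshold.
    """
    lines = [line for line in chunk.splitlines() if line.strip()]
    tokens = TOKEN_RE.findall(chunk)
    return len(lines) >= min_lines and len(tokens) >= min_tokens


# A file header that many PHP files share (3 lines, 9 tokens here).
header = "namespace Acme;\nuse Foo;\nuse Bar;\n"

old_settings = passes_thresholds(header, min_lines=3, min_tokens=5)
new_settings = passes_thresholds(header, min_lines=4, min_tokens=15)
print(old_settings, new_settings)
```

Under these assumptions the shared header counts as a duplicate with the old settings and is ignored with the new ones; a real tokenizer would count tokens differently.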
    
Fabrice Bellingard added a comment - 02/Jan/12 10:21 AM

As we decided to get rid of PHPCPD, I renamed this issue, since the previous title no longer made sense.

Fabrice Bellingard added a comment - 02/Jan/12 10:24 AM

PHPCPD removed on revision 4893.

Evgeny Mandrikov added a comment - 04/Jan/12 5:56 AM - edited

Fabrice, could you please consider using the following regular expression to support the Heredoc and Nowdoc string syntaxes:

<<<(['"]?)(IDENTIFIER)+\1[\s\S]*?\NEWLINE\2

where IDENTIFIER is another regular expression that conforms to the naming rules for labels in PHP,
and NEWLINE is something like "(?:\n\r|\r|\n)" to match the start of a new line; see http://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.heredoc.
I expect it will not hurt performance, because it uses a reluctant quantifier.
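
As a rough sketch of how such a pattern behaves, the Python snippet below assembles it from two sub-patterns and runs it on a small PHP fragment. Several details are assumptions rather than the plugin's actual code: IDENTIFIER is an ASCII-only simplification of PHP's label rule, NEWLINE uses "\r\n" for Windows line endings, NEWLINE is substituted without a leading backslash, and the "+" after the identifier group is dropped because the sub-pattern already matches a whole label.

```python
import re

# Assumed sub-patterns (ASCII-only simplification of PHP label rules).
IDENTIFIER = r"[A-Za-z_][A-Za-z0-9_]*"
NEWLINE = r"(?:\r\n|\r|\n)"

# Group 1: optional quote around the label (a single quote means Nowdoc).
# Group 2: the label itself, which must reappear at the start of a line.
# [\s\S]*? is reluctant, so the match stops at the first closing label.
HEREDOC_RE = re.compile(
    r"<<<(['\"]?)(" + IDENTIFIER + r")\1[\s\S]*?" + NEWLINE + r"\2"
)


def find_heredocs(php_source):
    """Return every Heredoc/Nowdoc literal found in the given PHP source."""
    return [m.group(0) for m in HEREDOC_RE.finditer(php_source)]


php = (
    "$a = <<<EOT\nHello $name\nEOT;\n"
    "$b = <<<'RAW'\nNo $interpolation\nRAW;\n"
)
blocks = find_heredocs(php)
for block in blocks:
    print(repr(block))
```

Matching each Heredoc or Nowdoc as one unit lets a duplication tokenizer treat the whole literal as a single string token instead of tokenizing its contents.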

Fabrice Bellingard added a comment - 04/Jan/12 10:13 AM

Done, thanks Evgeny!
(FYI, your regexp was almost perfect! I just had to remove the backslash before 'NEWLINE'.)

Evgeny Mandrikov added a comment - 04/Jan/12 10:24 AM

You're welcome! And mistakes were to be expected, as I wrote it from memory.


People

  • Assignee: Fabrice Bellingard
  • Reporter: Fabrice Bellingard

Dates

  • Created: 30/Dec/11 4:02 AM
  • Updated: 16/Jan/12 5:22 AM
  • Resolved: 04/Jan/12 10:26 AM