Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6
    • Fix Version/s: 1.0
    • Component/s: None
    • Labels: None
    • Number of attachments: 0

      Description

      PHPCPD has limitations (for instance, it can't ignore 'use' directives), and the plugin's current settings (min-lines=3 and min-tokens=5) produce too many false positives.

      To prepare a move to the Sonar CPD engine (which will have an extension point for other languages as of Sonar 2.14), we decided to use the Java PMD-CPD tool for duplication detection. This removes the need for the PHPCPD tool, which will improve the results while making the overall installation of the PHP plugin easier (one step fewer).

        Activity

        Fabrice Bellingard added a comment -

        After spending quite some time testing several values for the two parameters on the Symfony framework, the best results I could get were with "min-lines=4" and "min-tokens=15".

        For information:

        • "min-lines=3" was giving too many false positives on "namespace" and "use" directives ("4" seems to exclude most of them, but the real problem is on the PHPCPD side, which should not take those directives into consideration)
        • "min-tokens=5" was too low and was flagging chunks of code like:
          private function addFormSection(ArrayNodeDefinition $rootNode)
          {
            $rootNode
                   ->children()
                          ->arrayNode('form')
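
To illustrate the "min-lines" point above, here is a minimal sketch of a line-based duplicate check. It is a deliberate simplification: PHPCPD and PMD-CPD work on tokens, not raw lines, and the function names, the `Acme\DemoBundle` namespace and the two file fragments are hypothetical. It only shows why a shared three-line header of "namespace" and "use" directives is reported at min-lines=3 but not at min-lines=4.

```python
def longest_common_run(lines_a, lines_b):
    """Length of the longest run of identical consecutive non-blank lines."""
    best = 0
    # prev[j] = length of the common run ending at the previous line of A
    # and at line j-1 of B
    prev = [0] * (len(lines_b) + 1)
    for a in lines_a:
        cur = [0] * (len(lines_b) + 1)
        for j, b in enumerate(lines_b, start=1):
            if a.strip() and a.strip() == b.strip():
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def is_reported(lines_a, lines_b, min_lines):
    """True if the shared run is long enough to be flagged as a duplication."""
    return longest_common_run(lines_a, lines_b) >= min_lines


# Hypothetical fragments: same namespace and imports, different classes.
FILE_A = [
    r"namespace Acme\DemoBundle;",
    r"use Symfony\Component\Config\Definition\Builder\ArrayNodeDefinition;",
    r"use Symfony\Component\Config\Definition\Builder\TreeBuilder;",
    "class Configuration {}",
]
FILE_B = [
    r"namespace Acme\DemoBundle;",
    r"use Symfony\Component\Config\Definition\Builder\ArrayNodeDefinition;",
    r"use Symfony\Component\Config\Definition\Builder\TreeBuilder;",
    "class Extension {}",
]

print(is_reported(FILE_A, FILE_B, min_lines=3))  # True
print(is_reported(FILE_A, FILE_B, min_lines=4))  # False
```

Raising the threshold hides this particular false positive, but it does not address the underlying cause, which is that the directives are compared at all.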
          
        Fabrice Bellingard added a comment -

        As we decided to get rid of PHPCPD, I am renaming this issue, since the title no longer makes sense.

        Fabrice Bellingard added a comment -

        PHPCPD removed on revision 4893.

        Evgeny Mandrikov added a comment - edited

        Fabrice, could you please consider using the following regular expression to support the Heredoc and Nowdoc string syntaxes:

        <<<(['"]?)(IDENTIFIER)+\1[\s\S]*?\NEWLINE\2
        

        where IDENTIFIER is another regular expression that conforms to the naming rules for labels in PHP,
        and NEWLINE is something like "(?:\n\r|\r|\n)" to match the start of a new line - see http://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.heredoc.
        I suppose it will not impact performance, because it uses a reluctant quantifier.
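
As a rough illustration of the expression above, here is a sketch in Python with the two placeholders filled in. Several things are assumptions rather than the plugin's actual code: IDENTIFIER uses the label rule given in the PHP manual, NEWLINE is written as "(?:\r\n|\r|\n)", the backslash before NEWLINE is omitted, the "+" after the identifier group is dropped because the assumed IDENTIFIER already matches a whole label, and the helper name `find_heredocs` is hypothetical.

```python
import re

# Assumption: PHP label rule (letter or underscore, then letters, digits,
# underscores; characters 0x7f-0xff are also allowed).
IDENTIFIER = r"[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*"
# Assumption: accept Windows, old Mac and Unix line endings.
NEWLINE = r"(?:\r\n|\r|\n)"

# Group 1: optional quote around the label ('' for plain heredoc,
#          ' for nowdoc, " for quoted heredoc).
# Group 2: the label itself, which must reappear right after a newline.
# [\s\S]*? is reluctant, so the match stops at the first closing label.
HEREDOC = re.compile(
    r"<<<(['\"]?)(" + IDENTIFIER + r")\1[\s\S]*?" + NEWLINE + r"\2"
)


def find_heredocs(source):
    """Return every heredoc/nowdoc literal found in a PHP source string."""
    return [m.group(0) for m in HEREDOC.finditer(source)]


php = (
    "echo <<<EOT\n"
    "Hello $name, EOT is not the end here\n"
    "EOT;\n"
    "$raw = <<<'RAW'\n"
    "no $vars\n"
    "RAW;\n"
)
for literal in find_heredocs(php):
    print(repr(literal))
```

The back-references are what tie the closing label (and quote style) to the opening one, so a label that merely appears in the middle of a line does not end the literal.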

        Fabrice Bellingard added a comment -

        Done, thanks Evgeny!
        (FYI, your regexp was almost perfect! I just had to remove the backslash before 'NEWLINE'.)

        Evgeny Mandrikov added a comment -

        You're welcome! Mistakes were to be expected, as I wrote it from memory.


          People

          • Assignee: Fabrice Bellingard
          • Reporter: Fabrice Bellingard
          • Votes: 0
          • Watchers: 0

            Dates

            • Created:
              Updated:
              Resolved: