Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added ability to baseline violations #3387

Open
wants to merge 27 commits into
base: master
Choose a base branch
from

Conversation

frankdekker
Copy link

Added the ability to baseline violations. #2543

Features

Create baseline
phpcs src --report-baseline=phpcs.baseline.xml --basepath=/path/to/project/root

  • Writes phpcs.baseline.xml in the current working directory. Pref be the root of the project
  • basepath will be subtracted from the filepaths in the baseline xml.

Use default baseline
Will by default search in the working directory for phpcs.baseline.xml
phpcs src

Use custom location baseline
phpcs src --baselineFile=<path/to/baseline.xml>

Changes

Added classes to read the baseline file and setup data structure.
Added check in File::addMessage to check if the message is baselined.
Added --baselineFile to cli options
Added unit tests for the newly added files.

Documentation

My question, how do I update the documentation to add a section about the baseline feature?

@frankdekker frankdekker marked this pull request as ready for review July 11, 2021 13:21
@sergey-solo
Copy link

Just want to point out that this implementation doesn't baseline individual violations. Instead, it baselines specific sniffs for specific files. Which is different from what one might expect :)

@frankdekker
Copy link
Author

Just want to point out that this implementation doesn't baseline individual violations. Instead, it baselines specific sniffs for specific files. Which is different from what one might expect :)

Correct, will point that out in the documentation. It's a technical choice I took to avoid basing it on line number. Because that will become quickly annoying when you modify files :).

@gsherwood
Copy link
Member

I'm not convinced that this is a useful way of doing a baseline. When entire sniffs (or even sniff codes) are baselined for a file, adding new code with violations to that file wont show any errors if they happen to be the same sort of errors already in the file. For projects that need a baseline, or when introducing a new check to a standard, I think this is a pretty likely case.

I'd likely only accept a baseline feature if it was able to distinguish between violations in existing code and violations in new code. Anything else feels like an ongoing support issue for me as the feature wont work the way people expect and I'll have to defend the design decision for it. I'm absolutely happy to do this, but only if I believe the feature works the way it should.

This is one of those things where we really need a discussion about the solution design before a PR. There was some of that back on #2543, but it's been a while and no real solution was ever outlined. If you want to give this a go, it would be a terrific feature to have and I'd be happy to discuss any ideas you have for the solution over on that other issue.

@frankdekker
Copy link
Author

frankdekker commented Jul 19, 2021

I'm not convinced that this is a useful way of doing a baseline. When entire sniffs (or even sniff codes) are baselined for a file, adding new code with violations to that file wont show any errors if they happen to be the same sort of errors already in the file. For projects that need a baseline, or when introducing a new check to a standard, I think this is a pretty likely case.

I'd likely only accept a baseline feature if it was able to distinguish between violations in existing code and violations in new code. Anything else feels like an ongoing support issue for me as the feature wont work the way people expect and I'll have to defend the design decision for it. I'm absolutely happy to do this, but only if I believe the feature works the way it should.

This is one of those things where we really need a discussion about the solution design before a PR. There was some of that back on #2543, but it's been a while and no real solution was ever outlined. If you want to give this a go, it would be a terrific feature to have and I'd be happy to discuss any ideas you have for the solution over on that other issue.

Completely understandable. This baseline mechanism will indeed suppress new issues in the same file. I'm happy to brainstorm how to make this feature more robust. The problem at hand is that we need a unique way to detect where the violations came from within the file, without explicitly counting tokens or line numbers.

I'll propose an approach in #2543

@frankdekker
Copy link
Author

frankdekker commented Jul 31, 2021

@gsherwood
Improved the baseline calculation based on the approach I suggested in #2543

Now a code signature will be calculated and added to the baseline report file. The code signature is the hash of the code on the line of the violation, and 1 line before and after.

During standard inspection when comparing against the baseline, the same signature will be calculated for encountered violations and compared against the baseline signature. The violation will only be skipped if the code signature is identical to the one in the baseline.

As mentioned, this will prevent a sniff being disabled for the entire file. The sniff will now be specifically disabled for the calculated code signature. Modifications in the file won't break the baseline, unless you edit one of the 3 lines around the baseline. At this point I feel it's quite useful to resolve the violation anyway, as you're editing that close to it.

I would love to hear your thoughts about this solution.

Regards

Frank Dekker

Example baseline:

<?xml version="1.0" encoding="UTF-8"?>
<phpcs-baseline version="3.6.1">
<violation file="src\Baseline\BaselineSet.php" sniff="PEAR.Commenting.FunctionComment.MissingParamName" signature="54accce13bb236dc361cf9d75895c274fce619d5"/>
<violation file="src\Baseline\BaselineSet.php" sniff="PEAR.Commenting.FunctionComment.MissingParamTag" signature="25a91da1dadc60a04a9a6615502ad4aa4fd397c2"/>
<violation file="src\Baseline\BaselineSet.php" sniff="PEAR.Functions.FunctionDeclaration.SpaceBeforeOpenParen" signature="edcfdf132eb46b842349404740ba7b8f224d82e1"/>
</phpcs-baseline>

@frankdekker
Copy link
Author

@gsherwood It's been a while. You had any chance yet to look at the latest proposal with the code signatures?

@Potherca
Copy link

Potherca commented Mar 8, 2022

Being able to have a baseline would be much appreciated.

@gsherwood Is there anything that can be done to move this forward?

@frankdekker
Copy link
Author

frankdekker commented Mar 8, 2022

Being able to have a baseline would be much appreciated.

Is there anything that can be done to move this forward?

@Potherca
At our company we really needed the baseline, so in the meantime we created the following package:
https://github.com/123inkt/php-codesniffer-baseline

Might be an nice alternative for now.

@Housik
Copy link

Housik commented Apr 24, 2022

@gsherwood Hi Greg, would you be please so kind and take a look to Frank solution? It looks like it now mets your requirements about baseline functionality for phpcs. I am one of the many waiting for this feature, it will help us to rewrite large project. Thank you in advance! Martin ;-)

@gsherwood
Copy link
Member

@Housik This isn't something I have time to look at.

@BafS
Copy link

BafS commented Oct 12, 2022

Is there a chance to see this PR merged? It would be really useful for us too, especially when working with big codebases.

@frankdekker
Copy link
Author

Is there a chance to see this PR merged? It would be really useful for us too, especially when working with big codebases.

You can use https://github.com/123inkt/php-codesniffer-baseline in the meantime.

@BafS
Copy link

BafS commented Nov 1, 2022

@frankdekker unfortunately it didn't work well with our CI (because of editing a vendor file, which should be immutable).

I created a new report for that, without having the need to change a vendor, it's hacky but it works:

<?php declare(strict_types=1);

use PHP_CodeSniffer\Files\File;
use PHP_CodeSniffer\Reports\Full;
use PHP_CodeSniffer\Reports\Report;

final class BaselineReport implements Report
{
    public const BASELINE_FILE = __DIR__ . '/baseline.php';
    private Full $fullReport;

    public function __construct()
    {
        $this->fullReport = new Full();
    }

    private function cleanReportFromBaseline(array $baseline, array $report): array
    {
        $filename = $report['filename'];

        if (!isset($baseline[$filename])) {
            return $report;
        }

        $baselineSources = $baseline[$filename];
        $messages = &$report['messages'];

        foreach ($messages as $ka => $messageLine) {
            foreach ($messageLine as $kb => $messagesOff) {
                foreach ($messagesOff as $kc => $messageList) {
                    $source = $messageList['source'] ?? '';
                    if (isset($baselineSources[$source]) && $baselineSources[$source]['occurrences'] > 0) {
                        $type = $messages[$ka][$kb][$kc]['type'];

                        // Decrement the warnings/errors to keep in sync with the removal of messages in the report
                        match ($type) {
                            'WARNING' => $report['warnings']--,
                            'ERROR' => $report['errors']--,
                        };

                        unset($messages[$ka][$kb][$kc]);

                        $baselineSources[$source]['occurrences']--;
                    }
                }
            }
        }

        return $report;
    }

    public function generateFileReport($report, File $phpcsFile, $showSources = false, $width = 80): bool
    {
        $baseline = [];
        if (is_file(self::BASELINE_FILE)) {
            $baseline = require self::BASELINE_FILE;
            if (!is_array($baseline)) {
                $baseline = [];
            }
        }

        $report = $this->cleanReportFromBaseline($baseline, $report);

        return $this->fullReport->generateFileReport($report, $phpcsFile, $showSources, $width);
    }

    public function generate($cachedData, $totalFiles, $totalErrors, $totalWarnings, $totalFixable, $showSources = false, $width = 80, $interactive = false, $toScreen = true): void {
        // `Full` report only uses $cachedData
        $this->fullReport->generate($cachedData, $totalFiles, $totalErrors, $totalWarnings, $totalFixable, $showSources, $width, $interactive, $toScreen);
    }
}

Usable like: ./vendor/bin/phpcs '--report-App\Dev\PHP_CodeSniffer\GenerateBaselineReport'.

And to generate the baseline

<?php declare(strict_types=1);

use PHP_CodeSniffer\Files\File;
use PHP_CodeSniffer\Reports\Report;
use App\Serializer\PhpEncoder;

final class GenerateBaselineReport implements Report
{
    private static array $baseline = [];

    public function generateFileReport($report, File $phpcsFile, $showSources = false, $width = 80): bool
    {
        $filename = $report['filename'];

        /** @var array<string, int> $occurrences */
        $occurrences = [];
        $messages = &$report['messages'];
        foreach ($messages as $messageLine) {
            foreach ($messageLine as $messagesOff) {
                foreach ($messagesOff as $messageList) {
                    $occurrences[$messageList['source']] ??= 0;
                    $occurrences[$messageList['source']]++;
                }
            }
        }

        foreach ($occurrences as $source => $occurrence) {
            self::$baseline[$filename][$source]['occurrences'] = $occurrence;
        }

        return true;
    }

    public function generate(
        $cachedData,
        $totalFiles,
        $totalErrors,
        $totalWarnings,
        $totalFixable,
        $showSources = false,
        $width = 80,
        $interactive = false,
        $toScreen = true,
    ): void {
        global $argv;

        $phpEncoder = new PhpEncoder();
        $out = $phpEncoder->encode(self::$baseline, PhpEncoder::FORMAT, [
            'strict' => true,
        ]);

        $inputString = implode(' ', $argv);
        $out = str_replace('return ', "// Generated with: $inputString\nreturn ", $out);

        echo $out;
    }
}

Usable like: ./vendor/bin/phpcs --report-App\\Dev\\PHP_CodeSniffer\\GenerateBaselineReport --sniffs=SlevomatCodingStandard.TypeHints.ReturnTypeHint,SlevomatCodingStandard.TypeHints.DeclareStrictTypes > app/Dev/baseline.php

@peterjaap
Copy link

peterjaap commented Nov 20, 2022

@frankdekker it would be great to also have notices for false positives. See my explanation here; 123inkt/php-codesniffer-baseline#7

PHPStan also does this, it's called "unmatched ignored errors", see https://phpstan.org/user-guide/baseline#generating-the-baseline

@gsherwood what can we do for you to move this PR forward? Maybe @jrfnl can take a look since for some reason primarily Dutch people are active in this PR? 😃

@peterjaap
Copy link

@BafS I wanted to give your solution a whirl, but I'm missing the PhpEncoder dep. Judging from the method signature, it's not https://github.com/Riimu/Kit-PHPEncoder?

@BafS
Copy link

BafS commented Nov 20, 2022

@peterjaap right, it's from Symfony\Component\Serializer\Encoder\EncoderInterface but with my own implementation. I edited my answer to use the native var_export function.

@forrest79
Copy link

Hi to all, thanks for this PR and the discussion in it. Like you, we also need a baseline and better ability to ignore valid errors. I look over all your solutions and prepared a package, that works in a similar way as PHPStan (also with outdated info). There must still be some magic and hacking, but it works without hard changing files in the vendor. We're using this for a couple of weeks in our production CI and everything looks good. You can find it here https://github.com/forrest79/PHPCS-Ignores. I will be glad for your comment and improvement. Still, I'm hoping, that PHPCS adds something like this as a new feature :-)

@frankdekker
Copy link
Author

Hi to all, thanks for this PR and the discussion in it. Like you, we also need a baseline and better ability to ignore valid errors. I look over all your solutions and prepared a package, that works in a similar way as PHPStan (also with outdated info). There must still be some magic and hacking, but it works without hard changing files in the vendor. We're using this for a couple of weeks in our production CI and everything looks good. You can find it here https://github.com/forrest79/PHPCS-Ignores. I will be glad for your comment and improvement. Still, I'm hoping, that PHPCS adds something like this as a new feature :-)

Hihi, also nice trick to hook into PHPCS reporting. And same here, still hoping PHPCS comes with a build-in implementation :).

@Morgy93
Copy link

Morgy93 commented Sep 6, 2023

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants