feat(script): Verify the existence of checker config doc_url pages and find appropriate older releases for gone (removed, dealpha, etc.) checkers #4207

Open
wants to merge 1 commit into
base: master
from

Conversation

whisperity
Member

@whisperity whisperity commented Apr 2, 2024

The checker label configuration files most often contain a documentation page link that we suggest to the user when viewing the details of a report. These JSON files are always hard-baked into a released package, and the server serves information based on what is available in the deployed image. As all of these links point to external resources, they are very susceptible to link rot.
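To make the discussion concrete, here is a minimal sketch of pulling the documentation link of each checker out of such a JSON file. The file layout used here (a `labels` object mapping checker names to `key:value` strings, with a `doc_url:` entry among them) and the `doc_urls` helper are assumptions made for illustration, not a description of the tool in this patch.

```python
import json
from typing import Dict, List, Optional

# Hypothetical label file content; the real files may be laid out differently.
EXAMPLE = """
{
  "analyzer": "clangsa",
  "labels": {
    "alpha.Foo": ["doc_url:https://example.org/alpha/Foo.html", "severity:LOW"],
    "core.Bar": ["severity:HIGH"]
  }
}
"""


def doc_urls(config_text: str) -> Dict[str, Optional[str]]:
    """Map each checker to its 'doc_url' label value, or None if it has none."""
    labels: Dict[str, List[str]] = json.loads(config_text)["labels"]
    result: Dict[str, Optional[str]] = {}
    for checker, entries in labels.items():
        # Split only on the first ':' so that the 'https://' part stays intact.
        urls = [e.split(":", 1)[1] for e in entries if e.startswith("doc_url:")]
        result[checker] = urls[0] if urls else None
    return result
```

Checkers without a link map to `None`, so a verifier can tell a missing link apart from a dead one.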

For example, suppose that an analysis was stored with the alpha.Foo checker, with the URL pointing to .../alpha/Foo.html. Once the underlying analyser's documentation changes (usually for one of two reasons: the checker is improved and moved out of alpha, or the checker is removed from upstream entirely!), this link becomes dead. Newer reports stored with core.Foo (.../core/Foo.html) will point to proper documentation, but re-routing alpha.Foo's documentation page to core.Foo's would be an invalid action, as the behaviour of the checker might have changed in the meantime, rendering the contents of the new document inapplicable to the old report! In addition, nothing prevents the user from running an older analyser with/through a newer CodeChecker package, and uploading new results from the alpha. version even after the release of the analyser with the core. version.

This patch introduces an opt-in tool which reads the configuration files and verifies whether the URL is available to a hypothetical user. If not, it employs a heuristic pipeline to find a working URL that corresponds to the checker with the currently dead link, first by fixing typos in the URL and, if that is still unsuccessful, by trying the documentation sites of older releases. For now, this fixing logic is only implemented for the LLVM-based analysers, Clang SA and Clang-Tidy, as implementing it requires an accurate understanding of the documentation structure of the specific analyser.
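The order of the fallbacks described above can be sketched roughly as follows. This is an illustration only, not the code in this patch: `find_working_url` and its three callback parameters are hypothetical names, and the liveness check is injected so that the sketch runs without network access.

```python
from typing import Callable, Iterable, Optional


def find_working_url(
    url: str,
    is_alive: Callable[[str], bool],
    fix_typos: Callable[[str], Iterable[str]],
    older_release_urls: Callable[[str], Iterable[str]],
) -> Optional[str]:
    """Return the first reachable URL for a checker's docs, or None."""
    if is_alive(url):
        return url  # The configured link still works, nothing to fix.

    # Stage 1: typo-corrected variants of the same URL.
    for candidate in fix_typos(url):
        if is_alive(candidate):
            return candidate

    # Stage 2: the same page on the documentation sites of older releases,
    # in whatever order the callback yields them (e.g., newest first).
    for candidate in older_release_urls(url):
        if is_alive(candidate):
            return candidate

    return None  # Nothing worked, the link stays reported as dead.
```

An analyser-specific implementation would only have to supply the two candidate generators; the generic part stays the same for every analyser.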

@whisperity whisperity added clang sa 🐉 The Clang Static Analyzer is a source code analysis tool that finds bugs in C-family programs. clang-tidy 🐉 clang-tidy is a clang-based C++ “linter” tool. config ⚙️ labels Apr 2, 2024
@whisperity whisperity added this to the release 6.25.0 milestone Apr 2, 2024
@whisperity whisperity changed the title chore/config/documentation-urls-for-gone-checkers feat(script): Verify the existence of checker config doc_url pages and find appropriate older releases for gone (removed, dealpha, etc.) checkers Apr 9, 2024
@whisperity whisperity force-pushed the chore/config/documentation-urls-for-gone-checkers branch from b0ceb23 to fafc919 Compare April 12, 2024 16:53
@whisperity whisperity force-pushed the chore/config/documentation-urls-for-gone-checkers branch 2 times, most recently from 7964d06 to 7e479e5 Compare April 12, 2024 18:56
@whisperity whisperity marked this pull request as ready for review April 12, 2024 18:58
@whisperity whisperity force-pushed the chore/config/documentation-urls-for-gone-checkers branch from 7e479e5 to c9e798c Compare April 18, 2024 08:25
scripts/labels/label_tool/checker_labels.py Show resolved Hide resolved
scripts/labels/label_tool/checker_labels.py Outdated Show resolved Hide resolved
@whisperity
Member Author

#4175 was merged, so the multiprocessing library loader code has to be altered; there are no tests for this script that would show that it won't work if it is merged to the current (post-#4175) master...

@whisperity whisperity marked this pull request as draft April 25, 2024 12:16
…dealpha)

The checker label configuration at `/config/labels/analyzers` most often
contains a `doc_url` entry which points to the documentation URL of the
checker, as shown in the UI.
When the user clicks this, the browser takes them to this page;
however, these external links are very susceptible to link rot,
especially when analysers entirely decommission checkers (e.g.,
`clang-tidy/cert-dcl21-cpp`) or checkers change name during a
dealphafication (e.g., `alpha.cplusplus.EnumCastOutOfRange` ->
`optin.core.EnumCastOutOfRange`).
In these cases, older analysis results stored with the older (or still
extant) checker will have a `doc_url` that points to nowhere in the
upstream.
In addition, there were several identified cases where the links were
recognised as broken (both by this tool and by an actual browser) but
the checker was still extant, simply because of a typo:
`cplusplus.PlacementNew`, `#wdeprecated-deprecated-coroutine` (instead
of `#wdeprecated-coroutine`), `#wclang-diagnostic-unsafe-buffer-usage`
(instead of `#wunsafe-buffer-usage`).

This patch adds an opt-in, developer-only tool under `/scripts/labels`,
which automatically checks (by way of HTTP requests and HTML DOM
scraping) whether the existing URLs still point to live pages, and
reports this status.
If there is analyser-specific additional knowledge (as of now, this is
implemented for ClangSA and Clang-Tidy), it uses additional
heuristics (most of which are available through reusable library
components for future development!) to figure out a fixed version of the
`doc_url` by normalising `#anchors` to fix typos, and looking up earlier
releases in which the checker under verification was still extant.
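
One way the `#anchor` check could look is sketched below: collect the `id` attributes of an already downloaded page and, if the requested anchor is missing, fall back to the closest existing one. The fuzzy matching through `difflib` is a technique swapped in for this illustration; the normalisation rules in the patch itself may differ. `AnchorCollector`, `resolve_anchor`, and `PAGE` are hypothetical names.

```python
import difflib
from html.parser import HTMLParser
from typing import List, Optional


class AnchorCollector(HTMLParser):
    """Collects every 'id' attribute, i.e., every possible '#anchor' target."""

    def __init__(self) -> None:
        super().__init__()
        self.anchors: List[str] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.anchors.append(value)


def resolve_anchor(html: str, anchor: str) -> Optional[str]:
    """Return 'anchor' if the page has it, else the closest match, else None."""
    collector = AnchorCollector()
    collector.feed(html)
    if anchor in collector.anchors:
        return anchor
    # Assumption: a similarity cutoff of 0.6 (difflib's default) is good enough.
    close = difflib.get_close_matches(anchor, collector.anchors, n=1, cutoff=0.6)
    return close[0] if close else None


# A stand-in for a downloaded documentation page.
PAGE = (
    '<h3 id="wdeprecated-coroutine">-Wdeprecated-coroutine</h3>'
    '<h3 id="wunsafe-buffer-usage">-Wunsafe-buffer-usage</h3>'
)
print(resolve_anchor(PAGE, "wdeprecated-deprecated-coroutine"))
```

Any suggestion produced this way should still be reviewed by a developer before it is written back to the configuration, as a close match is not guaranteed to be the right checker.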
@whisperity whisperity force-pushed the chore/config/documentation-urls-for-gone-checkers branch from c9e798c to 992eeef Compare April 30, 2024 11:00
@whisperity whisperity marked this pull request as ready for review April 30, 2024 11:01
@whisperity whisperity requested a review from bruntib April 30, 2024 11:01
Labels
clang sa 🐉 The Clang Static Analyzer is a source code analysis tool that finds bugs in C-family programs. clang-tidy 🐉 clang-tidy is a clang-based C++ “linter” tool. config ⚙️ enhancement 🌟

2 participants