Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore specific warnings -- Am I on the right track? #788

Closed
wants to merge 10 commits into from

Conversation

nodet
Copy link
Contributor

@nodet nodet commented Jan 19, 2024

I've been working on #782. The code as I have it now fits my needs. Before I invest more time in it to refine and document the behaviour, I would appreciate a confirmation that I'm on the right track.
Could one of the maintainers look at 12e47c9...b928c0e and let me know how this looks like to them?

I would then keep working on the topics mentioned in #782 (comment). Please let me know if I'm missing something.

@cjmayo
Copy link
Contributor

cjmayo commented Jan 19, 2024

Happy to see this. Extending the idea of multiline ignoreerrors to ignorewarnings seems the way to go for a general solution. I had hoped an extra config entry wouldn't be needed and ignorewarnings could be used with multiline as well. But I hadn't thought much about it, maybe that isn't even a good idea, and if it isn't obvious then it probably isn't worth implementing it. Perhaps it would even be hard to make sense of if you had some traditional ignorewarnings and some url ones.

Aliases for warnings is probably a step too far, if I understand the thought - there are multiple codes for a redirect anyway? Not that I have thought about it before, maybe I could be convinced.

As of 10.4 it is possible to pass the config on the command line:
linkchecker --config <(printf '%s\n' '[filtering]' 'ignorewarnings=http-redirected')

see #767. Maybe there are some other scenarios.

It might be best getting this one to a conclusion and then thinking about any further features.

Great to see tests in here, as you say only docs to go - I understand leaving those to last.

Off the top of my head @need_network was really need Internet, as some of those files use example.com. But I haven't looked at that for a while.

On a quick look I haven't spotted anything scary (although I reserve that possibility after a proper look!) looks like you have thought it through. Let's see if the tests pass now and when you feel the PR is complete do say - I do find the project commit history useful, personally I prefer it to tell the story of the functionality not the development history, and I like small commits if they make it easier to understand. Feel free to squash and rebase up to the point you declare it ready.

@cjmayo
Copy link
Contributor

cjmayo commented Jan 20, 2024

Ah, 302 probably wasn't meant as an alias but an optional MESSAGE_REGEX like ignoreerrors.

@Kristinita
Copy link

Type: Opinion 💬

I think that while it’s better to have the ignorewarningsforurls option than not to have it, ignoring redirects isn’t the best idea.

ignorewarningsforurls will prevent linkchecker users from seeing unwanted redirects in the future. For example, redirects to spam sites: https://examplesite.com/examplepagehttps://spamsite.com. Or, for example, redirects to the main page of the site, which happens when the URL is removed and the site redirects requests to the index.html: https://examplesite.com/examplepagehttps://examplesite.com.

In my opinion, it’s better to control redirects rather than ignore them. It would be nice to allow linkchecker users to specify regular expressions for expected (allowed) redirects in the linkcheckerrc file. I provided the details of desired behavior in issue #789.

Thanks.

@cjmayo
Copy link
Contributor

cjmayo commented Jan 22, 2024

Thanks for clarifying. ignorewarningsforurls is of course a generic solution that can be used with other warnings. But clearly room for further investigation & coding for anyone who is interested.

@nodet
Copy link
Contributor Author

nodet commented Jan 23, 2024

Thanks for the comments.

302 probably wasn't meant as an alias but an optional MESSAGE_REGEX like ignoreerrors

The question is whether the configuration for this feature should allow (as is currently):

ignorewarningsforurls=
  ^https://example.com/redirect http-redirected
  ^https://example.com/another

or if it should instead require:

ignorewarningsforurls=
  ^https://example.com/redirect 302
  ^https://example.com/another

I think I prefer the former. I find it more explicit, for me who is not well versed in web coding. In addition, it required less work because 'http-redirected' is what tag is in add_warning. But I wondered whether consistency with ignoreerrors was more important.

@nodet
Copy link
Contributor Author

nodet commented Jan 23, 2024

linkchecker --config <(printf '%s\n' '[filtering]' 'ignorewarnings=http-redirected')

Good to know (I didn't...). Thanks.
Given that a configuration file suits my needs better than command line arguments, I won't work on that one.

@cjmayo
Copy link
Contributor

cjmayo commented Jan 24, 2024

I think I prefer the former. I find it more explicit, for me who is not well versed in web coding. In addition, it required less work because 'http-redirected' is what tag is in add_warning. But I wondered whether consistency with ignoreerrors was more important.

I see it as consistency with both ignorewarnings and ignoreerrors

So as coded at the moment the documentation would start with:

ignorewarningsforurls=URL_REGEX NAME (MULTILINE)

taking a bit from both of them.

@nodet
Copy link
Contributor Author

nodet commented Feb 3, 2024

Will open another PR with the doc and some history simplifications.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants