False positives: "can use", "via ssh" #28

Open
quackduck opened this issue Apr 17, 2022 · 7 comments
Labels
bug Something isn't working

Comments

@quackduck

Of course, I could add these to the false positives list, but maybe there's a better, more general way to tackle these.

@TwiN
Owner

TwiN commented Apr 17, 2022

Yeah, adding canuse and viassh to the default list of false positives is probably going to be the easiest way to tackle this.
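For readers wondering why these phrases trip the filter at all, here is a minimal sketch of a substring-based detector. It is a simplified model, not goaway's actual implementation: it assumes the filter lowercases the message, strips spaces, cuts out known false positives, and then searches for profane substrings. The word lists and the `isProfane` signature are hypothetical and kept tiny for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical, tiny word lists for illustration only.
var profanities = []string{"anus", "ass"}
var falsePositives = []string{"canuse", "viassh"}

// isProfane models a substring-based filter (an assumption, not goaway's code):
// lowercase, strip spaces, cut out known false positives, then search.
func isProfane(s string, falsePositives []string) bool {
	s = strings.ReplaceAll(strings.ToLower(s), " ", "")
	for _, fp := range falsePositives {
		s = strings.ReplaceAll(s, fp, "")
	}
	for _, p := range profanities {
		if strings.Contains(s, p) {
			return true
		}
	}
	return false
}

func main() {
	for _, msg := range []string{"you can use it", "connect via ssh"} {
		// First result: no false positives list. Second: with the list.
		fmt.Println(msg, isProfane(msg, nil), isProfane(msg, falsePositives))
	}
}
```

Under this model, joining "can use" into "canuse" exposes a profane substring, and listing "canuse" as a false positive removes it before the search runs.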

@TwiN added the bug label Apr 17, 2022
@quackduck
Author

True. My issue was more about whether there could be a way to detect these innocent, legitimate two-word messages.

@TwiN
Owner

TwiN commented Apr 22, 2022

Yeah there isn't really one besides using the false positives list.

You could create a PR to add them to the default false positives if you'd like:

var DefaultFalsePositives = []string{

@finnbear
Contributor

finnbear commented May 10, 2022

> Yeah there isn't really one besides using the false positives list.

It would take some work on your end, but you could process my comprehensive false positives list in a code generator, as follows:

  • Read file line by line
  • Feed each line into goaway
  • If it detects something, add it as a false positive (or tell me if something bad ended up in the list 😉)

If you're wondering, I generated it using a dictionary search of words and pairs of words, combined with my own additions.

The downside is that my filter operates a bit differently (has some interesting heuristics), and doesn't require certain false positives to be explicitly included in its list. In these cases, you would still need to maintain your own false positive list and/or replicate the dictionary search.

@quackduck
Author

quackduck commented May 10, 2022

Thanks for commenting! @TwiN this could also be a good place to use go:embed (then decode on init() possibly)
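A rough sketch of the go:embed idea, decoding once in `init()`. The file name `false_positives.txt`, the variable names, and `parseList` are all hypothetical. The embed directive is shown only in a comment, with a string literal standing in for the file contents, so the sketch compiles without the data file present.

```go
package main

import (
	"fmt"
	"strings"
)

// In a real package this variable would be filled by the embed directive:
//
//	import _ "embed"
//
//	//go:embed false_positives.txt   (hypothetical file name)
//	var rawFalsePositives string
//
// A literal stands in here so the sketch compiles without the data file.
var rawFalsePositives = "canuse\nviassh\n"

// falsePositives is decoded once at start-up.
var falsePositives []string

func init() {
	falsePositives = parseList(rawFalsePositives)
}

// parseList splits newline-separated text into trimmed, non-empty entries.
func parseList(raw string) []string {
	var out []string
	for _, line := range strings.Split(raw, "\n") {
		line = strings.TrimSpace(line)
		if line != "" {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	fmt.Println(len(falsePositives), falsePositives)
}
```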

(I’m curious: how did you find this thread @finnbear?)

@finnbear
Contributor

> this could also be a good place to use go:embed (then decode on init() possibly)

True! The downside here is that you would be including the entire list, when only a subset is relevant to goaway. A build step/code generator is more work, but could avoid wasting space in the compiled binary by filtering in advance.
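As a sketch of what such a build step might emit: once the relevant subset has been filtered, a generator can write it out as ordinary Go source so that only those entries end up in the binary. `renderGoSource`, the package name, and the variable name `GeneratedFalsePositives` are hypothetical, not part of goaway.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderGoSource returns Go source declaring a []string variable that holds
// the given words, sorted so that regenerated output diffs cleanly.
func renderGoSource(pkg, varName string, words []string) string {
	sorted := append([]string(nil), words...) // copy; leave the caller's slice alone
	sort.Strings(sorted)

	var b strings.Builder
	b.WriteString("// Code generated by a hypothetical generator. DO NOT EDIT.\n\n")
	fmt.Fprintf(&b, "package %s\n\n", pkg)
	fmt.Fprintf(&b, "var %s = []string{\n", varName)
	for _, w := range sorted {
		fmt.Fprintf(&b, "\t%q,\n", w) // %q quotes and escapes each entry
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	fmt.Print(renderGoSource("goaway", "GeneratedFalsePositives", []string{"viassh", "canuse"}))
}
```

The generated text would be written to a `.go` file and committed or regenerated as part of the build, which is the extra work mentioned above.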

> (I’m curious: how did you find this thread @finnbear?)

I check in on this repository every once in a while, as it was and is a great source of inspiration for my profanity filters 😃

@quackduck
Author

> True! The downside here is that you would be including the entire list, when only a subset is relevant to goaway.

We could trim the file once as needed

3 participants