Invalid copyright not detected #3659
Comments
I'm taking this one @pombredanne
@pombredanne If I understand correctly, I just need to make the changes you suggested. Is anything else required? If so, please tell me; I'm a beginner and want to make good contributions.
@CaptainTron You could start by crafting unit tests that fail for now. See scancode-toolkit/src/cluecode/copyrights.py, line 3987 in 79aae34.
@pombredanne Can you elaborate a bit more? I'm not sure which unit test to look for. I've gone through that line and the docs, but I'm still confused.
`[C] The Regents of the University of Michigan and Merit Network, Inc. 1992, 1993, 1994, 1995 All Rights Reserved` is rare and not detected because `[C]` is not a valid copyright "sign".

We have a few other cases in https://github.com/search?q="Copyright+[C]"&type=code
The only sane resolution I can think of is to normalize these warts in text preparation, replacing `[C] The Regents of the University` by `(C) The Regents of the University` and `Copyright [c]` by `Copyright (c)`, in all character cases.

`[C]` cannot be/is not a valid sign and this would otherwise trigger a badzillion of false positives as seen in https://github.com/search?q="[C]"&type=code (actually only millions, not badzillions)
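A minimal sketch of what that narrow normalization could look like, assuming it runs on each line of text before detection. The function name `normalize_bracket_c` and both patterns are hypothetical and are not part of the existing scancode-toolkit code; the point is only that `[C]` is rewritten in the two contexts named above and left alone everywhere else.

```python
import re

# Hypothetical helper, not an existing scancode-toolkit function.
# "[c]" is rewritten to "(c)" only in two narrow contexts, in any letter case:
#   1. right after the word "Copyright"
#   2. right before "The Regents of the University"
_COPYRIGHT_BRACKET_C = re.compile(r"(copyright\s+)\[(c)\]", re.IGNORECASE)
_REGENTS_BRACKET_C = re.compile(
    r"\[(c)\](\s+The Regents of the University)", re.IGNORECASE
)


def normalize_bracket_c(text: str) -> str:
    """Return text with "[c]" replaced by "(c)" in the two contexts above."""
    # The captured groups keep the original letter case of "c" and "Copyright".
    text = _COPYRIGHT_BRACKET_C.sub(r"\1(\2)", text)
    text = _REGENTS_BRACKET_C.sub(r"(\1)\2", text)
    return text


if __name__ == "__main__":
    print(normalize_bracket_c("Copyright [c] 2020 Foo"))
    print(normalize_bracket_c("[C] The Regents of the University of Michigan"))
    print(normalize_bracket_c("x = a[C] + 1"))  # left unchanged
```

A bare `[C]`, as in array indexing, is deliberately not touched, which is what keeps the false positives mentioned above out of the picture.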