New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Add new file categorization plugin #2945 #2958
base: develop
Are you sure you want to change the base?
Conversation
This PR is related to issue #2945. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! See my nits inline.
Also the test file jarDependImg.png
likely comes from http://svn.apache.org/repos/asf/xmlgraphics/batik/branches/webapi/documentation-sources/resources/images/jarDependImg.png under an Apache license that we did not document and is also too big for a test file. Can you instead create an extra small PNG that is not from a third-party?
from commoncode.datautils import String | ||
from plugincode.post_scan import PostScanPlugin | ||
from plugincode.post_scan import post_scan_impl | ||
from commoncode.cliutils import PluggableCommandLineOption |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please sort your imports
from commoncode.cliutils import POST_SCAN_GROUP | ||
|
||
""" | ||
Categorize files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you expand the documentation to explain the workings of this module at a high level?
|
||
options = [ | ||
PluggableCommandLineOption( | ||
("--file-cat",), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a more explicit, non-abbreviated command line option, like --categorize-file
or --file-category
to stay consistent with other options.
@classmethod | ||
def categorize(cls, resource): | ||
""" | ||
Return True if this Categorizer applies to this resource or False or None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Return True if this Categorizer applies to this resource or False or None | |
Return True if this Categorizer category applies to this ``resource``. |
But what is resource
? a path string? or a location? some object? we need to be precise there.
resource.save(codebase) | ||
|
||
|
||
class Categorizer: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add some docstring to document what is this class used for and how to extend it
class Categorizer: | |
class Categorizer: | |
""" | |
A categorizer can categorize a file Resource based on its path, name, | |
extension and other attributes such as file or mime type. | |
Subclasses must override the ``categorize`` class method. | |
""" |
|
||
class Categorizer: | ||
order = 0 | ||
analysis_priority = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be named importance or prominence?
analysis_priority seems to imply this is only use to drive some analysis?
analysis_priority = "1" | ||
file_category = "archive" | ||
file_subcategory = "debian" | ||
category_notes = "special type of archive" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just call this notes?
order = 0 | ||
analysis_priority = None | ||
file_category = None | ||
file_subcategory = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should always enforce these to be lowercase: we have a mix or capital case or lowercase subcategories at the moment.
@classmethod | ||
def categorize(cls, resource): | ||
return extension_in(resource.extension, [".bzl"]) or ( | ||
name_in(resource.name, ["build.bazel"]) and resource.type == "file" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is resource_type == file needed here? Are these not all about files?
|
||
@classmethod | ||
def categorize(cls, resource): | ||
return extension_uppercase_in(resource.extension, [".S"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this upper case?
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Reference: #2945 Signed-off-by: John M. Horan <[email protected]>
Reference: #2945 Signed-off-by: John M. Horan <[email protected]>
Reference: #2945 Signed-off-by: John M. Horan <[email protected]>
Reference: #2945 Signed-off-by: John M. Horan <[email protected]>
1f7d78f
to
e4f64d2
Compare
* Add in postscan plugin entry points Signed-off-by: John M. Horan <[email protected]>
This PR add a new file categorization plugin.