Allow skipping bad characters #80
Comments
I'd like to do this. I document how to do this in Python 2. It's far more challenging in Python 3, unfortunately. By default, Python 3 assumes that all text is perfectly encoded and that there are never any mistakes, and that if there is an encoding mistake, users would like the program to crash. That assumption fails when it encounters the real world. I could modify the call to open() to add errors= with one of 'ignore' (ignores errors), 'replace' (inserts a replacement marker, such as '?'), or 'surrogateescape'. See: https://docs.python.org/3/library/functions.html#open That would have to be called specially (differently) in Python 2: https://docs.python.org/2/library/functions.html#open
You can define a function that is plain open in Python 2 and open with errors='replace' in Python 3, then use that function instead of calling open directly.
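A minimal sketch of that wrapper idea, for illustration only. The name `open_text` and its `encoding` parameter are hypothetical and not part of flawfinder; the sketch assumes files are opened for reading in text mode.

```python
import sys


def open_text(path, encoding=None, errors="replace"):
    """Hypothetical helper: open a text file for reading.

    On Python 3, undecodable bytes are handled according to `errors`
    instead of raising UnicodeDecodeError. On Python 2, the built-in
    open() returns byte strings and takes no `errors` argument, so the
    extra arguments are simply not passed.
    """
    if sys.version_info[0] >= 3:
        # encoding=None keeps Python 3's platform default encoding.
        return open(path, "r", encoding=encoding, errors=errors)
    return open(path, "r")
```

Callers would then use `open_text(filename)` wherever they currently call `open(filename)`, so the version check lives in one place.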
Alternatively, you can add a command line flag that shouldn't be used in Python 2, e.g. --encoding-errors replace (which will pass errors='replace'). If that flag isn't given, the default behavior (not passing an errors argument) will be used. If you pass this flag in Python 2 you'll get an error, but you don't need it in Python 2, if I understood correctly...
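A rough sketch of how such a flag could be wired up. This uses argparse purely for illustration and makes no claim about how flawfinder actually parses its options; `parse_args` and `open_kwargs` are hypothetical names, and `--encoding-errors` is the flag proposed in this comment, not an existing option.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical --encoding-errors flag plus file names."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--encoding-errors",
        choices=["strict", "ignore", "replace", "surrogateescape"],
        default=None,
        help="value passed as errors= to open() (Python 3 only)",
    )
    parser.add_argument("files", nargs="*")
    return parser.parse_args(argv)


def open_kwargs(args):
    """Build extra keyword arguments for open().

    When the flag is absent, nothing is added, so open() keeps its
    default behavior.
    """
    if args.encoding_errors is None:
        return {}
    return {"errors": args.encoding_errors}
```

Each file would then be opened with `open(name, **open_kwargs(args))`, which leaves the call unchanged unless the flag was given.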
My intent is to make it "easy to use by default". I'd prefer to not need the extra flag.
Hi!
I've run into the encoding issue several times on different projects. I look at a source tree with thousands (or hundreds of thousands) of files, and inside them, for some unknown reason, there are some invalid characters here and there.
Since I'm running flawfinder as a "best-effort" way of trying to find vulnerabilities, I don't really care if a few lines (or files) are skipped here and there. Would it be possible to add a command line flag to allow skipping bad characters? Often they appear in code comments and are irrelevant anyways. Converting the entire code is not always easy/possible/helpful. I would like an option that will have flawfinder skip bad characters if possible, the entire line if not, and the entire file if even skipping the entire line is not possible.
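One possible reading of this request, sketched under assumptions: the function name `read_lines_best_effort` is hypothetical, UTF-8 is assumed as the expected encoding, and none of this reflects how flawfinder reads files. The sketch covers the character-level and file-level fallbacks; a separate line-level fallback is left out because decoding a line with errors='ignore' is not expected to fail.

```python
def read_lines_best_effort(path, encoding="utf-8"):
    """Return (lines, repaired), or (None, 0) if the file is unreadable.

    lines is a list of decoded text lines; repaired counts the lines
    from which undecodable bytes had to be dropped.
    """
    try:
        with open(path, "rb") as f:
            # Keep line endings so line numbers and content stay intact.
            raw_lines = f.read().splitlines(True)
    except (IOError, OSError):
        # File-level fallback: the caller skips this file.
        return None, 0

    lines = []
    repaired = 0
    for raw in raw_lines:
        try:
            lines.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Character-level fallback: drop only the bad bytes.
            lines.append(raw.decode(encoding, "ignore"))
            repaired += 1
    return lines, repaired
```

Reading in binary mode and decoding line by line means a single bad byte affects only its own line, and the `repaired` count lets a tool report how much was skipped.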
Thanks!