[BUG] `gsub!': invalid byte sequence in UTF-8 (ArgumentError) #137

Open
boldyrev-insegment opened this issue Jan 14, 2019 · 5 comments

@boldyrev-insegment

Some emails trigger the error quoted in the title at lib/sisimai/mime.rb:343.

@boldyrev-insegment
Author

Can be fixed by adding:

unless getdecoded.valid_encoding?
  getdecoded = getdecoded.encode("UTF-16BE", invalid: :replace, replace: "?").encode("UTF-8")
end

before:

getdecoded.gsub!(/\r\n/, "\n")  # Convert CRLF to LF
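
To make the failure and the proposed workaround concrete, here is a minimal sketch that reproduces the error outside Sisimai. The sample string and the helper names `sample_body`, `crlf_error`, and `repair` are hypothetical and are not part of lib/sisimai/mime.rb; only the `encode` round trip comes from the comment above.

```ruby
# Hypothetical input: the byte 0xFF can never occur in valid UTF-8.
def sample_body
  "caf\xFF\r\nbye\r\n".b.force_encoding(Encoding::UTF_8)
end

# Returns the message of the ArgumentError raised by gsub!, or nil if none.
def crlf_error(text)
  text.gsub!(/\r\n/, "\n")
  nil
rescue ArgumentError => e
  e.message
end

# Round trip through UTF-16BE, replacing each invalid byte with "?".
def repair(text)
  return text if text.valid_encoding?

  text.encode("UTF-16BE", invalid: :replace, replace: "?").encode("UTF-8")
end

puts crlf_error(sample_body).inspect          # error message from the regexp match
puts crlf_error(repair(sample_body)).inspect  # nil once the string is repaired
```

The regexp match is what raises: Ruby refuses to run a Regexp against a string whose bytes do not form valid characters in its declared encoding.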

@azumakuniyuki
Member

@boldyrev-insegment Thanks for the bug report. Does the following code fix the bug?

getdecoded.scrub!('?')
getdecoded.gsub!(/\r\n/, "\n")  # Convert CRLF to LF

Best regards
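
As a rough illustration of the `scrub!` suggestion above, the sketch below applies it to a hypothetical string containing one invalid byte. `scrub_and_normalize` is an illustrative helper name, not a Sisimai method, and it works on a copy so the caller's string is left untouched.

```ruby
# Replace invalid bytes with "?" and then convert CRLF to LF.
# Works on a copy; returns the cleaned string.
def scrub_and_normalize(text)
  copy = text.dup
  copy.scrub!("?")           # String#scrub! is available since Ruby 2.1
  copy.gsub!(/\r\n/, "\n")   # returns nil when nothing matched, so return copy explicitly
  copy
end

# Hypothetical input with a byte (0xFF) that is invalid in UTF-8.
bad = "caf\xFF\r\n".b.force_encoding(Encoding::UTF_8)
puts scrub_and_normalize(bad).inspect
```

Compared with the UTF-16BE round trip, `scrub!` needs no intermediate encoding and is a no-op on strings that are already valid.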

@boldyrev-insegment
Author

boldyrev-insegment commented Jan 31, 2019

@azumakuniyuki yes, it did fix the problem.

Sorry for such a late reply, but I've tested the code on a bunch of different emails and found a few more bugs, so I had to tweak the code a little more:

-- 1 --

lib/sisimai/mime.rb:443:in `gsub!': invalid byte sequence in UTF-8 (ArgumentError)

add:
bodystring.scrub!('?')
before:
bodystring.gsub!(%r{^(Content-Type:\s*message/(?:rfc822|delivery-status)).+$}, '\1')

-- 2 --

lib/sisimai/mime.rb:435:in `makeflat': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)

replace
return hasflatten
with
return hasflatten.to_s.force_encoding("UTF-8")

Hope that helps!
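
The second error is a different failure mode from the first: it appears when a UTF-8 string and an ASCII-8BIT (binary) string that both contain non-ASCII bytes are combined. The sketch below reproduces that under assumed inputs. `concat_error` and `as_utf8` are hypothetical helpers, and the sample strings are invented for illustration.

```ruby
# Returns the exception class raised by concatenation, or nil if it succeeds.
def concat_error(left, right)
  left + right
  nil
rescue Encoding::CompatibilityError => e
  e.class
end

# Relabel a copy of the bytes as UTF-8, mirroring to_s.force_encoding("UTF-8").
# This changes only the encoding tag; it does not validate or convert bytes.
def as_utf8(text)
  text.to_s.dup.force_encoding(Encoding::UTF_8)
end

utf8   = "h\u00E9llo "        # UTF-8 string with a non-ASCII character
binary = "w\xC3\xB6rld".b     # same kind of bytes, but tagged ASCII-8BIT

puts concat_error(utf8, binary).inspect           # Encoding::CompatibilityError
puts concat_error(utf8, as_utf8(binary)).inspect  # nil
```

Because `force_encoding` does not check the bytes, a relabeled string can still be invalid UTF-8, so a later regexp call may need `scrub!` as well.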

@azumakuniyuki
Member

@boldyrev-insegment Sorry for the late response. Could you share the email that causes the error (the entire message, including the email headers) in a private Gist? I will try to debug and fix the problem.

Best regards,

@azumakuniyuki self-assigned this Feb 5, 2019
@boldyrev-insegment
Author

I'm sorry, but I can't share it because the data is very sensitive.
