Fix encoding bug in tests on Ruby 3 #101

Merged

apainintheneck
Contributor

We need to guess the HTML encoding here; otherwise some tests fail on Ruby 3.

```
Failures:

  1) Readability images should show one image, but outside of the best candidate
     Failure/Error: @input = @input.gsub(REGEXES[:replaceBrsRe], '</p><p>')

     ArgumentError:
       invalid byte sequence in UTF-8
     # ./lib/readability.rb:51:in `gsub'
     # ./lib/readability.rb:51:in `initialize'
     # ./spec/readability_spec.rb:80:in `new'
     # ./spec/readability_spec.rb:80:in `block (3 levels) in <top (required)>'

  2) Readability the cant_read.html fixture should work on the cant_read.html fixture with some allowed tags
     Failure/Error: @input = @input.gsub(REGEXES[:replaceBrsRe], '</p><p>')

     ArgumentError:
       invalid byte sequence in UTF-8
     # ./lib/readability.rb:51:in `gsub'
     # ./lib/readability.rb:51:in `initialize'
     # ./spec/readability_spec.rb:555:in `new'
     # ./spec/readability_spec.rb:555:in `block (3 levels) in <top (required)>'
```
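
For readers unfamiliar with this error, here is a minimal sketch of how it can arise and how re-encoding avoids it. It is not the change made in this PR: `BR_RE` is a hypothetical stand-in for `REGEXES[:replaceBrsRe]`, `replace_brs` is an illustrative helper, and the source encoding is hard-coded as ISO-8859-1 rather than guessed from the HTML.

```ruby
# Bytes that are ISO-8859-1 ("café") but labeled as UTF-8, as can happen
# when a non-UTF-8 fixture is read with the default external encoding.
raw = "caf\xE9<br/><br/>au lait".dup.force_encoding(Encoding::UTF_8)

# Hypothetical pattern standing in for REGEXES[:replaceBrsRe].
BR_RE = /(<br[^>]*>[ \n\r\t]*){2,}/i

# Illustrative helper mirroring the gsub call in the backtrace.
def replace_brs(html)
  html.gsub(BR_RE, '</p><p>')
end

# Matching a regexp against a string with an invalid byte sequence raises.
begin
  replace_brs(raw)
rescue ArgumentError => e
  error_message = e.message
end

# Assumption: the real encoding is known to be ISO-8859-1. Relabel the
# bytes with that encoding, then transcode to valid UTF-8 before gsub.
fixed = raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
result = replace_brs(fixed)
```

In practice the encoding is not known up front, which is why it has to be guessed (from a `<meta>` charset declaration, for example) before the string reaches `gsub`.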

Fixes cantino#87

It also adds the latest Ruby 3 version to CI so that this sort of bug is caught regularly.
@cantino cantino merged commit e652671 into cantino:master Nov 23, 2024
2 checks passed
@cantino
Owner

cantino commented Nov 23, 2024

Thanks!

Linked issue: Failing tests on Ruby 3.0 (invalid byte sequence in UTF-8)