Parsing issues on theverge.com #3787
Omnivore includes some extraneous content and applies inappropriate formatting when parsing Verge articles. One example:
Native website (https://www.theverge.com/2024/4/3/24119918/elon-musk-reputation-impact-tesla-falling-sales):
Omnivore version:
Pullquotes are also rendered oddly in Omnivore. On the website, they are set apart from the normal flow of text, but Omnivore parses them as regular paragraphs. This is confusing to read, since they often appear as seemingly random repeats of text you've already read or are about to read.
Setting them off as a quote passage, or even better removing them altogether, would make the text more readable.
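To make the "remove them altogether" option concrete, here is a rough sketch of one possible heuristic: treat a paragraph as a pullquote if its text is repeated verbatim inside a longer paragraph of the same article. This is not how Omnivore's parser works; the names `normalize`, `drop_pullquotes`, and the `min_len` threshold are all made up for illustration, and a real fix would more likely target the site's pullquote markup directly.

```python
import re


def normalize(text: str) -> str:
    """Unify curly quotes, collapse whitespace, and drop wrapping quote marks."""
    for curly, plain in (("\u201c", '"'), ("\u201d", '"'), ("\u2018", "'"), ("\u2019", "'")):
        text = text.replace(curly, plain)
    return re.sub(r"\s+", " ", text).strip().strip('"').lower()


def drop_pullquotes(paragraphs: list[str], min_len: int = 20) -> list[str]:
    """Return the paragraphs with likely pullquotes removed.

    Hypothetical heuristic: a paragraph counts as a pullquote when its
    normalized text (at least min_len characters, an assumed threshold)
    appears inside some other, strictly longer paragraph.
    """
    normalized = [normalize(p) for p in paragraphs]
    kept = []
    for i, para in enumerate(paragraphs):
        needle = normalized[i]
        is_repeat = len(needle) >= min_len and any(
            needle in other
            for j, other in enumerate(normalized)
            if j != i and len(other) > len(needle)
        )
        if not is_repeat:
            kept.append(para)
    return kept


article = [
    "The company cut staff twice in one year, and morale never recovered.",
    "\u201cMorale never recovered.\u201d",
    "Executives declined to comment.",
]
cleaned = drop_pullquotes(article)
```

With the sample `article` above, the second entry is dropped because it repeats part of the first. The same check could instead be used to wrap the repeated paragraph in a quote block rather than remove it.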
Native website (https://www.theverge.com/24094310/vice-media-layoffs-bankruptcy-shane-smith):
Omnivore version: