incomplete content on multiple pages #739
Comments
Several other issues already mention header content being removed erroneously. I came here to report the same thing happening on Hackaday.com/blog.
The same happens on https://www.thetimes.co.uk/ across multiple articles: it clips the first one or two paragraphs on every page I've tried. Kind of useless in this state.
On the following pages the parser currently seems to get lost.
I don't see any markup problems.
Maybe the newspapers detect and block the scraper?
https://www.derstandard.at/story/2000145508819/franzoesischer-verfassungsrat-stimmt-umstrittener-pensionsreform-zu
Here a notice is added to the text saying that some "software" is blocking things and should be removed.
https://kurier.at/wirtschaft/atomausstieg-wie-die-abschaltung-eines-kernkraftwerks-funktioniert/402412829
Only one line of text is extracted.
Thanks for the info. Happy to help.