-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Add Support for lxml >= 5.2.0 #217
fix: Add Support for lxml >= 5.2.0 #217
Conversation
Looks like the test failures are related to mf2py, not lxml. Looking into whether we should update the tests (probably) or pin mf2py < 2 (probably not) |
mf2py released v2 a few months ago which has a single breaking change: it includes image alts in microdata by default (see changelog in release). The reasoning for this is in this mf2py issue, but TL;DR the image alt has been part of the microformat spec for several years, and not including it was considered deprecated behavior (and now is removed entirely). To accommodate this I've updated the tests, though it's not explicitly related to the lxml update. |
Co-authored-by: James Addison <[email protected]>
lxml 5.1.0 removed _ElementStringResult which we use during dom traversal. This PR attempts to import _ElementStringResult as usual, but falls back to defining it explicitly if the import fails. This fixes #215
I chose to attempt to import first to preserve existing behavior if an older version of lxml is installed.
Additionally, we use the lxml HTML cleaner, which is now in its own separate package. Due to this I've added the optional html_clean package to the requirements.txt. As noted below, this raises a warning if using lxml < 5.2.0, but otherwise doesn't impact older versions.