STY: Minor code-style improvements for _reader.py #123

Open
wants to merge 46 commits into
base: main

coco-speed
Collaborator

This pull request was created automatically by CodSpeed to track performance changes of the pull request py-pdf/pypdf#2847.

The original branch is upstream/reader-minor-sty

stefan6419846 and others added 30 commits July 28, 2024 17:16
* DEV: Test against Python 3.13

* fix typo

* add missing setup-python

* fix another typo

* update Pillow version

* attempt to update coverage package

* update number of expected coverage files
…ayout mode (py-pdf#2788)

* Handle Sequence as an IndirectObject

The spec allows an int or float to be an IndirectObject as well, but this commit does not address that theoretical possibility.

* Update pypdf/_text_extraction/_layout_mode/_font.py

Co-authored-by: Stefan <[email protected]>

* Address PR comments

-Rename w_1 to w_next_entry
-Utilize ParseError instead of PdfReadError
-Write a test (both positive and negative)

* Handle unlikely case of IndirectObjects for float/int width elements

Also adds a comment to clarify that we don't explicitly handle the IndexError exception. Rather, we let it be raised as an IndexError.

* Yoda condition I removed

* Last commit was a bad patch, confused by non-committed changes

* Use test files from URL rather than resources

* Update tests/test_text_extraction.py

Co-authored-by: pubpub-zz <[email protected]>

* Fix code style warnings in range() call

---------

Co-authored-by: Stefan <[email protected]>
Co-authored-by: pubpub-zz <[email protected]>
Add compress_identical_objects().
Discovered in py-pdf#2728.
Closes py-pdf#2794.
Closes py-pdf#2768.
Cope with objects where the filter is ["/FlateDecode"] and/or where data has not been read yet.
* DEV: Fix coverage uploads

Starting 2024-09-02, hidden files are ignored by default: https://redirect.github.com/actions/upload-artifact/issues/602

* list files

* no need to list files
* STY: Use f-string = functionality

* STY: Use f-string = functionality

* STY: Use f-string = functionality

Also switch the order of a tuple to match the order of the line above.

---------

Co-authored-by: pubpub-zz <[email protected]>
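
For readers unfamiliar with the "f-string = functionality" named in the commits above, here is a minimal sketch of the self-documenting `=` specifier (available since Python 3.8). The variable names and values are hypothetical and are not taken from the pypdf code base.

```python
# Hypothetical values, for illustration only.
page_number = 3
width = 612.0
font_name = "Helvetica"

# Before: the variable name is repeated by hand inside the string.
old_style = f"page_number={page_number}, width={width}"

# After: the "=" specifier emits the expression text followed by its value.
new_style = f"{page_number=}, {width=}"

# Note that "=" formats with repr() by default, so strings gain quotes.
with_repr = f"{font_name=}"

print(new_style)
print(with_repr)
```

Because `=` defaults to `repr()`, the rewritten message can differ from the original for string values; appending `!s` (as in `f"{font_name=!s}"`) restores `str()` formatting where that matters.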
visitor* function arguments are silently ignored when
extraction_mode="layout".  Document this a bit better and add a warning
when these arguments are ignored.

Closes py-pdf#2840.
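
The behaviour described in this commit message can be sketched roughly as follows. This is not pypdf's implementation: `extract_text_sketch` and its reduced parameter list are hypothetical, and whether the real change reports through the `warnings` module or a logger is not stated in the message.

```python
import warnings
from typing import Callable, Optional


def extract_text_sketch(
    extraction_mode: str = "plain",
    visitor_text: Optional[Callable[..., None]] = None,
) -> str:
    """Hypothetical stand-in for a text-extraction entry point."""
    if extraction_mode == "layout" and visitor_text is not None:
        # Tell the caller instead of silently dropping the argument.
        warnings.warn(
            "visitor_text is ignored when extraction_mode='layout'",
            UserWarning,
            stacklevel=2,
        )
    return ""  # placeholder: real extraction is out of scope for this sketch
```

Calling `extract_text_sketch("layout", visitor_text=print)` emits one warning, while the same call with `extraction_mode="plain"` stays silent.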
pubpub-zz and others added 16 commits September 14, 2024 14:20
test_image_without_pillow runs a generated script which causes the
Python path to exclude the current directory.  The generated script
tries to import pypdf and either cannot find it or it finds the
version in pyenv instead of the version being tested.  Add "." to
PYTHONPATH so the correct version of pypdf is used.

Closes py-pdf#2849
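
To illustrate the mechanism behind this fix: when Python runs a script by path, the first `sys.path` entry is the script's own directory rather than the current working directory, so a package in the checkout is not importable unless it is put on `PYTHONPATH`. The sketch below is an assumption about how such a test helper might look. `run_generated_script` is a hypothetical name, and it passes an explicit project root where the commit adds ".".

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_script(source: str, project_root: Path) -> subprocess.CompletedProcess:
    """Run `source` as a standalone script with `project_root` on PYTHONPATH."""
    env = os.environ.copy()
    existing = env.get("PYTHONPATH")
    # Prepend the project root so the checked-out package shadows any installed copy.
    if existing:
        env["PYTHONPATH"] = os.pathsep.join([str(project_root), existing])
    else:
        env["PYTHONPATH"] = str(project_root)

    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "generated_script.py"
        script.write_text(source, encoding="utf-8")
        # The script lives outside the project, so only PYTHONPATH makes
        # the project's modules importable from it.
        return subprocess.run(
            [sys.executable, str(script)],
            env=env,
            capture_output=True,
            text=True,
            check=False,
        )
```

A test could then call `run_generated_script("import pypdf", Path.cwd())` and check `returncode`, assuming it is started from the repository root.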
Co-authored-by: pubpub-zz <[email protected]>
## Version 5.0.0, 2024-09-15

This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).


### Deprecations (DEP)
- Remove the deprecated PdfMerger and AnnotationBuilder classes and other deprecations cleanup (py-pdf#2813)
- Drop Python 3.7 support (py-pdf#2793)

### New Features (ENH)
- Add capability to remove /Info from PDF (py-pdf#2820)
- Add incremental capability to PdfWriter (py-pdf#2811)
- Add UniGB-UTF16 encodings (py-pdf#2819)
- Accept utf strings for metadata (py-pdf#2802)
- Report PdfReadError instead of RecursionError (py-pdf#2800)
- Compress PDF files merging identical objects (py-pdf#2795)

### Bug Fixes (BUG)
- Fix sheared image (py-pdf#2801)

### Robustness (ROB)
- Robustify .set_data() (py-pdf#2821)
- Raise PdfReadError when missing /Root in trailer (py-pdf#2808)
- Fix extract_text() issues on damaged PDFs (py-pdf#2760)
- Handle images with empty data when processing an image from bytes (py-pdf#2786)

### Developer Experience (DEV)
- Fix coverage uploads (py-pdf#2832)
- Test against Python 3.13 (py-pdf#2776)


[Full Changelog](py-pdf/pypdf@4.3.1...5.0.0)
10 participants