BUG: tolerate truncated files and no warning when jumping startxref #122

Open: wants to merge 36 commits into base: main

Commits on Jul 28, 2024

  1. DEV: Test against Python 3.13 (py-pdf#2776)

    * DEV: Test against Python 3.13
    
    * fix typo
    
    * add missing setup-python
    
    * fix another typo
    
    * update Pillow version
    
    * attempt to update coverage package
    
    * update number of expected coverage files
    stefan6419846 authored Jul 28, 2024
    4bd54bd

Commits on Jul 31, 2024

  1. STY: Remove boolean value comparison (py-pdf#2779)

    PEP 8 recommendation.
    j-t-1 authored Jul 31, 2024
    d4df20d
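
As an illustration of the PEP 8 recommendation this commit refers to, here is a minimal sketch. The function names are hypothetical and not taken from the pypdf code base; they only contrast the two styles.

```python
def status_with_comparison(flag: bool) -> str:
    # Discouraged by PEP 8: comparing a boolean value to True with ==
    if flag == True:  # noqa: E712
        return "ready"
    return "waiting"


def status_with_truthiness(flag: bool) -> str:
    # Preferred: use the truth value of the expression directly
    if flag:
        return "ready"
    return "waiting"
```

For genuine `bool` inputs both functions behave the same; the second form is simply the one PEP 8 recommends.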

Commits on Aug 2, 2024

  1. 3ad9234
  2. SEC: Fix GitHub workflow vulnerable to script injection (py-pdf#2787)

    Signed-off-by: Diogo Teles Sant'Anna <[email protected]>
    diogoteles08 authored Aug 2, 2024
    582557e

Commits on Aug 5, 2024

  1. 38f3925
  2. 09f9b7e
  3. BUG: Handle Sequence as an IndirectObject when extracting text with layout mode (py-pdf#2788)
    
    * Handle Sequence as an IndirectObject
    
    The spec allows an int or float to be an IndirectObject as well, but this commit does not address that theoretical possibility.
    
    * Update pypdf/_text_extraction/_layout_mode/_font.py
    
    Co-authored-by: Stefan <[email protected]>
    
    * Address PR comments
    
    - Rename w_1 to w_next_entry
    - Utilize ParseError instead of PdfReadError
    - Write a test (both positive and negative)
    
    * Handle unlikely case of IndirectObjects for float/int width elements
    
    Also adds a comment to clarify that we don't explicitly handle the IndexError exception. Rather, we let it be raised as an IndexError.
    
    * Yoda condition I removed
    
    * Last commit was a bad patch, confused by non-committed changes
    
    * Use test files from URL rather than resources
    
    * Update tests/test_text_extraction.py
    
    Co-authored-by: pubpub-zz <[email protected]>
    
    * Fix code style warnings in range() call
    
    ---------
    
    Co-authored-by: Stefan <[email protected]>
    Co-authored-by: pubpub-zz <[email protected]>
    3 people authored Aug 5, 2024
    b2d7204

Commits on Aug 7, 2024

  1. 5abd590

Commits on Aug 12, 2024

  1. 219eb13
  2. 46c89dd
  3. a9758ae

Commits on Aug 13, 2024

  1. ENH: Compress PDF files merging identical objects (py-pdf#2795)

    Add compress_identical_objects().
    Discovered in py-pdf#2728.
    Closes py-pdf#2794.
    Closes py-pdf#2768.
    pubpub-zz authored Aug 13, 2024
    cf7fcfd
  2. 2eb565d
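
The idea behind "merging identical objects" (py-pdf#2795) can be sketched in a few lines. This is not the implementation of `compress_identical_objects()`; it is a simplified, assumed model in which each object is already serialized to bytes, and the helper name `merge_identical` is hypothetical.

```python
from typing import Dict, List, Tuple


def merge_identical(objects: List[bytes]) -> Tuple[List[bytes], Dict[int, int]]:
    """Return the unique objects plus a map from old index to new index."""
    unique: List[bytes] = []
    first_seen: Dict[bytes, int] = {}  # serialized content -> index in `unique`
    remap: Dict[int, int] = {}         # old object index -> surviving index
    for old_index, payload in enumerate(objects):
        if payload not in first_seen:
            # First time this content appears: keep it
            first_seen[payload] = len(unique)
            unique.append(payload)
        # Every duplicate is redirected to the first copy
        remap[old_index] = first_seen[payload]
    return unique, remap
```

In a real PDF the references between objects would also have to be rewritten using the returned map, which this sketch leaves out.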

Commits on Aug 14, 2024

  1. d9a8c54

Commits on Aug 15, 2024

  1. 799630d
  2. 454a62a

Commits on Aug 16, 2024

  1. 0c81f3c

Commits on Aug 22, 2024

  1. d2d520b

Commits on Aug 23, 2024

  1. 9f08cd0
  2. b7b3c8c

Commits on Aug 27, 2024

  1. f55d332

Commits on Aug 28, 2024

  1. 38ea8c5

Commits on Aug 29, 2024

  1. ROB: Robustify .set_data() (py-pdf#2821)

    Cope with objects where the filter is ["/FlateDecode"] and/or where data has not been read yet.
    pubpub-zz authored Aug 29, 2024
    82eac7e
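
One way to cope with a filter given either as a single name or as a one-element list such as `["/FlateDecode"]` is to normalize it before comparing. The sketch below is an assumption about the general technique, not the code from py-pdf#2821, and both helper names are hypothetical.

```python
from typing import List, Optional, Union


def normalize_filters(filter_entry: Optional[Union[str, List[str]]]) -> List[str]:
    """Turn a missing, single, or list-valued filter entry into a list."""
    if filter_entry is None:
        return []
    if isinstance(filter_entry, str):
        return [filter_entry]
    return list(filter_entry)


def is_flate_only(filter_entry: Optional[Union[str, List[str]]]) -> bool:
    # True for "/FlateDecode" and for ["/FlateDecode"] alike
    return normalize_filters(filter_entry) == ["/FlateDecode"]
```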

Commits on Sep 5, 2024

  1. DEV: Fix coverage uploads (py-pdf#2832)

    * DEV: Fix coverage uploads
    
    Starting 2024-09-02, hidden files are ignored by default: https://redirect.github.com/actions/upload-artifact/issues/602
    
    * list files
    
    * no need to list files
    stefan6419846 authored Sep 5, 2024
    e694d55

Commits on Sep 6, 2024

  1. DOC: Small changes to PaperSize notes (py-pdf#2834)

    Plus one typo in xmp.py.
    j-t-1 authored Sep 6, 2024
    b85c171

Commits on Sep 11, 2024

  1. 98d4425

Commits on Sep 13, 2024

  1. 9d54f63
  2. STY: Use f-string = functionality (py-pdf#2835)

    * STY: Use f-string = functionality
    
    * STY: Use f-string = functionality
    
    * STY: Use f-string = functionality
    
    Also switch the order of a tuple to match the order of the line above.
    
    ---------
    
    Co-authored-by: pubpub-zz <[email protected]>
    j-t-1 and pubpub-zz authored Sep 13, 2024
    c4e95bd
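
For readers unfamiliar with the f-string `=` specifier mentioned in py-pdf#2835, a short sketch follows. The variable names and values are illustrative only and do not come from the pypdf sources.

```python
width = 612
height = 792

# Older style: the variable name is written out by hand
old_message = f"width={width}, height={height}"

# With the "=" specifier (Python 3.8+), the expression text is emitted automatically
new_message = f"{width=}, {height=}"
```

Note that `=` formats the value with `repr()` by default, so the two styles give the same text for integers but differ for strings, which gain quotes.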

Commits on Sep 14, 2024

  1. BUG: Warn when visitor* arguments are ignored (py-pdf#2845)

    visitor* function arguments are silently ignored when
    extraction_mode="layout".  Document this a bit better and add a warning
    when these arguments are ignored.
    
    Closes py-pdf#2840.
    kaos-ocs authored Sep 14, 2024
    78baa8f
  2. a790532
  3. 1bbc301
  4. 8ebd311
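
The behaviour described in py-pdf#2845 above (warning when visitor arguments are ignored in layout mode) could look roughly like the sketch below. The function `extract_text_sketch` is a hypothetical stand-in, and the warning text and category are assumptions rather than what pypdf actually emits.

```python
import warnings
from typing import Callable, Optional


def extract_text_sketch(
    extraction_mode: str = "plain",
    visitor_text: Optional[Callable[..., None]] = None,
) -> str:
    # Hypothetical stand-in: only the argument check is modelled here
    if extraction_mode == "layout" and visitor_text is not None:
        warnings.warn(
            "visitor_text is ignored when extraction_mode='layout'",
            UserWarning,
            stacklevel=2,  # point the warning at the caller
        )
    return ""  # no real extraction is performed in this sketch
```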

Commits on Sep 15, 2024

  1. BUG: test_image_without_pillow cannot find pypdf (py-pdf#2850)

    test_image_without_pillow runs a generated script which causes the
    Python path to exclude the current directory.  The generated script
    tries to import pypdf and either cannot find it or it finds the
    version in pyenv instead of the version being tested.  Add "." to
    PYTHONPATH so the correct version of pypdf is used.
    
    Closes py-pdf#2849
    kaos-ocs authored Sep 15, 2024
    8eefba8
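
Adding "." to PYTHONPATH for a child process can be sketched as follows. This is an assumed illustration of the technique, not the test code changed in py-pdf#2850, and the helper name `env_with_cwd_on_path` is hypothetical.

```python
import os
import subprocess
import sys
from typing import Dict, Optional


def env_with_cwd_on_path(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and put "." at the front of PYTHONPATH."""
    env = dict(os.environ if base is None else base)
    existing = env.get("PYTHONPATH")
    # Keep any existing entries, but let the current directory win
    env["PYTHONPATH"] = "." if not existing else "." + os.pathsep + existing
    return env


def run_script(path: str) -> subprocess.CompletedProcess:
    # Run a generated script so that `import pypdf` resolves to the checkout
    return subprocess.run(
        [sys.executable, path],
        env=env_with_cwd_on_path(),
        capture_output=True,
        text=True,
    )
```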

Commits on Sep 17, 2024

  1. REL: 5.0.0 (py-pdf#2851)

    ## Version 5.0.0, 2024-09-15
    
    This version drops support for Python 3.7 (not maintained since July 2023), PdfMerger (use PdfWriter instead) and AnnotationBuilder (use annotations instead).
    
    
    ### Deprecations (DEP)
    - Remove the deprecated PdfMerger and AnnotationBuilder classes and other deprecations cleanup (py-pdf#2813)
    - Drop Python 3.7 support (py-pdf#2793)
    
    ### New Features (ENH)
    - Add capability to remove /Info from PDF (py-pdf#2820)
    - Add incremental capability to PdfWriter (py-pdf#2811)
    - Add UniGB-UTF16 encodings (py-pdf#2819)
    - Accept utf strings for metadata (py-pdf#2802)
    - Report PdfReadError instead of RecursionError (py-pdf#2800)
    - Compress PDF files merging identical objects (py-pdf#2795)
    
    ### Bug Fixes (BUG)
    - Fix sheared image (py-pdf#2801)
    
    ### Robustness (ROB)
    - Robustify .set_data() (py-pdf#2821)
    - Raise PdfReadError when missing /Root in trailer (py-pdf#2808)
    - Fix extract_text() issues on damaged PDFs (py-pdf#2760)
    - Handle images with empty data when processing an image from bytes (py-pdf#2786)
    
    ### Developer Experience (DEV)
    - Fix coverage uploads (py-pdf#2832)
    - Test against Python 3.13 (py-pdf#2776)
    
    
    [Full Changelog](py-pdf/pypdf@4.3.1...5.0.0)
    pubpub-zz authored Sep 17, 2024
    637bc44
  2. c00ec60
  3. 7994a83