Ordered Output of Split Pages #25

Open
TheOwl57 opened this issue May 27, 2021 · 5 comments
Labels
enhancement New feature or request

Comments

TheOwl57 commented May 27, 2021

Awesome module which I have used to sort through large PDF files at incredible speeds.

First time posting anything on GitHub, so I hope this is acceptable.

The only issue I have is that when splitting documents with a large number of pages, the [CustomSplitter] class names each file by its bare page number. Without leading zeros the files do not sort in page order (for example, 10 sorts before 2), which makes it hard to read through the split files in order.

Suggest expanding the file name to include leading zeros. I have successfully been able to modify the [CustomSplitter] Class to do this with the below code:

class CustomSplitter : iText.Kernel.Utils.PdfSplitter {
    [int] $_order
    [string] $_destinationFolder
    [string] $_outputName

    CustomSplitter([iText.Kernel.Pdf.PdfDocument] $pdfDocument, [string] $destinationFolder, [string] $OutputName) : base($pdfDocument) {
        $this._destinationFolder = $destinationFolder
        $this._order = 1
        $this._outputName = $OutputName
    }

    [iText.Kernel.Pdf.PdfWriter] GetNextPdfWriter([iText.Kernel.Utils.PageRange] $documentPageRange) {
        $Name = -join ($this._outputName, $this._order.ToString("D4"), ".pdf")
        $Path = [IO.Path]::Combine($this._destinationFolder, $Name)
        $this._order++
        return [iText.Kernel.Pdf.PdfWriter]::new($Path)
    }
}

"$this._order = 1" as a start for page 1.
"$this._order.ToString("D4")" will handle files that are up to 9999 pages long, so shouldn't push the limits too often.
"$this._order++" to increment to the next page number.

Ideally, if I had time, I would expand this to read the total number of pages before splitting and set the number of leading zeros from that, so the naming convention adapts to each document.
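
A minimal sketch of that dynamic-padding idea, independent of iText. The function name `Get-PaddedPageName` and its parameters are hypothetical and are not part of PSWritePDF; the total page count is assumed to be known already.

```powershell
# Hypothetical helper (not part of PSWritePDF): builds a zero-padded file name
# whose width is derived from the total page count.
function Get-PaddedPageName {
    param(
        [string] $OutputName,
        [int] $Order,
        [int] $TotalPages
    )
    # Number of digits in the highest page number, e.g. 250 -> 3
    $Width = $TotalPages.ToString().Length
    # "D3" pads an integer with leading zeros to at least 3 digits
    -join ($OutputName, $Order.ToString("D$Width"), '.pdf')
}

Get-PaddedPageName -OutputName 'Report' -Order 7 -TotalPages 250
# -> Report007.pdf
```

Deriving the width from the page count keeps names short for small documents while still sorting correctly for large ones.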

Tested this to work with both 0.0.10 and 0.0.17.

Thanks again for the module.

@PrzemyslawKlys
Member

This seems like a nice idea. Using Get-PDFDetails one could get the number of pages and, based on that, add leading zeros to keep the naming convention nice and tidy.

$NumberOfPages = 10000
$number = 100
([string]$number).PadLeft($NumberOfPages.ToString().length,'0')

PrzemyslawKlys added the enhancement (New feature or request) label May 27, 2021
@PrzemyslawKlys
Member

@TheOwl57 would you consider making a PR?

@TheOwl57
Author

Sorry, I'm very new to GitHub and still trying to figure it out, but yes, I would be happy to create a PR. I have gone further and have some ideas on how to compute the padding on the fly. Something like:

$Reader = [iText.Kernel.Pdf.PdfReader]::New($File)
$PDFLength = ([iText.Kernel.Pdf.PdfDocument]::new($Reader).GetNumberOfPages()).ToString().Length
$Order.ToString("D$($PDFLength)")

@PrzemyslawKlys
Member

The easiest way to "manage PR" is to follow what I've written in #12 and do it from the GitHub GUI.

However, I would encourage you to "learn" GitHub a bit, as it will be useful in the future. Let me know if you would be able to make that PR.

rpascolo commented Jul 18, 2023

This is what I use (PSWritePDF.psm1)

class CustomSplitter : iText.Kernel.Utils.PdfSplitter {
    [int] $_order
    [string] $_destinationFolder
    [string] $_outputName
    [string] $_Mask

    CustomSplitter([iText.Kernel.Pdf.PdfDocument] $pdfDocument, [string] $destinationFolder, [string] $OutputName) : base($pdfDocument) {
        $this._destinationFolder = $destinationFolder
        $this._order = 1 # start at 1 instead of 0
        $this._outputName = $OutputName
        # One "0" per digit of the page count, e.g. 250 pages gives "000"
        $this._Mask = ("0" * ($pdfDocument.GetNumberOfPages()).ToString().Length)
    }

    [iText.Kernel.Pdf.PdfWriter] GetNextPdfWriter([iText.Kernel.Utils.PageRange] $documentPageRange) {
        $Name = -join ($this._outputName, $this._order.ToString($this._Mask), ".pdf")
        $this._order++
        $Path = [IO.Path]::Combine($this._destinationFolder, $Name)
        return [iText.Kernel.Pdf.PdfWriter]::new($Path)
    }
}
