Add support for specifying part sizes and stdin as input #3
Hi, thanks for this great tool.
I took a stab at addressing #1 by allowing tar output to be piped directly to tarsplitter when the input is specified as '-', which avoids the need for a large intermediate tar file. For example:
tar -cvf - . | tarsplitter -i - -s 1G -o /tmp/archive-
This will create tar files that are at most 1GB in size, though individual sizes will vary depending on the input files and how they're sorted.
This comes at the cost of an external dependency (https://github.com/c2h5oh/datasize), but I hope you'll agree that it's best not to reinvent the wheel for parsing human-readable sizes.
The -p option doesn't make sense when the input is stdin, since we can't know the total input size in advance to calculate the part size. Similarly, if -s is not provided when the input is stdin, no splitting will occur. In both cases only a single tar file will be created, which defeats the purpose of using tarsplitter, so the user should always specify -s together with '-i -'. Maybe we should enforce this explicitly, but I didn't think it was necessary.
On a separate note, I didn't test this with -m archive, which I think should be removed from tarsplitter, leaving this functionality to tar itself now that stdin is supported. I would also consider deprecating -p, since the user usually wants control over the part size and not the number of files produced.
Ideally we should have unit/functional tests for all this, but I'll leave that for another PR. :)
Cheers,
Ivan