A proposal of a framework to handle multi-image augmentation (Including mosaic augmentation) #1420
I propose a new batch-based augmentation framework as a natural extension of the `Compose` framework. With this feature, I want to make it easy to implement augmentations that use multiple image inputs, such as Mosaic and MixUp augmentations.
I hope you will be interested and would appreciate it if you could review this PR.
About the PR
As a natural extension of `Compose`, I introduced a new batch-based compose, `BatchCompose`, and associated classes.
In `BatchCompose`, we can seamlessly combine single-image and multi-image transforms with minimal constraints.
`BatchCompose` supports most features provided by `Compose`, and the existing transforms contained in a `BatchCompose`, such as `OneOf` and `HorizontalFlip`, work as expected.
To demonstrate how this framework works, I include a complete implementation of a Mosaic augmentation in this PR.
Thanks to the new framework, the implementation and usage are much simpler and cleaner than my previous PR #1147.
This is an example:
(You can see the complete code in the last section below)
The inputs and output are here:
`BatchCompose` expects batched targets as inputs, so the outputs are also batched.
`ForEach` is a helper container introduced in this PR. It bridges single-image transforms and multi-image transforms.
`Mosaic4` is an example of a multi-image transform.
Note that this example uses all standard targets (`image`, `bboxes`, `mask`, `keypoints`, and `label_fields`), and they work as expected.
Another demo shows how powerful the framework is. (You do not need to understand the details of the transforms.)
Note that the input is a single-image batch. `Repeat` is another helper container that applies transforms `n` times and concatenates the output batches.
Combining `ForEach` and `Repeat` makes this batch-based mechanism very powerful and flexible.
How to use
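As a rough mental model of the helper containers described above, the following pure-Python sketch mimics what `ForEach` and `Repeat` do to a batch. It does not use the classes from this PR: `for_each`, `repeat`, and `hflip` are hypothetical stand-ins, images are plain nested lists, and the only convention borrowed from the PR is the `_batch` key suffix.

```python
from typing import Callable, Dict, List

# A "batch" is a dict whose values are equal-length lists, one entry per sample.
# Assumption: keys end with "_batch"; all names below are illustrative only.
Batch = Dict[str, List]
Sample = Dict[str, object]
SUFFIX = "_batch"


def for_each(transform: Callable[[Sample], Sample]) -> Callable[[Batch], Batch]:
    """Lift a single-sample transform so it runs on every sample of a batch."""

    def apply(batch: Batch) -> Batch:
        size = len(next(iter(batch.values())))
        out: Batch = {key: [] for key in batch}
        for i in range(size):
            # Strip the suffix so the wrapped transform sees plain target names.
            sample = {key[: -len(SUFFIX)]: values[i] for key, values in batch.items()}
            result = transform(sample)
            for key in batch:
                out[key].append(result[key[: -len(SUFFIX)]])
        return out

    return apply


def repeat(transform: Callable[[Batch], Batch], n: int) -> Callable[[Batch], Batch]:
    """Apply a batch transform n times to the same input and concatenate the outputs."""

    def apply(batch: Batch) -> Batch:
        out: Batch = {key: [] for key in batch}
        for _ in range(n):
            result = transform(batch)
            for key in out:
                out[key].extend(result[key])
        return out

    return apply


def hflip(sample: Sample) -> Sample:
    """Toy single-image transform: mirror every row of a nested-list image."""
    return {**sample, "image": [row[::-1] for row in sample["image"]]}


# A single-image batch grows into a four-image batch.
single = {"image_batch": [[[1, 2], [3, 4]]]}
pipeline = repeat(for_each(hflip), n=4)
result = pipeline(single)
```

In this sketch a single-image batch becomes a four-image batch, which is the size a four-image mosaic transform would need as input.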
Users should follow these rules to work with `BatchCompose`. I think complying with these constraints is not hard, and they cover most use cases.
- Target names take `_batch` suffixes. This rule also applies to the `label_fields` and `additional_targets` parameters.
- Single-image transforms are wrapped in `ForEach`.
- Multi-image transforms are subclasses of `BatchBasedTransform`.
Mosaic augmentation
This PR includes an implementation of the mosaic augmentation.
Implementation is straightforward, except that it is batch-based.
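To illustrate why the core of a mosaic is simple once the inputs arrive as a batch, here is a minimal pure-Python sketch. It is not the `Mosaic4` code from this PR: the function `mosaic4`, the fixed 2x2 tiling without random cropping, the nested-list images, and the pixel-coordinate `(x_min, y_min, x_max, y_max)` box format are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]           # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels


def mosaic4(images: Sequence[Image], bboxes: Sequence[Sequence[Box]]):
    """Tile four equal-sized images into a 2x2 grid and shift boxes to match."""
    if len(images) != 4 or len(bboxes) != 4:
        raise ValueError("mosaic4 needs exactly four images and four box lists")
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must share the same size in this sketch")

    # Concatenate rows side by side, then stack the two halves vertically.
    top = [left + right for left, right in zip(images[0], images[1])]
    bottom = [left + right for left, right in zip(images[2], images[3])]
    canvas = top + bottom

    # Each tile's boxes move by the offset of that tile inside the canvas.
    offsets = [(0, 0), (width, 0), (0, height), (width, height)]
    shifted = [
        (x0 + dx, y0 + dy, x1 + dx, y1 + dy)
        for boxes, (dx, dy) in zip(bboxes, offsets)
        for (x0, y0, x1, y1) in boxes
    ]
    return canvas, shifted


# Four 2x2 images filled with 0, 1, 2, 3, with a few boxes attached.
image_batch = [[[k, k], [k, k]] for k in range(4)]
bboxes_batch = [[(0, 0, 1, 1)], [], [(1, 0, 2, 2)], [(0, 1, 2, 2)]]
canvas, boxes = mosaic4(image_batch, bboxes_batch)
```

The batch-based part is that the transform receives all four images and their targets at once, so the boxes can be shifted in the same step that builds the canvas.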
One notable feature I added is the `out_batch_size` parameter. This allows the user to specify the output's batch size.
Similar behavior can be achieved using a `Repeat` container, but with `out_batch_size` each transform can control the behavior in more detail.
I think supporting `out_batch_size` enriches the flexibility of batch-based transforms.
Compatibility
`BatchCompose` is implemented as a subclass of `Compose`.
I have modified `Compose` to add customization points but kept its behavior unchanged.
Future work
Currently, `BatchBasedTransform` does not support the functionality of `ReplayCompose`.
If this PR is approved, I will work on this in a separate PR.
Mosaic augmentation is currently the only example. I hope this framework can accelerate support for other multi-image transforms.
Full example code
Most lines are for data preparation and visualization. The only important parts are those already quoted.