Evaluate how images are resized in the repo #89

Open
tlpss opened this issue Sep 15, 2023 · 2 comments
tlpss commented Sep 15, 2023

Recently I've discovered that cv2.resize and PIL.Image.resize have quite different implementations.

In particular, OpenCV does not scale the size of the filter when downscaling: even if you downscale by a factor of 100, only the neighbouring pixels are used to compute the interpolated value. This results in aliasing artifacts. Most notably, sharp edges get 'staircase' artifacts, which get worse as the downscaling factor increases. (Funny side note: downscaling by 2x twice gives better results than downscaling by 4x once.) Pillow uses an adaptive filter size, which avoids aliasing by 'smoothing' the image. The downscaled image contains no such artifacts but is a bit 'blurred'.
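To make the difference concrete, here is a minimal 1D sketch in pure Python. It is a simplified model written for this issue, not the actual OpenCV or Pillow code: `downscale_fixed` always interpolates between the two nearest source pixels, while `downscale_adaptive` uses a triangle filter whose radius grows with the downscaling factor. The function names and the striped test row are made up for illustration.

```python
def stripes(n, period=3):
    """1D 'image' row with a thin bright line every `period` pixels."""
    return [1.0 if k % period == 0 else 0.0 for k in range(n)]


def downscale_fixed(row, factor):
    """Linear interpolation with a fixed 2-pixel footprint, whatever the factor."""
    out = []
    for i in range(len(row) // factor):
        center = (i + 0.5) * factor - 0.5  # output pixel centre in source coordinates
        lo = int(center)
        hi = min(lo + 1, len(row) - 1)
        frac = center - lo
        out.append(row[lo] * (1.0 - frac) + row[hi] * frac)
    return out


def downscale_adaptive(row, factor):
    """Triangle filter whose radius is scaled by the downscaling factor."""
    out = []
    for i in range(len(row) // factor):
        center = (i + 0.5) * factor - 0.5
        weights = {}
        for k in range(len(row)):
            w = 1.0 - abs(k - center) / factor
            if w > 0:
                weights[k] = w
        total = sum(weights.values())  # renormalise, also handles the borders
        out.append(sum(w * row[k] for k, w in weights.items()) / total)
    return out


def spread(values):
    """Max minus min: large values mean the stripes aliased into a visible pattern."""
    return max(values) - min(values)


row = stripes(24)
once = downscale_fixed(row, 4)                        # 4x in one step
twice = downscale_fixed(downscale_fixed(row, 2), 2)   # 2x, then 2x again
adaptive = downscale_adaptive(row, 4)                 # 4x with a scaled filter

print("fixed, 4x once :", once, spread(once))
print("fixed, 2x twice:", twice, spread(twice))
print("adaptive, 4x   :", adaptive, spread(adaptive))
```

The average brightness of the row is 1/3. With the fixed footprint, the 4x output jumps between 0 and 0.5 because most source pixels are never looked at. Two 2x steps already reduce that variation, and the adaptive filter stays close to 1/3 everywhere, at the cost of blurring the individual lines away.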

More information: https://arxiv.org/pdf/2104.11222.pdf, https://zuru.tech/blog/the-dangers-behind-image-resizing
Illustration: [image]

This also impacts deep learning training. I've found that mAP can differ by up to 5% when resizing with PIL on the train set and with cv2 on the test set, compared to using the same resize method for both (downscaling by a factor of 4).

Long story short, we should probably use the same downsampling method consistently throughout the repo, so that resizing artifacts do not influence inference/test performance.

I've now used Pillow's bicubic interpolation in the coco tools and suggest setting that as the default throughout the repo.

@Victorlouisdg, the image_transforms in the camera toolkit is the first place in the repo that comes to mind but there might be other places where we downscale images.

@tlpss tlpss added the enhancement New feature or request label Sep 15, 2023
@Victorlouisdg Victorlouisdg self-assigned this Dec 5, 2023
@Victorlouisdg

I ran some quick performance tests. OpenCV is at least 10x faster than PIL at downscaling from (1920, 1080) to (320, 240). Because the ImageTransforms are also intended for real-time visualization, I'll likely leave OpenCV as the default. However, I have to investigate the quality concerns further. Possible actions:

  • Use cv2.INTER_CUBIC (OpenCV's bicubic flag) instead of the default cv2.INTER_LINEAR, because the speed is similar.
  • Add a constructor argument to the Resize class to select between the OpenCV and PIL resize implementations.
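A rough sketch of what the second option could look like. This is not the existing `Resize` transform from the camera toolkit: the constructor signature, the `backend` argument and its values are assumptions made up for illustration. It keeps OpenCV as the default for speed and uses bicubic interpolation in both backends (`cv2.INTER_CUBIC` and `Image.BICUBIC`). The libraries are imported lazily so that only the selected backend has to be installed.

```python
class Resize:
    """Hypothetical resize transform with a selectable backend (sketch only)."""

    BACKENDS = ("opencv", "pil")

    def __init__(self, width, height, backend="opencv"):
        if backend not in self.BACKENDS:
            raise ValueError(f"backend must be one of {self.BACKENDS}, got {backend!r}")
        self.width = width
        self.height = height
        self.backend = backend

    def __call__(self, image):
        """Resize an HxWxC uint8 numpy array and return a numpy array."""
        size = (self.width, self.height)  # both libraries expect (width, height)

        if self.backend == "opencv":
            import cv2

            # Fast, but the filter footprint does not grow with the scale factor.
            return cv2.resize(image, size, interpolation=cv2.INTER_CUBIC)

        import numpy as np
        from PIL import Image

        # Slower, but the filter support is scaled with the downscaling factor.
        resized = Image.fromarray(image).resize(size, resample=Image.BICUBIC)
        return np.asarray(resized)
```

Usage would then be something like `Resize(320, 240)` for real-time visualization and `Resize(320, 240, backend="pil")` when preparing images for training or evaluation, so that both use the same method as the coco tools.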

image

@Victorlouisdg

Notebook to run the comparison: https://gist.github.com/Victorlouisdg/27c22053e9c9be2a71f2912aaecb3431
