-
Hi, I am performing a segmentation task using Albumentations and PyTorch. During my first attempts I noticed that the mask tensor keeps the dtype uint8 after the transform, which ends up causing a runtime error later in PyTorch. In the end, I figured out that it was necessary to change the format of the mask with mask_img.astype(np.float32) inside the dataset.

Let me describe the situation. Say I created a custom dataset that uses OpenCV to fit nicely with Albumentations:

import cv2
import numpy as np
import pandas as pd
from torch.utils.data import Dataset, DataLoader

class MyCustomDataset(Dataset):
    def __init__(self, csv_file, transform=None):
        self.files = pd.read_csv(csv_file)
        self.transform = transform

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        input_img_path = self.files.inputs[index]
        input_img_bgr = cv2.imread(input_img_path)
        input_img = cv2.cvtColor(input_img_bgr, cv2.COLOR_BGR2RGB)
        mask_img_path = self.files.masks[index]
        mask_img_bgr = cv2.imread(mask_img_path)
        mask_img = cv2.cvtColor(mask_img_bgr, cv2.COLOR_BGR2RGB)
        mask_img = mask_img.astype(np.float32)  # NEED TO CHANGE FORMAT HERE!
        if self.transform:
            augmentation = self.transform(image=input_img, mask=mask_img)
            input_img = augmentation["image"]
            mask_img = augmentation["mask"]
        return input_img, mask_img

Then, to apply some transformations, I use:

import albumentations as A
from albumentations.pytorch import ToTensorV2

transform_train = A.Compose([
    A.Normalize(
        mean=[0, 0, 0],
        std=[1, 1, 1],
        max_pixel_value=255.0,
    ),
    ToTensorV2(transpose_mask=True),
])

dataset = MyCustomDataset("dataset.csv", transform=transform_train)

Then, I create a data loader. However, if I DON'T include the mask_img.astype(np.float32) line in __getitem__, the mask that comes out of the loader is still uint8:

data_loader = DataLoader(dataset)
for x, y in data_loader:
    print(x.dtype, y.dtype)  # prints: torch.float32 torch.uint8

Isn't ToTensorV2 supposed to convert the mask to a float tensor as well? By inspecting the source code (see the two links below), I noticed that the functions that convert the image and the mask to tensors do not change the mask dtype. So I'm left with the question: is calling astype(np.float32) inside __getitem__, before the transform, the right way to handle this, or should the conversion happen somewhere else?
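To make the failure mode concrete, here is a minimal sketch of the kind of dtype error a uint8 mask can trigger downstream; the loss function is only an illustrative assumption, not necessarily the one used in the original project:

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()
pred = torch.randn(2, 3, 4, 4)                     # float32 model output
mask = torch.zeros(2, 3, 4, 4, dtype=torch.uint8)  # uint8 mask straight from the loader

# criterion(pred, mask) raises a RuntimeError because the target dtype
# does not match the float32 prediction; converting the mask first works:
loss = criterion(pred, mask.to(torch.float32))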
-
Looks like we need to add a control flag to change the mask dtype inside ToTensorV2. For now, it is better to call

mask = mask.to(torch.float32)

after calling the transform, because processing a uint8 mask is much faster.
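In other words, a minimal sketch of how __getitem__ from the question could look with this suggestion, assuming import torch in addition to the imports already shown above (whether the mask values also need scaling or class-index mapping depends on the task):

    def __getitem__(self, index):
        input_img_path = self.files.inputs[index]
        input_img = cv2.cvtColor(cv2.imread(input_img_path), cv2.COLOR_BGR2RGB)
        mask_img_path = self.files.masks[index]
        mask_img = cv2.cvtColor(cv2.imread(mask_img_path), cv2.COLOR_BGR2RGB)
        # keep the mask as uint8 here so the augmentations run on the cheaper dtype
        if self.transform:
            augmentation = self.transform(image=input_img, mask=mask_img)
            input_img = augmentation["image"]
            mask_img = augmentation["mask"].to(torch.float32)  # convert only at the end
        return input_img, mask_img

Converting after the transform keeps the Albumentations pipeline working on uint8, which is what makes it faster, while PyTorch still receives a float32 target.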