This repository has been archived by the owner on Dec 14, 2023. It is now read-only.

First GPU occupies more VRAM in distributed training #66

Open
suzhenghang opened this issue May 22, 2023 · 0 comments
Labels
bug Something isn't working

Comments


suzhenghang commented May 22, 2023

The cached latents should be loaded with an explicit map_location (link):

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
cached_latent = torch.load(self.cached_data_list[index], map_location=device)

Otherwise, in multi-GPU distributed training, the first GPU may occupy excessive VRAM compared to the other GPUs.
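A minimal sketch of the idea, assuming the dataset exposes a cached_data_list attribute as in the snippet above and that each process's local rank is available via the LOCAL_RANK environment variable (set by torchrun/accelerate); the class name CachedLatentDataset is hypothetical and may not match the repository's actual dataset class:

import os
import torch
from torch.utils.data import Dataset

class CachedLatentDataset(Dataset):
    # Hypothetical wrapper illustrating a per-rank map_location.
    def __init__(self, cached_data_list):
        self.cached_data_list = cached_data_list

    def __len__(self):
        return len(self.cached_data_list)

    def __getitem__(self, index):
        # Resolve the device for this process. A bare "cuda" resolves to
        # cuda:0 in every rank, so the cached latents of all workers would
        # pile up on the first GPU.
        if torch.cuda.is_available():
            local_rank = int(os.environ.get("LOCAL_RANK", "0"))
            device = torch.device(f"cuda:{local_rank}")
        else:
            device = torch.device("cpu")
        return torch.load(self.cached_data_list[index], map_location=device)

Loading with map_location="cpu" and moving the tensors to the right device later in the training loop is an equally valid alternative, especially if the DataLoader uses worker processes.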

@ExponentialML ExponentialML added the bug Something isn't working label Jun 25, 2023