About the speed of multi-gpu training #9
Comments
Hi, can I ask how much GPU memory is needed to train this model? I need to check whether my GPU has enough memory to try it.
@sunnyHelen ~18G.
Ok. Thanks a lot~
@LiewFeng
Hi, @Cc-Hy. Sorry for the late reply. The command is the same as the one provided in GETTING_STARTED.md. I didn't modify the batch size.
Experiments are conducted on the KITTI train split.
@LiewFeng So I think your 2-GPU training time is normal. But if your GPUs really are running at very low utilization, you may want to check your CPU status. I once hit a situation where the CPU was the bottleneck and the GPUs could not be fully utilized.
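One way to check whether a CPU-bound data pipeline is the problem is to time how long each training iteration waits for the next batch. The helper below is a minimal sketch of that idea; the name `loader_wait_fraction` and the simulated step time are illustrative assumptions, not part of this repository's code.

```python
import time

def loader_wait_fraction(loader, step_time_s=0.0, n_batches=50):
    """Rough fraction of wall time spent blocked on the data loader.

    `step_time_s` stands in for the per-batch GPU compute time; if the
    returned fraction is close to 1, the CPU-side pipeline is the bottleneck.
    """
    it = iter(loader)
    wait = 0.0
    t0 = time.perf_counter()
    for _ in range(n_batches):
        t = time.perf_counter()
        try:
            batch = next(it)           # time spent waiting on data loading
        except StopIteration:
            break
        wait += time.perf_counter() - t
        time.sleep(step_time_s)        # stand-in for the forward/backward pass
    total = time.perf_counter() - t0
    return wait / total if total > 0 else 0.0
```

With a real PyTorch `DataLoader`, a high fraction here usually means raising `num_workers` (and enabling `pin_memory`) is worth trying before anything else.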
Hi, @Cc-Hy. I figured it out. The cause is the PyTorch version. When I ran the experiment with 1 GPU, the PyTorch version was 1.10. When I tried to run with 2 GPUs, it got stuck. I then switched to PyTorch 1.8 and it worked, but 2x slower. I am using an A100, which is about 2x faster than a 3090. I still get stuck with 2 GPUs. It seems to be solved in OpenPCDet. Sadly, that fix doesn't work for me.
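For what it's worth, multi-GPU DDP hangs like this are often NCCL-related. A hedged sketch of common diagnostics and workarounds follows; whether they apply to this exact hang is an assumption, and the launch command is illustrative rather than taken from this repo.

```shell
# Assumption: the 2-GPU hang happens during NCCL initialization or a collective.
export NCCL_DEBUG=INFO        # print NCCL setup logs to locate where it stalls
export NCCL_P2P_DISABLE=1     # disable peer-to-peer transport, a common hang source
# Then relaunch training, e.g. (illustrative command):
# python -m torch.distributed.launch --nproc_per_node=2 train.py --launcher pytorch
```

If `NCCL_DEBUG=INFO` shows the ranks stalling at different points, that usually confirms a communication problem rather than a data-loading one.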
The problem of getting stuck is fixed here, and it works for me.
Hi, @Cc-Hy. When I train the model on the KITTI train split, 2 GPUs take more time than 1 GPU, which is really strange. Did you encounter this phenomenon?