
Multi-GPU training is reducing speed compared to single GPU #217

Open
tanvir-utexas opened this issue Jun 12, 2022 · 0 comments
Comments

@tanvir-utexas

For training with both the baseline and soft-teacher configs, I always get much slower training with more GPUs. For training with 1% labels, single-GPU training shows an estimated training time of about 2 days, while 8 GPUs shows about 5 days. I don't understand the underlying reason. I am using a node with 8 A5000 GPUs. Can anyone tell me how long it should take? What can I do to get a speedup from multi-GPU training? I am badly stuck on this. Any help will be greatly appreciated.
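For reference, here is a minimal sketch (not the SoftTeacher training code; the toy model, batch size, and iteration count are placeholders) of how a PyTorch DistributedDataParallel run can be timed per iteration. Comparing the per-iteration time on 1 GPU vs. 8 GPUs is one way to check whether the slowdown comes from DDP gradient-sync overhead itself rather than from the config's batch or iteration settings. It assumes the script is launched with `torchrun --nproc_per_node=<N> this_file.py`.

```python
# Minimal DDP timing sketch (hypothetical, not part of this repo).
import os
import time

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun sets MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE/LOCAL_RANK for us.
    dist.init_process_group("nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model standing in for the detector; the point is only to time
    # forward/backward plus the gradient all-reduce added by DDP.
    model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024)).cuda()
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    x = torch.randn(32, 1024, device="cuda")  # per-GPU batch (placeholder size)

    torch.cuda.synchronize()
    start = time.time()
    iters = 200
    for _ in range(iters):
        opt.zero_grad()
        loss = model(x).sum()
        loss.backward()  # gradients are all-reduced across GPUs here
        opt.step()
    torch.cuda.synchronize()

    if dist.get_rank() == 0:
        print(f"avg iteration time: {(time.time() - start) / iters * 1000:.1f} ms")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

If the average iteration time on 8 GPUs is close to the single-GPU number, the longer ETA in the real run would likely come from the training config (e.g., how many iterations the schedule runs and how much data each iteration now processes) rather than from multi-GPU communication overhead.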
