Multi GPUs #28
Hi, if you are using accelerate to distribute your model across multiple GPUs, you should add "LlamaDecoderLayer_KIVI" to `no_split_module_classes`, like the sketch below.
In my experience, this may help.
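A minimal sketch of that setup, assuming the `LlamaForCausalLM_KIVI` class and import path from this repo (the checkpoint path is a placeholder):

```python
# Sketch: shard a KIVI model across GPUs with accelerate while keeping
# each decoder layer on a single device. The import path and checkpoint
# path are assumptions; adapt them to your checkout.
from accelerate import dispatch_model, infer_auto_device_map
from models.llama_kivi import LlamaForCausalLM_KIVI  # assumed repo layout

model = LlamaForCausalLM_KIVI.from_pretrained("path/to/llama-checkpoint")

# Listing the decoder layer class tells accelerate never to split one
# layer's weights across two devices, which is what otherwise triggers
# the "found at least two devices, cuda:0 and cuda:1" error.
device_map = infer_auto_device_map(
    model, no_split_module_classes=["LlamaDecoderLayer_KIVI"]
)
model = dispatch_model(model, device_map=device_map)
```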
Could you please provide the original code for testing memory and multi-batch speed?
I am not the paper author nor the repo owner... I am the one who opened issue #24 several months ago, and I have never encountered this error before.
I solved the problem, thank you very much for your help.
I ran mem_spd_test.py and got the following error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
I did not make any changes except the model path.
I manually changed the device and got the same error as in #24.
Any suggestions?