
Multi GPUs #28

Open
yisunlp opened this issue Aug 30, 2024 · 5 comments
Comments

@yisunlp

yisunlp commented Aug 30, 2024

I ran mem_spd_test.py and got the following error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
I did not make any changes except the path of the model.
I manually changed the device and got the same error as #24
Any suggestions?

[screenshot: error traceback]

@xzwj1699

xzwj1699 commented Sep 2, 2024

Hi, if you are using accelerate to distribute your model across multiple GPUs, you should add "LlamaDecoderLayer_KIVI" to no_split_module_classes, like:

device_map = infer_auto_device_map(
    model, no_split_module_classes=["LlamaDecoderLayer_KIVI"], **map_kwargs)
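For context, a minimal configuration sketch of that accelerate workflow might look like the following. The `model` object, the memory budget, and the `dispatch_model` step are assumptions for illustration; this is not tested against the KIVI repo:

```python
# Sketch: dispatch a KIVI LLaMA model across GPUs with accelerate while
# keeping every LlamaDecoderLayer_KIVI intact on a single device.
# `model` is assumed to be an already-loaded KIVI model; the max_memory
# values are illustrative placeholders.
from accelerate import infer_auto_device_map, dispatch_model

device_map = infer_auto_device_map(
    model,
    no_split_module_classes=["LlamaDecoderLayer_KIVI"],
    max_memory={0: "20GiB", 1: "20GiB"},  # adjust to your GPUs
)
model = dispatch_model(model, device_map=device_map)
```

Keeping the decoder layer unsplit matters because its internal tensors (e.g. the quantized KV cache) must live on the same device as the layer's weights; splitting it across GPUs is what produces the cuda:0/cuda:1 mismatch.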

In my experience, the following change may also help:

# Original code, located at KIVI/quant/new_pack.py:232:
# _minmax_along_last_dim[grid](data, mn, mx,
#     data.numel(), data.shape[0], num_groups, group_size,
#     BLOCK_SIZE_N=BLOCK_SIZE_N, num_warps=8)

# Modified code: pin the kernel launch to the tensor's device so Triton
# does not launch on the default device (cuda:0).
with torch.cuda.device(data.device):
    _minmax_along_last_dim[grid](data, mn, mx,
        data.numel(), data.shape[0], num_groups, group_size,
        BLOCK_SIZE_N=BLOCK_SIZE_N, num_warps=8)

# ... some other code ...

with torch.cuda.device(data.device):
    _pack_along_last_dim[grid](bit, data, code, data.shape[0],
        data.shape[1], feat_per_int,
        BLOCK_SIZE_N=BLOCK_SIZE_N,
        num_warps=8)
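The reason the `with torch.cuda.device(...)` guard matters is that kernel launches target the process's current CUDA device, which defaults to cuda:0 regardless of where the input tensor actually lives. A minimal pure-Python stand-in for that current-device pattern (the `device` context manager and `launch_kernel` here are hypothetical, for illustration only; torch.cuda.device behaves analogously):

```python
from contextlib import contextmanager

# Hypothetical stand-in for the runtime's "current device". Kernels are
# launched on whichever device is current, so if the data lives on
# device 1 while the current device is still 0, outputs land on the
# wrong GPU and produce the cuda:0/cuda:1 mismatch error.
_current_device = 0

@contextmanager
def device(index):
    """Temporarily switch the current device, mirroring torch.cuda.device."""
    global _current_device
    previous = _current_device
    _current_device = index
    try:
        yield
    finally:
        _current_device = previous  # always restore on exit

def launch_kernel():
    """Pretend kernel launch: reports the device it would run on."""
    return _current_device

assert launch_kernel() == 0       # default: device 0, wrong for data on 1
with device(1):
    assert launch_kernel() == 1   # guarded: matches the data's device
assert launch_kernel() == 0       # previous device restored afterwards
```

This is why wrapping each Triton launch in `torch.cuda.device(data.device)` fixes the multi-GPU case without changing the kernels themselves.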

@yisunlp
Author

yisunlp commented Sep 2, 2024

I changed my code and got a new error:
[screenshot: error traceback]

@yisunlp
Author

yisunlp commented Sep 2, 2024

Could you please provide the original code for testing memory and multi-batch speed?

@xzwj1699

xzwj1699 commented Sep 2, 2024

I am not the paper author nor the repo owner; I am the one who opened issue #24 several months ago, and I have never encountered this error before.
Good luck.

@yisunlp
Author

yisunlp commented Sep 2, 2024

I solved the problem, thank you very much for your help.
