This repository has been archived by the owner on Sep 12, 2024. It is now read-only.

torch.cuda.OutOfMemoryError: CUDA out of memory. #164

Answered by SeeknnDestroy
BoyYangzai asked this question in Q&A


Hi @BoyYangzai,

Thanks for trying the suggestion and providing detailed feedback. The root cause appears to be the GPU memory requirement of the Llama2-7B-Chat model: it needs roughly 30GB of GPU memory to run effectively, which exceeds the 24GB capacity of a single RTX 3090 in your setup.
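
To confirm the mismatch on your machine, you can compare each card's total memory against the estimate above. A minimal sketch using standard PyTorch calls (the ~30GB figure is the estimate being compared, not something computed from the model):

```python
import torch

# Report each visible GPU and its total memory. An RTX 3090 exposes ~24 GB,
# which is below the ~30 GB estimated for Llama2-7B-Chat here.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_gb = props.total_memory / 1024**3
    print(f"GPU {i}: {props.name}, {total_gb:.1f} GB total")
```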

Since you have multiple GPUs, splitting the model across them could in principle solve the problem. However, setting up multi-GPU (model-parallel) execution adds complexity and is not always straightforward.
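
If you do want to try the multi-GPU route, one common approach is Hugging Face's `accelerate` integration, which can shard the weights across all visible cards. A rough sketch, assuming the model is loaded through `transformers` with the standard hub ID (adjust to however you actually load it, and note that `accelerate` must be installed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # assumed hub ID; swap in your checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)

# device_map="auto" lets accelerate split the layers across all visible GPUs;
# float16 roughly halves the footprint compared to float32.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```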

A more practical solution in your case would be to deploy the Llama2-7B-Chat model to Hugging Face and use it via their hosted API. This way, you can of…
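
A minimal sketch of calling a hosted model through the `huggingface_hub` client, assuming the standard hub ID and a valid access token (both are placeholders here):

```python
from huggingface_hub import InferenceClient

# Hypothetical setup: replace the model ID and token with your own deployment.
client = InferenceClient(model="meta-llama/Llama-2-7b-chat-hf", token="hf_...")

# text_generation sends the prompt to the hosted endpoint, so no local GPU
# memory is needed at all.
response = client.text_generation("Hello, how are you?", max_new_tokens=50)
print(response)
```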

Replies: 6 comments

Answer selected by SeeknnDestroy
Category: Q&A · Labels: None yet · 2 participants
This discussion was converted from issue #161 on November 29, 2023 16:51.