
Llava-7b model Conversion to ONNX and Latency Optimization - OOM error (even after setting paging file size) #1144

Open
Harini-Vemula-2382 opened this issue May 9, 2024 · 2 comments


@Harini-Vemula-2382

Describe the bug
When attempting to run the optimization process with the llm.py script, I encounter a "not enough memory" error, even after increasing the paging file size to the maximum.

To Reproduce
Steps to reproduce the behavior.

Expected behavior
Please advise how to resolve this issue.

Olive config
Add Olive configurations here.

Olive logs
Add logs here.

Other information

  • OS: [e.g. Windows, Linux]
  • Olive version: [e.g. 0.4.0 or main]
  • ONNXRuntime package and version: [e.g. onnxruntime-gpu: 1.16.1]

Additional context
Please help convert and run LLaVA on DirectML.
(Screenshots attached: llava_1, llava_2, Memory_Error)

@jambayk (Contributor) commented May 9, 2024

@PatriceVignola could you look at this? Thanks!

@PatriceVignola (Contributor) commented

Hi @Harini-Vemula-2382,

This error comes from DirectML itself and indicates that the GPU doesn't have enough VRAM to load the model.
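Since the failure is a memory limit rather than an Olive bug, it can help to verify the environment before re-running the export. Below is a minimal, hypothetical pre-flight check (not part of Olive or ONNX Runtime) that reports whether onnxruntime sees the DirectML execution provider and how much physical RAM is free; `onnxruntime` (the `onnxruntime-directml` package on Windows) and `psutil` are treated as optional dependencies, so the script runs even if they are missing.

```python
# Hypothetical pre-flight check before running Olive's llm.py.
# ONNX export of a 7B model can need tens of GB of host memory
# before the GPU is even involved, so check both sides.

def directml_available():
    """Return True if onnxruntime reports the DirectML execution provider."""
    try:
        import onnxruntime as ort
    except ImportError:
        return False  # onnxruntime / onnxruntime-directml not installed
    return "DmlExecutionProvider" in ort.get_available_providers()

def free_ram_gib():
    """Approximate free physical RAM in GiB, or None if psutil is missing."""
    try:
        import psutil
    except ImportError:
        return None
    return psutil.virtual_memory().available / 2**30

if __name__ == "__main__":
    print("DirectML provider available:", directml_available())
    print("Free RAM (GiB):", free_ram_gib())
```

If the provider is missing, the session silently falls back to CPU; if free RAM is low, the paging-file workaround alone will not prevent the DirectML VRAM error described above.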
