add runtimeclass nvidia as a default option for nimcache #177
Comments
This also happens on NIMServices.
Thanks for the suggestion. We'll add it to our backlog. In the meantime, we recommend using a webhook to add the runtimeclass.
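For anyone trying the webhook workaround, here is a minimal sketch of the response a Kubernetes mutating admission webhook could return to set `spec.runtimeClassName` on incoming Pods. The function name `build_runtimeclass_response` is hypothetical, and the sketch assumes the webhook is registered for Pod CREATE requests and that a RuntimeClass named `nvidia` exists in the cluster. The HTTP server, TLS setup, and `MutatingWebhookConfiguration` are left out, and this is not something the operator ships.

```python
import base64
import json


def build_runtimeclass_response(admission_review: dict, runtime_class: str = "nvidia") -> dict:
    """Build an AdmissionReview response that sets spec.runtimeClassName on a Pod.

    Hypothetical helper: assumes the incoming request object is a Pod.
    """
    request = admission_review["request"]
    pod_spec = request["object"].get("spec", {})

    patch = []
    # Only patch Pods that do not already declare a runtime class.
    if not pod_spec.get("runtimeClassName"):
        patch.append(
            {"op": "add", "path": "/spec/runtimeClassName", "value": runtime_class}
        )

    response = {"uid": request["uid"], "allowed": True}
    if patch:
        # The API server expects the JSON Patch document base64-encoded.
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()

    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

In practice the webhook would probably be scoped with a namespace or object selector so that only the relevant Pods are mutated.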
Can I use the patch command for NIMCache?
@jxdn thanks for catching this. We didn't hit this error as
This PR should fix it for NIM Service deployments. For caching, we should not need to specify the "nvidia" runtime class, as that Job can run on a non-GPU node. For the issue reported, the fix should be in
Hi,
Could you help add runtimeclass to the NIMCache and all the other CRDs?
I got this error:
Traceback (most recent call last):
  File "/usr/local/bin/download-to-cache", line 5, in <module>
    from vllm_nvext.hub.pre_download import download_to_cache
  File "/usr/local/lib/python3.10/dist-packages/vllm_nvext/hub/pre_download.py", line 20, in <module>
    from vllm_nvext.hub.ngc_injector import get_optimal_manifest_config
  File "/usr/local/lib/python3.10/dist-packages/vllm_nvext/hub/ngc_injector.py", line 22, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs
  File "/usr/local/lib/python3.10/dist-packages/vllm/__init__.py", line 3, in <module>
    from vllm.engine.arg_utils import AsyncEngineArgs, EngineArgs
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/arg_utils.py", line 6, in <module>
    from vllm.config import (CacheConfig, DecodingConfig, DeviceConfig,
  File "/usr/local/lib/python3.10/dist-packages/vllm/config.py", line 12, in <module>
    from vllm.model_executor.layers.quantization import QUANTIZATION_METHODS
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/__init__.py", line 3, in <module>
    from vllm.model_executor.layers.quantization.aqlm import AQLMConfig
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/layers/quantization/aqlm.py", line 11, in <module>
    from vllm._C import ops
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
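The final `ImportError` means the dynamic loader could not find the NVIDIA driver library `libcuda.so.1` inside the container. A quick way to confirm this independently of vLLM is to try loading the library directly. The sketch below uses only the standard library. The helper name `can_load_library` is made up for illustration, and it assumes you can run Python inside the same container image.

```python
import ctypes


def can_load_library(name: str = "libcuda.so.1") -> bool:
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    if can_load_library():
        print("libcuda.so.1 is loadable; the driver libraries are visible in this container")
    else:
        print("libcuda.so.1 could not be loaded; the driver libraries are not visible here")
```

If this reports that the library cannot be loaded, the failure is at the container runtime level rather than in the Python packages, which is consistent with the runtime class not being applied to the Pod.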