
flshattF is not supported: requires device with capability > (8, 0) but your GPU has capability (6, 1) (too old) #79

Open
kamlesh0606 opened this issue Jul 10, 2024 · 1 comment
Labels
bug Something isn't working

Comments

kamlesh0606 commented Jul 10, 2024

Python Version

Python 3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0]

Pip Freeze

absl-py==2.1.0
annotated-types==0.7.0
attrs==23.2.0
docstring_parser==0.16
filelock==3.15.4
fire==0.6.0
fsspec==2024.6.1
grpcio==1.64.1
Jinja2==3.1.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
Markdown==3.6
MarkupSafe==2.1.5
mistral_common==1.2.1
mpmath==1.3.0
networkx==3.3
numpy==1.25.2
nvidia-cublas-cu12==12.1.3.1
nvidia-cuda-cupti-cu12==12.1.105
nvidia-cuda-nvrtc-cu12==12.1.105
nvidia-cuda-runtime-cu12==12.1.105
nvidia-cudnn-cu12==8.9.2.26
nvidia-cufft-cu12==11.0.2.54
nvidia-curand-cu12==10.3.2.106
nvidia-cusolver-cu12==11.4.5.107
nvidia-cusparse-cu12==12.1.0.106
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.5.82
nvidia-nvtx-cu12==12.1.105
protobuf==4.25.3
pydantic==2.6.1
pydantic_core==2.16.2
PyYAML==6.0.1
referencing==0.35.1
rpds-py==0.19.0
safetensors==0.4.3
sentencepiece==0.1.99
simple_parsing==0.1.5
six==1.16.0
sympy==1.13.0
tensorboard==2.17.0
tensorboard-data-server==0.7.2
termcolor==2.4.0
torch==2.2.0
tqdm==4.66.4
triton==2.2.0
typing_extensions==4.12.2
Werkzeug==3.0.3
xformers==0.0.24

Reproduction Steps

  1. torchrun --nproc-per-node 1 -m train example/7B.yaml

This produces the following error:

NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
     query     : shape=(1, 8192, 32, 128) (torch.bfloat16)
     key       : shape=(1, 8192, 32, 128) (torch.bfloat16)
     value     : shape=(1, 8192, 32, 128) (torch.bfloat16)
     attn_bias : <class 'xformers.ops.fmha.attn_bias.BlockDiagonalCausalMask'>
     p         : 0.0
flshattF is not supported because:
    requires device with capability > (8, 0) but your GPU has capability (6, 1) (too old)
    bf16 is only supported on A100+ GPUs
tritonflashattF is not supported because:
    requires device with capability > (8, 0) but your GPU has capability (6, 1) (too old)
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalCausalMask'>
    bf16 is only supported on A100+ GPUs
    operator wasn't built - see python -m xformers.info for more info
    triton is not available
    requires GPU with sm80 minimum compute capacity, e.g., A100/H100/L4
cutlassF is not supported because:
    bf16 is only supported on A100+ GPUs
smallkF is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    dtype=torch.bfloat16 (supported: {torch.float32})
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.BlockDiagonalCausalMask'>
    bf16 is only supported on A100+ GPUs
    unsupported embed per head: 128
[2024-07-10 15:30:11,478] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 206239) of binary: /opt/mistral-finetune-main/my_venv/bin/python3.10
Traceback (most recent call last):
File "/opt/mistral-finetune-main/my_venv/bin/torchrun", line 8, in
sys.exit(main())
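
For reference, the device capability can be checked directly with PyTorch; this is only a minimal sketch (not part of mistral-finetune), and the values in the comments are what the error and the xformers.info output below report for this machine:

```python
import torch

# Sketch: show why the xformers dispatch above fails on this GPU.
# The flash-attention backend needs compute capability >= (8, 0); bf16 needs Ampere or newer.
assert torch.cuda.is_available()
major, minor = torch.cuda.get_device_capability(0)
print(torch.cuda.get_device_name(0))       # NVIDIA GeForce GTX 1080
print((major, minor))                      # (6, 1)
print(torch.cuda.is_bf16_supported())      # False on this GPU with torch 2.2
if (major, minor) < (8, 0):
    print("xformers flash-attention / bf16 kernels will not dispatch on this device")
```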

Is there any other option in mistral-finetune for a CUDA device with compute capability 6.1?

Expected Behavior

An option to run mistral-finetune on a CUDA device with compute capability 6.1, or clear documentation that such GPUs are unsupported.
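
Judging from the error above, cutlassF is rejected only because of bf16, so the same xformers attention call should still dispatch on a compute-capability-6.1 card when the inputs are fp16. Below is a minimal sketch with illustrative shapes (shorter sequence than the trace); whether the rest of the mistral-finetune training stack can actually run in fp16 on this GPU is a separate question:

```python
import torch
from xformers.ops import memory_efficient_attention
from xformers.ops.fmha.attn_bias import BlockDiagonalCausalMask

# Illustrative only: same call pattern as the failing one, but in fp16, which
# should let xformers fall back to the cutlassF kernel on pre-Ampere GPUs
# (cutlassF only complained about bf16 in the trace above).
q = torch.randn(1, 1024, 32, 128, dtype=torch.float16, device="cuda")
k = torch.randn(1, 1024, 32, 128, dtype=torch.float16, device="cuda")
v = torch.randn(1, 1024, 32, 128, dtype=torch.float16, device="cuda")
attn_bias = BlockDiagonalCausalMask.from_seqlens([1024])

out = memory_efficient_attention(q, k, v, attn_bias=attn_bias)
print(out.shape)  # torch.Size([1, 1024, 32, 128])
```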

Additional Context

python -m xformers.info

xFormers 0.0.24
memory_efficient_attention.cutlassF: available
memory_efficient_attention.cutlassB: available
memory_efficient_attention.decoderF: available
memory_efficient_attention.flshattF: available
memory_efficient_attention.flshattB: available
memory_efficient_attention.smallkF: available
memory_efficient_attention.smallkB: available
memory_efficient_attention.tritonflashattF: unavailable
memory_efficient_attention.tritonflashattB: unavailable
memory_efficient_attention.triton_splitKF: unavailable
indexing.scaled_index_addF: unavailable
indexing.scaled_index_addB: unavailable
indexing.index_select: unavailable
sequence_parallel_fused.write_values: unavailable
sequence_parallel_fused.wait_values: unavailable
sequence_parallel_fused.cuda_memset_32b_async: unavailable
sp24.sparse24_sparsify_both_ways: available
sp24.sparse24_apply: available
sp24.sparse24_apply_dense_output: available
sp24._sparse24_gemm: available
sp24._cslt_sparse_mm: available
swiglu.dual_gemm_silu: available
swiglu.gemm_fused_operand_sum: available
swiglu.fused.p.cpp: available
is_triton_available: False
pytorch.version: 2.2.0+cu121
pytorch.cuda: available
gpu.compute_capability: 6.1
gpu.name: NVIDIA GeForce GTX 1080
dcgm_profiler: unavailable
build.info: available
build.cuda_version: 1201
build.python_version: 3.10.13
build.torch_version: 2.2.0+cu121
build.env.TORCH_CUDA_ARCH_LIST: 5.0+PTX 6.0 6.1 7.0 7.5 8.0+PTX 9.0
build.env.XFORMERS_BUILD_TYPE: Release
build.env.XFORMERS_ENABLE_DEBUG_ASSERTIONS: None
build.env.NVCC_FLAGS: None
build.env.XFORMERS_PACKAGE_FROM: wheel-v0.0.24
build.nvcc_version: 12.1.66
source.privacy: open source
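
Note that flshattF is listed as "available" above: the operator is built into the wheel, but the capability check happens when memory_efficient_attention dispatches. The failure should reproduce outside of mistral-finetune with a minimal bf16 call; here is a sketch using the same shapes and dtype as the trace:

```python
import torch
from xformers.ops import memory_efficient_attention
from xformers.ops.fmha.attn_bias import BlockDiagonalCausalMask

# Sketch: on this capability-(6, 1) GPU, every bf16 backend should be rejected
# at dispatch time, reproducing the NotImplementedError from the training run.
q = torch.randn(1, 8192, 32, 128, dtype=torch.bfloat16, device="cuda")
k = torch.randn(1, 8192, 32, 128, dtype=torch.bfloat16, device="cuda")
v = torch.randn(1, 8192, 32, 128, dtype=torch.bfloat16, device="cuda")
attn_bias = BlockDiagonalCausalMask.from_seqlens([8192])

try:
    memory_efficient_attention(q, k, v, attn_bias=attn_bias)
except NotImplementedError as e:
    print(e)  # "No operator found for memory_efficient_attention_forward ..."
```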

Suggested Solutions

No response

kamlesh0606 added the bug (Something isn't working) label on Jul 10, 2024
@bpcanedo

Similar issue using Colab's T4.

