Describe the bug
root@ac6edc15b00f:/workspace/quantization# python test_gptq.py
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 53%|██████████████████████████████████████████████▉ | 16/30 [09:03<07:55, 34.00s/it]
Traceback (most recent call last):
File "/workspace//quantization/test_gptq.py", line 27, in <module>
model = AutoGPTQForCausalLM.from_pretrained(pretrained_model_dir, quantize_config)
File "/opt/conda/lib/python3.10/site-packages/auto_gptq/modeling/auto.py", line 76, in from_pretrained
return GPTQ_CAUSAL_LM_MODEL_MAP[model_type].from_pretrained(
File "/opt/conda/lib/python3.10/site-packages/auto_gptq/modeling/_base.py", line 787, in from_pretrained
model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path, **merged_kwargs)
File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
return model_class.from_pretrained(
File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
) = cls._load_pretrained_model(
File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4084, in _load_pretrained_model
state_dict = load_state_dict(shard_file, is_quantized=is_quantized)
File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 507, in load_state_dict
with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
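A `MetadataIncompleteBuffer` error from the safetensors loader typically means the shard file itself is truncated or corrupted (for example, an interrupted download), independent of GPU memory. A minimal diagnostic sketch, using only the published safetensors on-disk layout (an 8-byte little-endian header length, then that many bytes of JSON metadata, then the raw tensor data), can identify the bad shard without loading any tensors; the `shard_dir` path below is a placeholder for the local model directory:

```python
import glob
import json
import os
import struct


def check_shard(path):
    """Cheap integrity check for a .safetensors file without loading tensors.

    Layout: 8-byte little-endian header length N, then N bytes of JSON
    metadata, then raw tensor data. A truncated download usually fails
    one of these checks (the Rust loader reports it as
    MetadataIncompleteBuffer)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False, "file shorter than the 8-byte length prefix"
        (header_len,) = struct.unpack("<Q", prefix)
        if size < 8 + header_len:
            return False, f"header claims {header_len} bytes, only {size - 8} present"
        header = json.loads(f.read(header_len))
    # Each tensor entry records [begin, end) byte offsets into the data section.
    data_len = size - 8 - header_len
    for name, info in header.items():
        if name == "__metadata__":
            continue
        begin, end = info["data_offsets"]
        if end > data_len:
            return False, f"tensor {name!r} ends at byte {end}, data section has {data_len}"
    return True, "ok"


if __name__ == "__main__":
    # Placeholder path: point this at the downloaded checkpoint directory.
    shard_dir = "/workspace/quantization/model"
    for path in sorted(glob.glob(os.path.join(shard_dir, "*.safetensors"))):
        ok, msg = check_shard(path)
        print(("OK " if ok else "BAD"), path, msg)
```

Any shard reported as `BAD` can then be re-downloaded on its own rather than re-fetching the whole checkpoint.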
Hardware details
A800
root@ac6edc15b00f:/workspace/code/qwen/quantization2# nvidia-smi
Tue Apr 30 06:00:36 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.54.03 Driver Version: 535.54.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A800 80GB PCIe Off | 00000000:17:00.0 Off | 0 |
| N/A 38C P0 65W / 300W | 64181MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA A800 80GB PCIe Off | 00000000:31:00.0 Off | 0 |
| N/A 36C P0 65W / 300W | 62285MiB / 81920MiB | 0% Default |
| | | Disabled |
Software version
Linux ac6edc15b00f 5.4.0-177-generic #197-Ubuntu SMP Thu Mar 28 22:45:47 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Python 3.10.13
root@ac6edc15b00f:/workspace/quantization2# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
torch Version: 2.2.1
accelerate Version: 0.29.3
transformers Version: 4.40.1
To Reproduce
Expected behavior
Screenshots
Additional context
I had already downloaded the llama3 instruct model and was trying to quantize it. Is this because there isn't enough GPU memory?