
wav2vec2-object of type 'NoneType' has no len() #5558

Open
Linx3f opened this issue Oct 20, 2024 · 1 comment
Linx3f commented Oct 20, 2024

🐛 Bug (I have read all the existing issues about this error, but my situation is different.)

I followed the official guide (https://github.com/facebookresearch/fairseq/tree/main/examples/wav2vec) to fine-tune wav2vec_small_10m.pt on another dataset. However, I get a TypeError: object of type 'NoneType' has no len()

Traceback (most recent call last):
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq_cli\hydra_train.py", line 27, in hydra_main
    _hydra_main(cfg)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq_cli\hydra_train.py", line 56, in _hydra_main
    distributed_utils.call_main(cfg, pre_main, **kwargs)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\distributed\utils.py", line 369, in call_main
    main(cfg, **kwargs)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq_cli\train.py", line 96, in main
    model = task.build_model(cfg.model)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\audio_finetuning.py", line 193, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\audio_pretraining.py", line 197, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\fairseq_task.py", line 338, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\__init__.py", line 106, in build_model
    return model.build_model(cfg, task)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\wav2vec\wav2vec2_asr.py", line 208, in build_model
    w2v_encoder = Wav2VecEncoder(cfg, len(task.target_dictionary))
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\wav2vec\wav2vec2_asr.py", line 407, in __init__
    model = task.build_model(w2v_args.model, from_checkpoint=True)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\audio_pretraining.py", line 197, in build_model
    model = super().build_model(model_cfg, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\tasks\fairseq_task.py", line 338, in build_model
    model = models.build_model(cfg, self, from_checkpoint)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\__init__.py", line 106, in build_model
    return model.build_model(cfg, task)
  File "D:\Apps\Anaconda3\envs\torch182\lib\site-packages\fairseq\models\wav2vec\wav2vec2_asr.py", line 208, in build_model
    w2v_encoder = Wav2VecEncoder(cfg, len(task.target_dictionary))
TypeError: object of type 'NoneType' has no len()
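The failure mode in the last frame can be reproduced in isolation; a minimal sketch, assuming (as the traceback suggests) that the nested `task.build_model` call ends up with a task whose `target_dictionary` is `None` because no label dictionary was loaded:

```python
# Minimal reproduction of the failure mode (FakeTask is a hypothetical
# stand-in for a fairseq task whose label dictionary was never loaded):
class FakeTask:
    target_dictionary = None  # dictionary never loaded

task = FakeTask()
try:
    vocab_size = len(task.target_dictionary)  # mirrors wav2vec2_asr.py:208
except TypeError as err:
    print(err)  # object of type 'NoneType' has no len()
```

So the message itself only says the dictionary was missing at that point, not why it was missing.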

To Reproduce

  1. Generate dict.ltr.txt for the dataset I used:
| 345995
E 149656
T 126396
O 126124
A 105266
I 98021
N 87693
H 78905
S 75716
R 65274
L 56135
U 53373
Y 49207
D 47160
W 37983
M 37210
G 34109
C 28421
F 21527
B 21383
K 20813
P 20423
' 19381
V 12276
J 4387
X 1863
Z 1067
Q 597
  2. Modify the file base_100h.yaml. The modified part is as follows:
  ...
task:
  _name: audio_finetuning
  data: C:\Users\18310\Desktop\py\feature-extraction2\trans  # only dict.ltr.txt is there
  normalize: false
  labels: ltr
model:
  _name: wav2vec_ctc
  w2v_path: C:\Users\18310\Desktop\py\feature-extraction2\model\wav2vec_small_10m.pt
  apply_mask: true
  ...
  3. Run the command fairseq-hydra-train distributed_training.distributed_world_size=1 --config-dir C:\Users\18310\Desktop\py\feature-extraction2\config\finetuning --config-name base_100h.
  4. See the error above.
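For reference, a dict.ltr.txt like the one in step 1 can be produced by counting space-separated tokens in the .ltr label file; a minimal sketch (the function name and demo lines are hypothetical; the `TOKEN COUNT` lines sorted by descending frequency match the listing above):

```python
from collections import Counter

def build_letter_dict(ltr_lines):
    """Build fairseq-style dict.ltr.txt content: one 'TOKEN COUNT' line
    per token, sorted by descending frequency."""
    counts = Counter()
    for line in ltr_lines:
        counts.update(line.split())
    return [f"{tok} {n}" for tok, n in counts.most_common()]

# tiny demo; '|' is the word-boundary token in fairseq .ltr files
print(build_letter_dict(["H I |", "H E Y |"]))
```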

Expected behavior

Fine-tune the wav2vec model successfully.

Environment

  • fairseq Version: 0.12.2
  • PyTorch Version: 1.8.2
  • OS: Windows
  • Python version: 3.8
  • CUDA/cuDNN version: 11.1

Additional context

I also have another question. As written in the README.md, fine-tuning a model requires parallel audio and label files, as well as a vocabulary file in fairseq format, so why does the given command line only point to the vocabulary file?
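On that last question: as far as I can tell, only the data directory is passed because fairseq derives the manifest, label, and dictionary file names from the split name and the `labels` suffix inside that directory. A sketch of that derivation (the variable names are illustrative, not fairseq's actual code):

```python
import os

data_dir = "trans"          # the directory given as task.data
split, labels = "train", "ltr"

manifest = os.path.join(data_dir, f"{split}.tsv")         # audio manifest
label_file = os.path.join(data_dir, f"{split}.{labels}")  # parallel labels
dict_file = os.path.join(data_dir, f"dict.{labels}.txt")  # vocabulary
print(manifest, label_file, dict_file)
```

If that is right, the train/valid .tsv and .ltr files are expected to sit next to dict.ltr.txt in the same directory rather than being named on the command line.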


XR1988 commented Dec 9, 2024

When I run wav2vec-u, this problem appears if I use an advanced wav2vec2 model; switching to an ordinary model makes it go away.
