Training weirdness #225

Open
toninog opened this issue Dec 30, 2024 · 0 comments


toninog commented Dec 30, 2024

Hi

I have prepared a dataset, and as far as I can see it is valid for training. However, when I run

torchrun --standalone --master_port 10902 train.py --c data/example/config.json --model speaker

I get this output:
cannot reshape array of size 800000 into shape (200,1000,3)
0%| | 0/815 [00:10<?, ?it/s]
cannot reshape array of size 800000 into shape (200,1000,3)
0%| | 0/815 [00:12<?, ?it/s]
cannot reshape array of size 800000 into shape (200,1000,3)
0%| | 0/815 [00:12<?, ?it/s]

The train.log shows that training is "progressing":

2024-12-30 14:53:25,262 speaker INFO Train Epoch: 9999 [0%]
2024-12-30 14:53:25,263 speaker INFO [1.458371639251709, 3.5576884746551514, 15.082209587097168, 17.62735939025879, 2.3132169246673584, 1.7864117622375488, 0, 8.596621401483359e-05]
2024-12-30 14:53:38,876 speaker INFO Train Epoch: 10000 [0%]
2024-12-30 14:53:38,878 speaker INFO [1.440284252166748, 3.67047119140625, 14.131525993347168, 18.668804168701172, 2.2912511825561523, 2.109130620956421, 0, 8.595546823808173e-05]

But the torchrun output never gets past this:
cannot reshape array of size 800000 into shape (200,1000,3)
0%| | 0/815 [00:12<?, ?it/s]
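
A note on the numbers in that message: a shape of (200,1000,3) holds 200 * 1000 * 3 = 600000 elements, while the array being reshaped has 800000, which is exactly 200 * 1000 * 4. Since 800000 is not divisible by 3, no shape ending in 3 can hold it. The sketch below is not taken from the repository's code; it only reproduces the same kind of NumPy error with a stand-in array and checks the arithmetic. The guess that the last axis has 4 entries instead of 3 is an assumption, not something the log confirms.

```python
import numpy as np

# Shape and size copied from the error message in the torchrun output.
target_shape = (200, 1000, 3)
actual_size = 800_000

# Number of elements the target shape can hold.
expected_size = int(np.prod(target_shape))

# Stand-in array of zeros with the reported size (not real training data).
flat = np.zeros(actual_size, dtype=np.float32)

try:
    flat.reshape(target_shape)
    message = None
except ValueError as err:
    # NumPy refuses because the element counts differ.
    message = str(err)

# If only the last axis were different, how long would it have to be?
leading = target_shape[0] * target_shape[1]
implied_last_axis, remainder = divmod(actual_size, leading)

print("reshape error:", message)
print("expected elements:", expected_size)
print("implied last axis:", implied_last_axis, "remainder:", remainder)
```

If the data files really do carry 4 values per entry where the loader expects 3, the mismatch would be in the prepared data or in the config rather than in the training loop itself, but that would need to be checked against the actual files.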

Any advice or help is welcome.

Thanks
