training data to TTS voice model #48

ctrlaltdle · 2023-06-28T21:43:42Z


I am trying to make TTS models from audio clips that I have on my PC. I can train a model on the tab, and from there I know I can generate speech with TTS and then send it to RVC to get my voice generated. I was wondering if there is a way to make the data that I trained work directly in TTS, or if there is a tool to turn those .pth files into an .npz model? If anyone has anything, that would be much appreciated!

gitmylo · 2023-06-29T09:35:42Z

Maintainer

.npz isn't a model; it's a file containing info about the voice (specifically the previous fine (encodec), coarse (encodec), and semantic (wav2vec) tokens).
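To see why an .npz can't be swapped for a .pth, it can help to look inside one. This is a minimal sketch that writes a dummy file and lists its arrays. The key names (`semantic_prompt`, `coarse_prompt`, `fine_prompt`) are the ones Bark speaker prompts are commonly saved with. The shapes and random values are placeholders, and `make_dummy_prompt` and `describe_prompt` are hypothetical helpers for illustration, not part of any library.

```python
import os
import tempfile

import numpy as np


def make_dummy_prompt(path):
    """Write a stand-in voice file. Values and lengths are made up, not real voice data."""
    rng = np.random.default_rng(0)
    np.savez(
        path,
        semantic_prompt=rng.integers(0, 10_000, size=50),    # 1-D token sequence
        coarse_prompt=rng.integers(0, 1024, size=(2, 75)),   # assumed: 2 codebooks
        fine_prompt=rng.integers(0, 1024, size=(8, 75)),     # assumed: 8 codebooks
    )


def describe_prompt(path):
    """Return {array name: shape} for every array stored in the .npz file."""
    with np.load(path) as data:
        return {name: data[name].shape for name in data.files}


tmp_dir = tempfile.mkdtemp()
npz_path = os.path.join(tmp_dir, "dummy_voice.npz")
make_dummy_prompt(npz_path)

shapes = describe_prompt(npz_path)
for name, shape in shapes.items():
    print(name, shape)
```

The file holds only integer token arrays, no network weights. Pointing `describe_prompt` at an .npz produced by the clone section should list the same kind of arrays, assuming it follows this layout.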

You can create a custom .npz from a short audio clip in the clone section on the Bark text-to-speech (TTS) page. Make sure you have a good clip for cloning, ideally in .wav format (you can use ffmpeg to convert it).
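For the ffmpeg conversion, a small Python wrapper might look like the sketch below. It assumes `ffmpeg` is installed and on your PATH. The mono channel and 24000 Hz sample rate are assumptions on my part rather than requirements stated here, and `build_ffmpeg_cmd` and `convert_to_wav` are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src, dst, sample_rate=24000):
    """Build the ffmpeg argument list without running it."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),           # input clip in any format ffmpeg can read
        "-ac", "1",               # assumption: downmix to mono
        "-ar", str(sample_rate),  # assumption: resample to 24 kHz
        str(dst),
    ]


def convert_to_wav(src, sample_rate=24000):
    """Convert src to a .wav file next to it and return the new path."""
    src = Path(src)
    if src.suffix.lower() == ".wav":
        raise ValueError("input is already a .wav file")
    dst = src.with_suffix(".wav")
    subprocess.run(build_ffmpeg_cmd(src, dst, sample_rate), check=True)
    return dst


# Show the command that would be run for a hypothetical clip name.
print(" ".join(build_ffmpeg_cmd("my_voice.mp3", "my_voice.wav")))
```

Calling `convert_to_wav("my_voice.mp3")` would then write `my_voice.wav` in the same folder, ready to upload in the clone section.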

Soon, serp-ai will release code for fine-tuning Bark as well. I'll implement it as soon as I can, and see if I can implement some other things to make fine-tuning easier.
