Trainer support for audio file, prompt pairs #20

deepglugs · 2023-07-25T16:49:12Z

Most of my data is split into file.wav and file.txt pairs, or stored in JSON files with "path/to/file.wav": "the transcription of the audio" mappings. It looks like Trainer only supports audio files. Is there a way to get prompt support?
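For illustration, here is a minimal sketch of how the two layouts described above could be read into (audio path, transcription) pairs. It uses only the Python standard library. The function names `pairs_from_sidecar_files` and `pairs_from_json` are hypothetical and are not part of this repository's Trainer.

```python
import json
from pathlib import Path


def pairs_from_sidecar_files(root):
    """Pair each file.wav under root with the text of the file.txt beside it.

    Hypothetical helper: audio files without a matching .txt are skipped.
    """
    pairs = []
    for wav in sorted(Path(root).rglob("*.wav")):
        txt = wav.with_suffix(".txt")
        if txt.is_file():
            pairs.append((str(wav), txt.read_text(encoding="utf-8").strip()))
    return pairs


def pairs_from_json(manifest_path):
    """Read a JSON object of {"path/to/file.wav": "transcription"} entries."""
    with open(manifest_path, encoding="utf-8") as f:
        mapping = json.load(f)
    return [(path, text) for path, text in mapping.items()]
```

Both helpers return the same list-of-tuples shape, so whatever consumes the pairs would not need to know which layout the data came from.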

deepglugs · 2023-07-28T16:16:15Z

I'm giving this feature a try, but I'm confused about the prompt input. process_prompt expects prompt.ndim==2 if it's a "raw prompt". In my mind a prompt is a single dimension of text: ["this is the prompt"]. What is the other dimension for?

lexkoro · 2023-07-28T19:11:53Z

The two dimensions would be (batch, embedding), so ndim == 2. By the way, this is still WIP; I don't think it is trainable yet.
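As a rough illustration of the shapes involved (not the repository's actual process_prompt logic), a list of prompt strings has one dimension, while an array with one embedding vector per prompt has two. The embedding size `EMBED_DIM = 8` and the zero-filled array are placeholders chosen only for this sketch.

```python
import numpy as np

# One text prompt per batch item: a single dimension of strings.
raw_text = ["this is the prompt"]

# Hypothetical embedding size, chosen only for this example.
EMBED_DIM = 8

# One embedding vector per prompt gives shape (batch, embedding).
embedded = np.zeros((len(raw_text), EMBED_DIM), dtype=np.float32)

print(np.asarray(raw_text).ndim)  # 1
print(embedded.shape)             # (1, 8)
print(embedded.ndim)              # 2
```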
