Trainer support for audio file, prompt pairs #20

Open
deepglugs opened this issue Jul 25, 2023 · 2 comments

Comments

@deepglugs

Most of my data is stored either as paired file.wav and file.txt files, or as JSON files with "path/to/file.wav": "the transcription of the audio" mappings. It looks like Trainer only supports audio files. Is there a way to add prompt support?
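For reference, the two layouts described above can be read into (audio path, prompt) pairs with a few lines of standard-library Python. This is only a sketch of the data side: the function names `pairs_from_sidecar_txt` and `pairs_from_json` are hypothetical, and nothing here is part of Trainer or assumes how Trainer would consume the pairs.

```python
import json
from pathlib import Path


def pairs_from_sidecar_txt(root):
    """Yield (wav_path, prompt) for every file.wav that has a file.txt next to it.

    Hypothetical helper for illustration; not part of the project.
    """
    for wav in sorted(Path(root).rglob("*.wav")):
        txt = wav.with_suffix(".txt")
        if txt.is_file():  # skip audio files that have no transcription
            yield str(wav), txt.read_text(encoding="utf-8").strip()


def pairs_from_json(manifest):
    """Yield (wav_path, prompt) from a JSON object of {"path/to/file.wav": "text"}.

    Hypothetical helper for illustration; not part of the project.
    """
    with open(manifest, encoding="utf-8") as fh:
        mapping = json.load(fh)
    for wav_path, prompt in mapping.items():
        yield wav_path, prompt
```

Both generators produce the same kind of tuple, so a dataset wrapper could accept either source without caring which layout the data came from.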

@deepglugs
Author

I'm trying to implement this feature myself, but I'm confused about the prompt input. process_prompt expects prompt.ndim==2 if it's a "raw prompt". In my mind a prompt is a single dimension of text: ["this is the prompt"]. What is the other dimension for?

@lexkoro

lexkoro commented Jul 28, 2023

The two dimensions would be (batch, embedding), so ndim == 2. By the way, this is still WIP; I don't think it is trainable.
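To make the (batch, embedding) answer concrete, here is a toy sketch in plain Python. It assumes each prompt string is first turned into a fixed-length numeric vector; `encode_prompt` is a made-up stand-in for whatever encoder the project uses, not its process_prompt, and `shape` is a small hypothetical helper that mimics a tensor's shape for nested lists.

```python
def encode_prompt(text, length=8):
    """Toy stand-in for an encoder: one fixed-length vector of ints per prompt."""
    codes = [ord(ch) % 256 for ch in text[:length]]
    return codes + [0] * (length - len(codes))  # pad short prompts with zeros


def shape(nested):
    """Return the shape of a rectangular nested list, e.g. (batch, embedding)."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0] if nested else None
    return tuple(dims)


prompts = ["this is the prompt", "another prompt"]
batch = [encode_prompt(p) for p in prompts]

print(shape(prompts))  # (2,)   one axis: the list of strings
print(shape(batch))    # (2, 8) two axes: (batch, embedding)
```

A list of strings has only the batch axis; the second axis appears once each string has been converted to numbers, which would explain why a "raw prompt" is checked for two dimensions.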
