Kuber144/Melotts-and-Gemini

Melotts and Gemini

This is a demo showing how MeloTTS and Gemini can be used together to create a chatbot that talks to you.

MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai. You can read more about it here.

In this demo, I used MeloTTS to convert the response received from Google's Gemini API into an audio file, and then used playsound to play that file so that it seems as though the LLM is talking to us.

MeloTTS was installed and run locally on my machine. (See installation instructions here) Google Gemini is called using the code in the gemini_main.py file. It waits for the user to enter text and creates a chat with the LLM. As soon as a response is received, it calls a function in the melo_api.py file to convert the text to speech.
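The flow described above (read user text, send it to a Gemini chat, hand the reply to a speech function) can be sketched roughly as follows. This is a minimal illustration, not the actual contents of gemini_main.py: the function names `make_gemini_sender` and `chat_loop`, the `"gemini-pro"` model name, and the quit/exit handling are all assumptions made for this example. It assumes the `google-generativeai` package is installed.

```python
from typing import Callable, Iterable, List


def make_gemini_sender(api_key: str, model_name: str = "gemini-pro") -> Callable[[str], str]:
    """Return a function that sends one message to a Gemini chat and returns the reply text.

    The model name is an assumption; use whichever Gemini model your key can access.
    """
    # Imported lazily so the loop below can be tried without the SDK installed.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    chat = genai.GenerativeModel(model_name).start_chat(history=[])

    def send(message: str) -> str:
        return chat.send_message(message).text

    return send


def chat_loop(
    prompts: Iterable[str],
    send: Callable[[str], str],
    speak: Callable[[str], None],
) -> List[str]:
    """Send each prompt to the LLM and speak the reply as soon as it arrives."""
    replies = []
    for prompt in prompts:
        prompt = prompt.strip()
        if not prompt:
            continue  # ignore empty input
        if prompt.lower() in {"quit", "exit"}:
            break  # hypothetical way to end the chat
        reply = send(prompt)
        speak(reply)  # e.g. a text-to-speech function such as the one in melo_api.py
        replies.append(reply)
    return replies
```

For interactive use, `prompts` could be something like `iter(lambda: input("You: "), "")`, with `send` built by `make_gemini_sender` and `speak` set to whatever text-to-speech function is available.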

MeloTTS is a fast TTS model with several configuration options, which the user can choose by modifying the code as needed. (As this is only a demonstration, more information about the options can be found here)
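As an illustration of the configuration options mentioned above, the sketch below exposes the language, speaker, speed and device as parameters. It follows the usage shown in the MeloTTS README (`TTS`, `hps.data.spk2id`, `tts_to_file`), but the function name `speak_with_melo`, its defaults, and the `tts_cls` and `play` parameters are assumptions for this example, not the actual melo_api.py code.

```python
def speak_with_melo(
    text,
    language="EN",
    speaker="EN-US",
    speed=1.0,
    device="auto",
    output_path="reply.wav",
    tts_cls=None,
    play=None,
):
    """Synthesize `text` to an audio file with MeloTTS, then play it.

    `tts_cls` and `play` are hypothetical hooks that default to the real
    MeloTTS class and playsound; they exist so the function can be tried
    without either package installed.
    """
    if tts_cls is None:
        from melo.api import TTS as tts_cls  # requires a local MeloTTS install
    if play is None:
        from playsound import playsound as play

    model = tts_cls(language=language, device=device)
    speaker_ids = model.hps.data.spk2id  # maps speaker names to ids
    model.tts_to_file(text, speaker_ids[speaker], output_path, speed=speed)
    play(output_path)
    return output_path
```

This version reloads the model on every call to keep the example short. In a chat loop it would likely be better to create the model once and reuse it.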

A video showing how the model is running

2024-03-07.19-31-36.mp4

This demo can be taken further by using the Node.js implementation of Gemini to improve its functionality.