rupin/WrittenAudio

WrittenAudio

WrittenAudio uses the Google Cloud Text-to-Speech engine and a configuration file to create audio files for videos.

Features

  • Uses a .xlsx file with time markers and text to generate audio.
  • Combines the individual audio files in sequence, adding silence for the gaps between them.
  • Adds gradual music transitions in the silent gaps, creating ambient background music in the audio file.
  • Uses Google Cloud Text-to-Speech to generate the audio, so it is free of rustling and background noise.

Planned Features

  • Translate English text to other languages and generate audio files in those languages.
  • Integrate Amazon Polly (more feature-rich and better sounding).
  • Add a web front end.

Usage

  
usage: main.py [-h] -x X [-r R] [-o O] [-m M] [-v V]

optional arguments:
  -h, --help  show this help message and exit
  -x X        The Input .xlsx file
  -r R        The Row inside the .xlsx which should be individually processed
  -o O        The name of the Output File. Example output.wav
  -m M        Combine Audio with Overlay Music File. Example music_pop.wav
  -v V        Control the volume of the Music File. Negative values will
              reduce volume, positive values will increase it.
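
The help text above is consistent with a standard argparse parser. The following is a minimal sketch reconstructed from that usage text alone; the function name build_parser and the argument types (int for -r, float for -v) are assumptions, not the project's actual source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented flags (a reconstruction, not the original code)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-x", required=True, help="The input .xlsx file")
    # Assumption: the row is identified by an integer.
    parser.add_argument("-r", type=int, help="Row in the .xlsx to process individually")
    parser.add_argument("-o", help="Name of the output file, e.g. output.wav")
    parser.add_argument("-m", help="Music file to overlay, e.g. music_pop.wav")
    # Assumption: the volume change is a number in dB and may be fractional.
    parser.add_argument("-v", type=float, help="Volume change for the music file, in dB")
    return parser


if __name__ == "__main__":
    # A negative value such as -10 is parsed as the argument of -v, not as a new flag.
    args = build_parser().parse_args(
        ["-x", "Audio Sequence.xlsx", "-m", "background_music.wav", "-v", "-10"]
    )
    print(args)
```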
              

Before Getting Started

Install Python

Install Python 3; the commands below use python3 and pip3.

Download Dependencies

Download the code, then run

pip3 install -r requirements.txt

This command installs all the dependencies required to run the program.

Create your .xlsx file

  • The application takes its input from a .xlsx file.
  • The code assumes the first row is a header with two columns, Time and Text.
  • The first column should hold the time markers, in seconds.
  • The second column should hold the text to be spoken starting at that time marker.

Look at the included Audio Sequence.xlsx file to understand better.
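
The layout described above can be checked before any audio is generated. The sketch below is illustrative only: it works on rows already read from the sheet as plain Python sequences, and the function name parse_rows and the strict header check are assumptions rather than the application's own behavior.

```python
from typing import List, Sequence, Tuple


def parse_rows(rows: Sequence[Sequence[object]]) -> List[Tuple[float, str]]:
    """Validate a header row followed by (time, text) rows; return (seconds, text) pairs."""
    if not rows:
        raise ValueError("sheet is empty")
    header = [str(cell).strip().lower() for cell in rows[0][:2]]
    if header != ["time", "text"]:
        raise ValueError(f"expected header Time, Text; got {list(rows[0][:2])!r}")
    parsed = []
    # Spreadsheet rows are 1-based and row 1 is the header, so data starts at row 2.
    for number, row in enumerate(rows[1:], start=2):
        seconds = float(row[0])
        text = str(row[1]).strip()
        if seconds < 0:
            raise ValueError(f"row {number}: time marker must not be negative")
        if not text:
            raise ValueError(f"row {number}: text is empty")
        parsed.append((seconds, text))
    return parsed


if __name__ == "__main__":
    sheet = [("Time", "Text"), (0, "Welcome to the video."), (12.5, "Here is the first step.")]
    print(parse_rows(sheet))
```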

Create Google API credentials and enable the Cloud Text-to-Speech API. Then assign the credentials for the current terminal session:

export GOOGLE_APPLICATION_CREDENTIALS="credentials/credentials.json"
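
Google's client libraries read the GOOGLE_APPLICATION_CREDENTIALS variable to locate the key file, so a quick check before running the application can save a failed run. This is a small sketch; the helper name credentials_problem is hypothetical and not part of the project.

```python
import os
from typing import Mapping, Optional


def credentials_problem(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a description of what is wrong with the credentials setup, or None if it looks fine."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"credentials file not found: {path}"
    return None


if __name__ == "__main__":
    print(credentials_problem() or "credentials look OK")
```

A relative path such as credentials/credentials.json is resolved against the directory the command is run from.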

How to Use the Application

Create Individual Audio files from .xlsx file

python3 main.py -x 'Audio Sequence.xlsx'
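
For orientation, generating one utterance with Cloud Text-to-Speech might look roughly like the sketch below. It assumes the google-cloud-texttospeech client library (2.x API) and valid credentials; the function name synthesize_to_wav and the en-US default are illustrative, and the project's own code may differ.

```python
def synthesize_to_wav(text: str, out_path: str, language_code: str = "en-US") -> None:
    """Synthesize `text` with Cloud Text-to-Speech and write the result to `out_path`."""
    # Imported here so the module loads even when the library is not installed.
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(language_code=language_code),
        # LINEAR16 output is uncompressed 16-bit PCM and includes a WAV header.
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.LINEAR16
        ),
    )
    with open(out_path, "wb") as handle:
        handle.write(response.audio_content)
```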

Create Individual Audio file for a specific row.

After creating all the files and reviewing them, you may want to change the text or time for a few rows. Edit those rows in your .xlsx file (Time, Text, or both), then regenerate each changed row individually with -r:

python3 main.py -x 'Audio Sequence.xlsx' -r 3

Combine Files with silent periods between audio utterances

python3 main.py -x 'Audio Sequence.xlsx' -o 'output.wav'

This uses the Audio Sequence.xlsx file to generate the audio files (skipping files that have already been generated) and combines them into a single output.wav file.
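
Conceptually, combining means placing each clip at its time marker and filling the gaps with silence. The sketch below shows the idea with the standard library only; it assumes 16-bit mono PCM at 24000 Hz, and the names place_clips and write_wav are hypothetical, not the application's own functions.

```python
import io
import wave
from typing import Iterable, Tuple

SAMPLE_RATE = 24000  # assumed sample rate in Hz
SAMPLE_WIDTH = 2     # bytes per sample (16-bit mono PCM)


def place_clips(clips: Iterable[Tuple[float, bytes]]) -> bytes:
    """Lay (start_seconds, pcm_bytes) clips on one timeline, padding gaps with silence."""
    timeline = bytearray()
    for start_seconds, pcm in sorted(clips, key=lambda clip: clip[0]):
        offset = int(round(start_seconds * SAMPLE_RATE)) * SAMPLE_WIDTH
        if offset < len(timeline):
            raise ValueError(f"clip at {start_seconds}s overlaps the previous clip")
        # Zero bytes are silence in signed 16-bit PCM.
        timeline.extend(b"\x00" * (offset - len(timeline)))
        timeline.extend(pcm)
    return bytes(timeline)


def write_wav(path_or_file, pcm: bytes) -> None:
    """Write raw PCM to a WAV file (a path or a binary file object)."""
    with wave.open(path_or_file, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(SAMPLE_WIDTH)
        out.setframerate(SAMPLE_RATE)
        out.writeframes(pcm)


if __name__ == "__main__":
    one_second = b"\x01\x00" * SAMPLE_RATE
    combined = place_clips([(0.0, one_second), (2.0, one_second)])
    buffer = io.BytesIO()
    write_wav(buffer, combined)
    print(len(combined) // (SAMPLE_RATE * SAMPLE_WIDTH), "seconds of audio")
```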

Combine Files with silent periods replaced with music

python3 main.py -x 'Audio Sequence.xlsx' -o 'audiooutput.wav' -m background_music.wav

  • Uses the Audio Sequence.xlsx file to generate the audio files (skipping files that have already been generated) and combines them into a single audiooutput.wav file.
  • Additionally creates a second output file, audiooutput_background_music.wav, in which the silent periods are filled with sections of the music file background_music.wav.
  • The background music also plays at low volume (ambient music) during the parts that contain voice.
  • Transitions use a 1-second fade-in and fade-out.
  • If the music file is shorter than the audio generated from the sequence, the music is repeated and then trimmed to the same length as the audio.
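
The looping and fading behavior in the list above can be illustrated on plain lists of samples. This is a sketch under assumptions: the fade is taken to be linear (the curve the application actually uses is not stated), and loop_to_length and fade_in_out are hypothetical names.

```python
from typing import List


def loop_to_length(music: List[float], target_len: int) -> List[float]:
    """Repeat `music` until it covers `target_len` samples, then trim to exactly that length."""
    if not music:
        raise ValueError("music must contain at least one sample")
    repeats = -(-target_len // len(music))  # ceiling division
    return (music * repeats)[:target_len]


def fade_in_out(samples: List[float], fade_len: int) -> List[float]:
    """Apply a linear fade-in and fade-out of `fade_len` samples each (assumed fade shape)."""
    out = list(samples)
    fade_len = min(fade_len, len(out) // 2)
    for i in range(fade_len):
        gain = i / fade_len
        out[i] *= gain        # ramp up from silence at the start
        out[-1 - i] *= gain   # ramp down to silence at the end
    return out


if __name__ == "__main__":
    looped = loop_to_length([1.0, 2.0, 3.0], 7)
    print(looped)
    print(fade_in_out([1.0] * 6, 2))
```

For a 1-second fade, fade_len would equal the sample rate of the audio.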

Optional (Volume Control)

python3 main.py -x 'Audio Sequence.xlsx' -o 'audiooutput.wav' -m background_music.wav -v -10

  • When the -m argument is present, adding a -v argument controls the amplitude of the music.
  • A negative number reduces the volume and a positive number increases it. The value is in dB.
  • -v 4 increases the volume by 4 dB.
  • -v -3 reduces the volume by 3 dB.
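
To get a feel for what a dB value means, it helps to convert it to a linear amplitude multiplier using gain = 10^(dB/20). The sketch below assumes -v is applied as an amplitude gain; db_to_gain is a hypothetical helper, not part of the application.

```python
def db_to_gain(db: float) -> float:
    """Convert a level change in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)


if __name__ == "__main__":
    for value in (4, -3, -10):
        print(f"-v {value}: amplitude x {db_to_gain(value):.3f}")
```

Under that assumption, -v 4 scales the amplitude by about 1.58 and -v -3 by about 0.71.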

Possible Errors

  1. The audio generated for one row can overlap with the audio for the next row. For example, suppose one row contains the text "Paris is the Capital of France, and is the most populous city in France" at 30 seconds, and the generated audio for it is 10 seconds long. That line plays from 30 seconds to 40 seconds. If the next row contains "The language most Parisians speak is French" and starts at 38 seconds, the two clips overlap by 2 seconds, and the application reports an error and asks whether to stop the conversion or continue.

  2. Files passed as options are not present at the specified locations.

  3. The Google Cloud Text-to-Speech API is not enabled, or the credentials have not been set for the current session.

  4. The input is an .mp3 file, but the application works with .wav files only. Use the command ffmpeg -i file.mp3 file.wav to convert an .mp3 file to a .wav file.
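
The overlap described in item 1 is simple to detect once each clip's duration is known. The following sketch uses the numbers from that example; find_overlaps is a hypothetical helper and does not reflect how the application itself performs the check.

```python
from typing import List, Sequence, Tuple


def find_overlaps(clips: Sequence[Tuple[float, float]]) -> List[Tuple[int, float]]:
    """Given (start_seconds, duration_seconds) clips sorted by start time,
    return (index_of_later_clip, overlap_seconds) for every clip that starts
    before the previous clip has finished."""
    overlaps = []
    for index in range(1, len(clips)):
        prev_start, prev_duration = clips[index - 1]
        start, _ = clips[index]
        overlap = prev_start + prev_duration - start
        if overlap > 0:
            overlaps.append((index, overlap))
    return overlaps


if __name__ == "__main__":
    # First clip: starts at 30 s and lasts 10 s. Second clip: starts at 38 s.
    print(find_overlaps([(30.0, 10.0), (38.0, 4.0)]))
```

Moving the second time marker to 40 seconds or later, or shortening the first text, removes the overlap.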

License

Copyright 2019 Rupin Chheda

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
