
AI-Powered Video Generation Script


🌟 Overview

The AI-Powered Video Generation Script is an automated tool that creates engaging videos from a given topic using advanced AI technologies. It integrates text generation, text-to-speech (TTS), background image/video generation, captioning, and video rendering into a seamless workflow. This script empowers content creators to produce high-quality videos without manual intervention.


Showcase

Simple_practices_to_cultivate_daily_grat.mp4
Delicious_low_calorie_quick_meals_to_get.mp4

Acknowledgements

Kudos to SamurAIGPT for developing the original version of this project and laying a great foundation.

🎬 Key Features

  • AI-Driven Script Generation: Generate video scripts based on any topic using OpenAI's GPT models.
  • Text-to-Speech Conversion: Convert scripts into speech using OpenAI's TTS engine or Microsoft Edge's TTS (OpenAI TTS is the default).
  • Dynamic Background Generation:
    • ComfyUI Integration: Generate background images locally using ComfyUI workflows.
    • DALL·E Integration: Fetch AI-generated images from OpenAI's DALL·E model.
    • Pexels Integration: Download relevant videos from Pexels' vast library.
  • Automated Captioning: Create timed captions synced with the audio.
  • Music and Sound Effects: Add background music to enhance the video's appeal.
  • Customizable Output: Support for landscape or portrait orientation, adjustable video duration, and more.

🛠️ Architecture

graph TD
    A[Input Topic] --> B[Script Generation]
    B --> C[Text-to-Speech Conversion]
    C --> D[Caption Generation]
    B --> E[Background Search Queries]
    E --> F[Background Generation]
    D --> G[Video Rendering]
    F --> G
    C --> G
    G --> H[Final Video Output]

Figure: High-level architecture of the AI-Powered Video Generation Script.


🚀 Getting Started

1. Clone the Repository

git clone [email protected]:deonblaauw/Text-To-Video-AI.git
cd Text-To-Video-AI

2. Install Dependencies

pip install -r requirements.txt

3. Set Up API Keys

The script requires API keys for OpenAI and optionally for Pexels if you plan to use Pexels as a background video source.

OpenAI API Key

  • Sign up or log in to your OpenAI account.
  • Navigate to the API section and generate a new secret key.

Pexels API Key (Optional)

  • Sign up or log in to your Pexels account.
  • Request an API key from the Pexels API page.

Set Environment Variables

You can set your API keys as environment variables in your shell.

For macOS and Linux users:

export OPENAI_KEY="your_openai_api_key_here"
export PEXELS_KEY="your_pexels_api_key_here"

To make these environment variables persist across terminal sessions, you can add them to your shell configuration file.

Updating Your .zshrc File on macOS

If you're using the Zsh shell (default on macOS Catalina and later), you can add the environment variables to your .zshrc file.

  1. Open the .zshrc file in your home directory:

    nano ~/.zshrc
  2. Add the following lines at the end of the file:

    export OPENAI_KEY="your_openai_api_key_here"
    export PEXELS_KEY="your_pexels_api_key_here"
  3. Save the file by pressing Control + X, then Y, and Enter.

  4. Reload your terminal or source the .zshrc file:

    source ~/.zshrc

Note: Replace "your_openai_api_key_here" and "your_pexels_api_key_here" with your actual API keys.
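
For a fail-fast check that the keys are actually set, a small helper along these lines can be used (get_api_key is a hypothetical utility sketched here, not part of this repository):

```python
import os

def get_api_key(name: str) -> str:
    """Read an API key from the environment, failing fast with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell or add it to ~/.zshrc."
        )
    return value

# openai_key = get_api_key("OPENAI_KEY")
# pexels_key = get_api_key("PEXELS_KEY")  # only needed when using Pexels backgrounds
```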

4. Configure ComfyUI (Optional)

If you plan to use ComfyUI for background image generation:

  • Install and set up ComfyUI locally.
  • Enable "Developer Mode" in ComfyUI settings.
  • Save your ComfyUI workflow as workflow_api.json in the utility/comfyui/ directory of this project.

📖 Usage

Run the script by providing a topic and desired options:

python app.py "Your video topic here" [options]

Example

Generate a 60-second landscape video about "Facts about submarines", using OpenAI's TTS (the default) and ComfyUI for backgrounds, saving outputs to the my_videos directory:

python app.py "Facts about submarines" \
  --num_vids 1 \
  --landscape \
  --output_dir my_videos \
  --duration 60 \
  --video_server comfyui

Command-Line Arguments

Argument Type Default Description
topic string Required The topic for the video.
--landscape flag Portrait mode Generate video in landscape mode (default is portrait).
--tts string openai Text-to-speech engine (openai or edge). Default is openai.
--output_dir string generated_outputs Directory to store the generated outputs.
--duration int 40 Duration of the video in seconds.
--num_vids int 1 Number of videos to generate.
--video_server string comfyui Source for background images/videos (comfyui, dall-e, or pexel).
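
The options above can be mirrored with argparse. The block below is a hypothetical reconstruction of the CLI from this table; the real parser in app.py may differ in details:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the CLI described in the table above.
    p = argparse.ArgumentParser(description="AI-powered video generation")
    p.add_argument("topic", help="The topic for the video")
    p.add_argument("--landscape", action="store_true",
                   help="Generate in landscape mode (default: portrait)")
    p.add_argument("--tts", choices=["openai", "edge"], default="openai")
    p.add_argument("--output_dir", default="generated_outputs")
    p.add_argument("--duration", type=int, default=40)
    p.add_argument("--num_vids", type=int, default=1)
    p.add_argument("--video_server", choices=["comfyui", "dall-e", "pexel"],
                   default="comfyui")
    return p
```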

🌐 Background Image/Video Sources

ComfyUI (Recommended)

  • Local image generation using custom workflows.
  • Setup:
    • Install ComfyUI and enable developer mode.
    • Save your workflow as workflow_api.json in utility/comfyui/.

DALL·E

  • AI-generated images from OpenAI's DALL·E model.
  • API Key: Ensure your OpenAI API key has access to DALL·E.

Pexels

  • Stock videos from Pexels' library.
  • API Key: Set your Pexels API key as an environment variable (PEXELS_KEY).
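
To see what a Pexels lookup involves, the sketch below builds a request for the public Pexels video search endpoint (https://api.pexels.com/videos/search, authenticated via the Authorization header). The helper and its parameter choices are illustrative, not taken from this repository:

```python
from urllib.parse import urlencode

PEXELS_SEARCH_URL = "https://api.pexels.com/videos/search"

def build_pexels_request(query: str, api_key: str, landscape: bool = False):
    """Build the URL and headers for a Pexels video search request."""
    params = {
        "query": query,
        "orientation": "landscape" if landscape else "portrait",
        "per_page": 15,
    }
    url = f"{PEXELS_SEARCH_URL}?{urlencode(params)}"
    headers = {"Authorization": api_key}  # Pexels expects the raw key here
    return url, headers
```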

🧩 Components Breakdown

1. Script Generation

Generates a script based on the provided topic.

generate_script(topic, provider, model, duration)

2. Hashtag Generation

Creates relevant hashtags for social media platforms.

generate_hashtags(script, provider, model)

3. Text-to-Speech (TTS)

Converts the script into speech using the selected TTS engine.

  • OpenAI TTS (Default):

    generate_audio_openai(script, filename, voice)
  • Microsoft Edge TTS:

    generate_audio_edge(script, filename, voice)

4. Caption Generation

Generates timed captions synced with the audio.

generate_timed_captions(audio_filename)

5. Background Generation

Fetches or generates background images/videos.

generate_video_url(response, provider, model, search_terms, video_server, orientation_landscape)

6. Video Rendering

Combines all elements into the final video.

get_output_media(
    topic,
    audio_filename,
    captions,
    background_videos,
    video_server,
    landscape,
    music_volume_wav,
    music_volume_mp3,
    volume_tts,
    output_dir,
    music_file_path
)
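
Chained together, the components above form one pipeline. The sketch below is a hypothetical wiring with each stage injected as a callable, so any stage can be stubbed or swapped; it does not reproduce app.py's actual call sites or the full parameter lists shown above:

```python
def make_video(topic, *, script_gen, tts, captioner, backgrounds, renderer):
    """Run the full pipeline; each stage is passed in as a callable."""
    script = script_gen(topic)          # 1. script generation
    audio_file = tts(script)            # 3. text-to-speech
    captions = captioner(audio_file)    # 4. timed captions
    media = backgrounds(script)         # 5. background images/videos
    return renderer(topic, audio_file, captions, media)  # 6. final render
```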

📊 Workflow Diagram

sequenceDiagram
    participant User
    participant ScriptGenerator
    participant TTSEngine
    participant CaptionGenerator
    participant BackgroundGenerator
    participant VideoRenderer
    User->>ScriptGenerator: Provide Topic
    ScriptGenerator->>ScriptGenerator: Generate Script
    ScriptGenerator->>TTSEngine: Send Script
    TTSEngine->>TTSEngine: Convert to Speech
    TTSEngine->>CaptionGenerator: Provide Audio
    CaptionGenerator->>CaptionGenerator: Generate Captions
    ScriptGenerator->>BackgroundGenerator: Provide Script
    BackgroundGenerator->>BackgroundGenerator: Fetch/Generate Media
    TTSEngine->>VideoRenderer: Provide Audio
    CaptionGenerator->>VideoRenderer: Provide Captions
    BackgroundGenerator->>VideoRenderer: Provide Media
    VideoRenderer->>VideoRenderer: Render Final Video
    VideoRenderer->>User: Deliver Video

Figure: End-to-end workflow of the video generation process.


🎨 Customization

Orientation

  • Portrait (default): Ideal for mobile platforms like TikTok or Instagram Reels.
  • Landscape: Suitable for YouTube or desktop viewing.

Use the --landscape flag to switch to landscape mode.
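
The two orientations typically map to swapped 1080p dimensions. The helper below assumes those sizes, which may differ from the script's actual render resolution:

```python
def output_resolution(landscape: bool) -> tuple[int, int]:
    """Return (width, height), assuming 1080p output in either orientation."""
    return (1920, 1080) if landscape else (1080, 1920)
```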

Text-to-Speech Voices

  • OpenAI TTS: Offers a range of AI voices.
  • Edge TTS: Provides voices available in Microsoft Edge.

Video Duration

Adjust using the --duration argument (in seconds).

Number of Videos

Generate multiple videos by setting --num_vids greater than 1.


⚙️ Configuration

Environment Variables

Store your API keys as environment variables.

For macOS and Linux users:

export OPENAI_KEY="your_openai_api_key_here"
export PEXELS_KEY="your_pexels_api_key_here"

To persist these across terminal sessions, add the export lines to your shell configuration file (see "Set Up API Keys" above for step-by-step .zshrc instructions).

Dependencies

All required Python packages are listed in requirements.txt.

Install them using:

pip install -r requirements.txt

📝 Examples

Generate a Single Video

python app.py "The wonders of the ocean"

Generate Multiple Landscape Videos with OpenAI TTS

python app.py "Innovations in renewable energy" \
  --num_vids 3 \
  --landscape \
  --output_dir renewable_videos \
  --duration 60

🐞 Troubleshooting

  • Missing API Keys: Ensure all necessary API keys are set as environment variables.
  • ComfyUI Errors: Verify that ComfyUI is correctly installed and the workflow file is in the right location.
  • Dependencies Issues: Reinstall dependencies using pip install -r requirements.txt.
  • Environment Variables Not Recognized: Make sure you have updated your .zshrc file and sourced it, or set the environment variables in your current session.

🤝 Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.


📄 License

This project is licensed under the MIT License.


📧 Contact

For questions or support, please contact deonblaauw.


🌈 Acknowledgements

  • OpenAI for GPT and DALL·E.
  • Pexels for providing access to a vast library of stock videos.
  • ComfyUI for local image generation workflows.
  • MoviePy for video editing capabilities.


🎉 Let's Get Started!

Unleash your creativity and start generating amazing videos effortlessly. Whether you're a content creator, marketer, or just someone who loves experimenting with AI technologies, this tool is designed to make video creation accessible and fun.


Happy Creating! 🎬✨


Note: The diagrams are included using Mermaid syntax, which is supported by GitHub and some markdown viewers. To view the rendered diagrams, ensure you're viewing this README on a platform that supports Mermaid diagrams, such as GitHub or GitLab.

