Apple MLX Powered Video Transcription

This Streamlit application allows users to upload video files and generate accurate transcripts using Apple's MLX framework.

Follow me on X: @RayFernando1337

YouTube: @RayFernando1337


Important Note

⚠️ This application is designed to run on Apple Silicon (M series) Macs only. It utilizes the MLX framework, which is optimized for Apple's custom chips.

Getting Started

Prerequisites

An Apple Silicon (M series) Mac
Conda package manager

If you don't have Conda installed on your Mac, you can follow the Ultimate Guide to Installing Miniforge for AI Development on M1 Macs for a comprehensive setup process.

Installation

Clone the repository:

```
git clone https://github.com/RayFernando1337/MLX-Auto-Subtitled-Video-Generator.git
cd MLX-Auto-Subtitled-Video-Generator
```

Create a new Conda environment with Python 3.12:

```
conda create -n mlx-whisper python=3.12
conda activate mlx-whisper
```

Install the required dependencies:

```
xcode-select --install
pip install -r requirements.txt
```

Install FFmpeg (required for audio processing):
```
brew install ffmpeg
```
Note: If you don't have Homebrew installed, you can install it by running the following command in your terminal:
```
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```
After installation, follow the instructions provided in the terminal to add Homebrew to your PATH. For more information about Homebrew, visit brew.sh.

Running the Application

To run the Streamlit application, use the following command:

```
streamlit run mlx_whisper_transcribe.py
```

Features

Upload video files (MP4, AVI, MOV, MKV)
Transcribe videos using various Whisper models
Generate VTT and SRT subtitle files
Download transcripts as a ZIP file
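
As an illustration of the subtitle and ZIP features listed above, here is a minimal sketch of how timestamped segments could be rendered as SRT and VTT text and bundled into a ZIP archive. It is not taken from `mlx_whisper_transcribe.py`; the function names and the segment shape (dicts with `start`, `end`, and `text` keys) are assumptions made for the example.

```python
import io
import zipfile


def format_timestamp(seconds: float, decimal_marker: str = ",") -> str:
    """Render seconds as HH:MM:SS<marker>mmm (SRT uses ',', VTT uses '.')."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text: numbered cues, comma as the millisecond separator."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def segments_to_vtt(segments: list[dict]) -> str:
    """Build WebVTT text: 'WEBVTT' header, dot as the millisecond separator."""
    lines = ["WEBVTT", ""]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        lines.extend([f"{start} --> {end}", seg["text"].strip(), ""])
    return "\n".join(lines)


def bundle_as_zip(files: dict[str, str]) -> bytes:
    """Pack {filename: text} pairs into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical segments, standing in for real transcription output.
    demo = [
        {"start": 0.0, "end": 1.5, "text": " Hello "},
        {"start": 1.5, "end": 3.0, "text": " world "},
    ]
    print(segments_to_srt(demo))
    print(segments_to_vtt(demo))
```

The only difference between the two timestamp formats is the millisecond separator, so a single helper can serve both writers.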

How It Works

Upload a video file
Choose a Whisper model
Click the "Transcribe" button to process the video
View the results and download the generated transcripts
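
The steps above can be sketched outside the Streamlit UI roughly as follows. This is a simplified illustration under stated assumptions, not the application's actual code: the helper names are hypothetical, the audio is extracted to a 16 kHz mono WAV file as one plausible preprocessing step, and `model_repo` stands for whichever MLX Whisper weights you choose (a Hugging Face repository ID or a local path). The `mlx_whisper` import is deferred so the file can still be loaded on machines without MLX.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path: str, audio_path: str) -> list[str]:
    """Return an FFmpeg command that extracts 16 kHz mono audio from a video."""
    return [
        "ffmpeg",
        "-y",             # overwrite the output file if it exists
        "-i", video_path,
        "-vn",            # drop the video stream
        "-ac", "1",       # one audio channel (mono)
        "-ar", "16000",   # 16 kHz sample rate
        audio_path,
    ]


def transcribe_video(video_path: str, model_repo: str) -> list[dict]:
    """Extract audio with FFmpeg, then transcribe it with mlx_whisper.

    Assumption: model_repo points at MLX-converted Whisper weights.
    """
    import mlx_whisper  # imported lazily; requires an Apple Silicon Mac

    audio_path = str(Path(video_path).with_suffix(".wav"))
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    result = mlx_whisper.transcribe(audio_path, path_or_hf_repo=model_repo)
    # Each segment is a dict with "start", "end", and "text" keys.
    return result["segments"]
```

Calling `transcribe_video("talk.mp4", model_repo)` would return the timestamped segments from which subtitle files can then be written.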

Models

The application supports the following Whisper models:

Tiny (Q4)
Large v3
Small English (Q4)
Small (FP32)
Distil Large v3
Large v3 Turbo (New!)

Each model has different capabilities and processing speeds. Experiment with different models to find the best balance between accuracy and performance for your needs.

New Model: Large v3 Turbo

The newly added Large v3 Turbo model offers significant performance improvements:

Transcribes 12 minutes of audio in 14 seconds on an M2 Ultra (~50X faster than real time)
Significantly smaller than the Large v3 model (809M vs 1550M parameters)
Multilingual

This model is particularly useful for processing longer videos or when you need quick results without sacrificing too much accuracy.
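
To make the "~50X" figure concrete, the arithmetic behind it can be worked through in a few lines. The estimate for other video lengths is a rough sketch only: it assumes throughput scales linearly with duration and that your hardware matches the M2 Ultra benchmark, neither of which is guaranteed. The function name is hypothetical.

```python
# Benchmark quoted above: 12 minutes of audio transcribed in 14 seconds.
BENCH_AUDIO_SECONDS = 12 * 60
BENCH_PROCESSING_SECONDS = 14

# Speed relative to real time: 720 / 14
speedup = BENCH_AUDIO_SECONDS / BENCH_PROCESSING_SECONDS


def estimate_processing_seconds(video_seconds: float) -> float:
    """Linear extrapolation from the benchmark (an assumption, not a guarantee)."""
    return video_seconds * BENCH_PROCESSING_SECONDS / BENCH_AUDIO_SECONDS


if __name__ == "__main__":
    print(f"Speedup: {speedup:.1f}x real time")  # Speedup: 51.4x real time
    print(f"One-hour video: ~{estimate_processing_seconds(3600):.0f} s")  # ~70 s
```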

Troubleshooting

If you encounter any issues, please check the following:

Ensure you're using an Apple Silicon Mac
Verify that all dependencies are correctly installed
Check the console output for any error messages
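
The first two checks above can be partly automated. The following is a minimal sketch, not part of the repository: the function name is hypothetical, and the module names it probes (`mlx_whisper`, `streamlit`) are assumptions about what `requirements.txt` installs. Run it inside the activated Conda environment.

```python
import importlib.util
import platform
import shutil


def check_environment() -> dict[str, bool]:
    """Report whether the basic requirements appear to be met."""
    return {
        # A native Apple Silicon Python reports Darwin/arm64.
        # A Python running under Rosetta reports x86_64 instead.
        "apple_silicon": platform.system() == "Darwin"
        and platform.machine() == "arm64",
        # FFmpeg must be discoverable on PATH.
        "ffmpeg_on_path": shutil.which("ffmpeg") is not None,
        # find_spec looks a module up without importing it.
        "mlx_whisper_installed": importlib.util.find_spec("mlx_whisper") is not None,
        "streamlit_installed": importlib.util.find_spec("streamlit") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

If `apple_silicon` is reported as missing on an M series Mac, the Python interpreter itself may be an x86_64 build; recreating the Conda environment with an arm64 installer is the usual fix.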

For any persistent problems, please open an issue in the repository.

Acknowledgements

This project is a fork of the original Auto-Subtitled Video Generator by Batuhan Yilmaz. I deeply appreciate their contribution to the open-source community.
