
Whisper+Coreml

Whisper+Coreml speeds up the Whisper encoder and decoder by running them on the Apple Neural Engine (ANE).

Usage

# 1. convert the encoder and decoder to Core ML models and build the shared library
#    (small model: ~70s, large model: ~5min)
./convert_coreml.sh [tiny|small|medium|...] [beam_size]

# 2. transcribe (the model name and beam size must match step 1)
python -m whisper YOUR_WAV_FILE --language=[ja|en|...] --model=[tiny|small|medium|...] --beam_size=[beam_size] --best_of=[beam_size] --word_timestamps=True --use_coreml=True

# Known constraints:
# 1. beam_size and best_of are fixed for each Core ML model
# 2. specifying --language is required
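
For example, converting the small model with beam_size=5 and then transcribing a Japanese WAV file would look like this (the file name is a placeholder; --beam_size and --best_of must match the value passed to convert_coreml.sh):

./convert_coreml.sh small 5
python -m whisper song.wav --language=ja --model=small --beam_size=5 --best_of=5 --word_timestamps=True --use_coreml=True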

Performance

  • transcribe() of a 1-minute song on a MacBook Air (M1, 16GB) with beam_size=1 ("-" = not reported)

    Model                         1st load time   cached load time   transcribe time (bs=1)
    small (whisper+coreml, ANE)   47s             1s                 load time + 2s
    small (openai/whisper, CPU)   -               -                  9s
    large (whisper+coreml, ANE)   3m20s           9s                 load time + 10s
    large (openai/whisper, CPU)   -               -                  42s
  • transcribe() of a 1-minute song on a MacBook Air (M1, 16GB) with beam_size=5 (the default in openai/whisper)

    Model                         1st load time   cached load time   transcribe time (bs=5)
    small (whisper+coreml, ANE)   55s             1s                 load time + 4s
    small (openai/whisper, CPU)   -               -                  27s
    large (whisper+coreml, ANE)   3m57s           10s                load time + 23s
    large (openai/whisper, CPU)   -               -                  122s

Note: transcribe time measures only the time spent in transcribe() in transcribe.py; Python model load time is not included.
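
As a rough end-to-end check (a sketch, not the measurement method used for the tables above), the whole CLI run can be timed with the shell's time builtin; this includes Python startup and model loading, so expect it to exceed the transcribe-only figures:

# wall-clock time for the full run (model load + transcribe),
# so it will be larger than the transcribe() times above
time python -m whisper song.wav --language=en --model=small --beam_size=1 --best_of=1 --word_timestamps=True --use_coreml=True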

Known issues:

  • Loading a Core ML model for the first time takes a long time in ANECompilerService (small model: 50s, large model: 3m20s)
  • Decoder256 of the large model runs on the GPU instead of the ANE (possibly a memory issue on the M1 Air 16GB)