failed to compute log mel spectrogram #36

Open
chigkim opened this issue May 20, 2024 · 5 comments

chigkim commented May 20, 2024

Hi,

I'm using an M3 Max, and I built with Core ML support. When I run transcribe, it fails with the error "failed to compute log mel spectrogram."
I'm including the log below.
I'd appreciate any help! Thanks so much!

>>> from pywhispercpp.model import Model
>>> model = Model('models/ggml-medium.bin', n_threads=6)
[2024-05-20 07:48:47,691] {model.py:221} INFO - Initializing the model ...
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-medium.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 4 (medium)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M3 Max
ggml_metal_init: picking default device: Apple M3 Max
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = nil
ggml_metal_init: error: could not use bundle path to find ggml-metal.metal, falling back to trying cwd
ggml_metal_init: loading 'ggml-metal.metal'
ggml_metal_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:3:10: fatal error: 'ggml-common.h' file not found
#include "ggml-common.h"
         ^~~~~~~~~~~~~~~
" UserInfo={NSLocalizedDescription=program_source:3:10: fatal error: 'ggml-common.h' file not found
#include "ggml-common.h"
         ^~~~~~~~~~~~~~~
}
whisper_backend_init: ggml_backend_metal_init() failed
whisper_model_load:      CPU total size =  1533.14 MB
whisper_model_load: model size    = 1533.14 MB
whisper_init_state: kv self size  =  150.99 MB
whisper_init_state: kv cross size =  150.99 MB
whisper_init_state: kv pad  size  =    6.29 MB
whisper_init_state: loading Core ML model from 'models/ggml-medium-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
whisper_init_state: compute buffer (conv)   =    8.81 MB
whisper_init_state: compute buffer (cross)  =    7.85 MB
whisper_init_state: compute buffer (decode) =  142.09 MB
>>> print(Model.system_info())
AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 1 | OPENVINO = 0
>>> segments = model.transcribe(file, speed_up=True)
[2024-05-20 07:49:11,958] {model.py:130} INFO - Transcribing ...
whisper_full_with_state: failed to compute log mel spectrogram
[2024-05-20 07:49:11,958] {model.py:133} INFO - Inference time: 0.000 s

@absadiki
Owner

Hi @chigkim,

I think it's the same issue as the one encountered in #35. Please check the solution there.

@chigkim
Author

chigkim commented May 21, 2024

Thanks for the response.
After following the suggestion from #35, it no longer complains about "fatal error: 'ggml-common.h' file not found" while loading the model.
However, model.transcribe still throws the error: "failed to compute log mel spectrogram."

>>> from pywhispercpp.model import Model
>>> model = Model('models/ggml-medium.bin', n_threads=6)
[2024-05-21 11:27:12,237] {model.py:221} INFO - Initializing the model ...
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-medium.bin'
whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw        = 0
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head  = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 1024
whisper_model_load: n_text_head   = 16
whisper_model_load: n_text_layer  = 24
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 4 (medium)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Apple M3 Max
ggml_metal_init: picking default device: Apple M3 Max
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: GGML_METAL_PATH_RESOURCES = /Users/cgk/Desktop/code/whisper.cpp/pywhispercpp/whisper.cpp
ggml_metal_init: loading '/Users/cgk/Desktop/code/whisper.cpp/pywhispercpp/whisper.cpp/ggml-metal.metal'
ggml_metal_init: GPU name:   Apple M3 Max
ggml_metal_init: GPU family: MTLGPUFamilyApple9  (1009)
ggml_metal_init: GPU family: MTLGPUFamilyCommon3 (3003)
ggml_metal_init: GPU family: MTLGPUFamilyMetal3  (5001)
ggml_metal_init: simdgroup reduction support   = true
ggml_metal_init: simdgroup matrix mul. support = true
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 51539.61 MB
whisper_model_load:    Metal total size =  1533.14 MB
whisper_model_load: model size    = 1533.14 MB
whisper_init_state: kv self size  =  150.99 MB
whisper_init_state: kv cross size =  150.99 MB
whisper_init_state: kv pad  size  =    6.29 MB
whisper_init_state: loading Core ML model from 'models/ggml-medium-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
whisper_init_state: compute buffer (conv)   =    8.81 MB
whisper_init_state: compute buffer (cross)  =    7.85 MB
whisper_init_state: compute buffer (decode) =  142.09 MB
>>> 
>>> print(Model.system_info())
AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 0 | COREML = 1 | OPENVINO = 0
>>> segments = model.transcribe(file, speed_up=True)
[2024-05-21 11:27:58,939] {model.py:130} INFO - Transcribing ...
whisper_full_with_state: failed to compute log mel spectrogram
[2024-05-21 11:27:58,939] {model.py:133} INFO - Inference time: 0.000 s
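
For readers hitting the earlier 'ggml-common.h' error: in the second log, GGML_METAL_PATH_RESOURCES points at the whisper.cpp checkout and Metal initializes. The sketch below only checks that a candidate directory contains the two files named in the first log before building that environment entry. The helper names `missing_metal_files` and `metal_env` are hypothetical, and expecting ggml-common.h to sit next to ggml-metal.metal is an assumption drawn from the logs, not something #35 is quoted as saying.

```python
from pathlib import Path

# The two file names that appear in the Metal log output in this thread.
# Assumption: both are expected in the same resource directory.
METAL_FILES = ("ggml-metal.metal", "ggml-common.h")


def missing_metal_files(resource_dir):
    """Return the names from METAL_FILES that are absent from resource_dir."""
    root = Path(resource_dir).expanduser()
    return [name for name in METAL_FILES if not (root / name).is_file()]


def metal_env(resource_dir):
    """Build the environment entry for a directory that passes the check.

    Raises FileNotFoundError if any expected file is missing.
    """
    missing = missing_metal_files(resource_dir)
    if missing:
        raise FileNotFoundError(f"{resource_dir} lacks: {', '.join(missing)}")
    root = Path(resource_dir).expanduser()
    return {"GGML_METAL_PATH_RESOURCES": str(root)}
```

The returned mapping can be exported in the shell before starting Python. Applying it in-process before constructing Model may also work, but that is untested here.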

@absadiki
Owner

@chigkim,
Can you try without speed_up=True:

segments = model.transcribe(file)

@raivisdejus

I had the same problem on Ubuntu 24.04, and I can confirm that removing speed_up=True fixed the issue.
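
If speed_up=True is passed from several call sites, a thin wrapper can apply the workaround in one place. This is a minimal sketch: `transcribe_without_speed_up` is a hypothetical helper, not part of pywhispercpp, and it assumes only that `model.transcribe` accepts the media argument followed by keyword parameters, as in the calls shown above.

```python
def transcribe_without_speed_up(model, media, **params):
    """Forward to model.transcribe(media, **params) with speed_up removed.

    Every other keyword argument is passed through unchanged.
    """
    # Drop the option that triggered the error in this thread, if present.
    params.pop("speed_up", None)
    return model.transcribe(media, **params)
```

With the model from the logs above, `segments = transcribe_without_speed_up(model, file, speed_up=True)` would behave like `model.transcribe(file)`.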

absadiki added a commit that referenced this issue Jul 16, 2024
update readme #36

@absadiki
Owner

Okay, thanks @raivisdejus.
