LocalAI version:
localai/localai:v2.22.1-cublas-cuda12-ffmpeg@sha256:4ac028a056c946e047548e437952c80ba79b66af9428cbec628ab4ffedc47120
LocalAI Version v2.22.1 (015835dba2854572d50e167b7cade05af41ed214)
Environment, CPU architecture, OS, and Version:
While I am running in Docker, here are the bare-metal specs:
Linux pop-os 6.9.3-76060903-generic #202405300957~1726766035~22.04~4092a0e SMP PREEMPT_DYNAMIC Thu S x86_64 x86_64 x86_64 GNU/Linux
Describe the bug
The TTS backend is not being selected properly, is caching output for too long, or is otherwise not regenerating audio when it should.
To Reproduce
Run the following commands:
```sh
curl --request POST \
  --url LocalAI_Instance/tts \
  --header 'content-type: application/json' \
  --data '{
    "backend": "coqui",
    "model": "tts_models/en/ljspeech/glow-tts",
    "input": "Welcome back my friends to the show that never ends!"
  }' | sha1sum
```
and
```sh
curl --request POST \
  --url LocalAI_Instance/tts \
  --header 'content-type: application/json' \
  --data '{
    "backend": "vall-e-x",
    "input": "Welcome back my friends to the show that never ends!"
  }' | sha1sum
```
The two sums will be the same. Since they came from two different backends, they shouldn't be.
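Here is the same check in script form, for convenience; a minimal sketch, where the `LOCALAI_URL` variable and its default are just placeholders for your instance address:

```sh
#!/bin/sh
# Placeholder for the LocalAI instance address; substitute your own.
LOCALAI_URL="${LOCALAI_URL:-http://localhost:8080}"

# POST one TTS request and print the sha1sum of the raw response body.
tts_sum() {
    curl -s --request POST \
        --url "$LOCALAI_URL/tts" \
        --header 'content-type: application/json' \
        --data "$1" | sha1sum
}

# Two different backends, same input text: the sums should differ, but do not.
tts_sum '{"backend": "coqui", "model": "tts_models/en/ljspeech/glow-tts", "input": "Welcome back my friends to the show that never ends!"}'
tts_sum '{"backend": "vall-e-x", "input": "Welcome back my friends to the show that never ends!"}'
```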
Expected behavior
I would expect two different backends to produce different output for the same input; instead, the results are identical across multiple arbitrary inputs.
Logs
```
5:51AM INF Success ip=127.0.0.1 latency="39.134µs" method=GET status=200 url=/readyz
5:51AM DBG guessDefaultsFromFile: modelPath is empty
5:51AM DBG Request for model: tts_models/en/ljspeech/glow-tts
5:51AM INF Loading model '' with backend coqui
5:51AM DBG Model already loaded in memory:
5:51AM DBG Checking model availability ()
5:51AM INF Success ip=172.29.0.1 latency=169.709648ms method=POST status=200 url=/tts
5:52AM DBG guessDefaultsFromFile: modelPath is empty
5:52AM DBG Request for model:
5:52AM INF Loading model '' with backend vall-e-x
5:52AM DBG Model already loaded in memory:
5:52AM DBG Checking model availability ()
5:52AM INF Success ip=172.29.0.1 latency=70.942259ms method=POST status=200 url=/tts
```
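To rule out a hashing fluke, the raw responses can also be saved and compared directly; a sketch of that check, using the same `LOCALAI_URL` placeholder as above (file names are illustrative):

```sh
# Save the raw audio from each backend to a file.
curl -s --request POST --url "$LOCALAI_URL/tts" \
    --header 'content-type: application/json' \
    --data '{"backend": "coqui", "model": "tts_models/en/ljspeech/glow-tts", "input": "Welcome back my friends to the show that never ends!"}' \
    --output coqui.wav
curl -s --request POST --url "$LOCALAI_URL/tts" \
    --header 'content-type: application/json' \
    --data '{"backend": "vall-e-x", "input": "Welcome back my friends to the show that never ends!"}' \
    --output vall-e-x.wav

# cmp exits 0 when the files are byte-identical; file shows what was actually returned.
cmp coqui.wav vall-e-x.wav && echo "byte-identical"
file coqui.wav vall-e-x.wav
```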
Additional context
I intended to use voice cloning; the models do trigger the correct backends to be loaded, however the resulting files are also identical to the others.
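For illustration, the kind of cloning request I was attempting looks roughly like this; a sketch only, where the `voice` value is a placeholder for a reference sample and I'm assuming `voice` is the right field to pass:

```sh
# Sketch of a cloning-style request; "my-reference-sample" is an illustrative
# placeholder, not something that ships with LocalAI.
curl -s --request POST --url "$LOCALAI_URL/tts" \
    --header 'content-type: application/json' \
    --data '{"backend": "vall-e-x", "input": "Welcome back my friends to the show that never ends!", "voice": "my-reference-sample"}' \
    --output cloned.wav
```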