Is your feature request related to a problem?
Hi, I am running Jellyfin in a cluster with only 1 GPU available, so I have to scale it to zero when not needed and scale it back up from zero when requests come in, so that my other services can use the GPU occasionally. I noticed that the scale-up happens really quickly after a request is initiated from the client side, but the pod stays in the 0/1 Running state for about 10 seconds before it can handle requests.
Having inspected the templates, I suspect the readiness probe's periodSeconds field is in play. So I manually edited the StatefulSet manifest to reduce this field to 5, and the pod now takes only around 5s to become ready.
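For reference, this is roughly the change — a minimal sketch, not the chart's exact template; the probe path and port are assumptions based on Jellyfin's default health endpoint and HTTP port:

```yaml
# Excerpt from the Jellyfin StatefulSet pod spec (illustrative).
readinessProbe:
  httpGet:
    path: /health   # assumed: Jellyfin's built-in health endpoint
    port: 8096      # assumed: Jellyfin's default HTTP port
  periodSeconds: 5  # reduced from 10, so readiness is detected sooner
  failureThreshold: 3
```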
Reading the logs, I realized that the server finishes preparation in 4 seconds, so a 5-second readiness probe period makes sense in my case (and probably in many other people's cases too, since my machine has a very outdated CPU).
So what is the best solution here? I want to reduce periodSeconds, but I am not sure whether it should stay hard-coded in the templates or be extracted into the values for more flexibility.
Describe the solution you'd like.
Reduced periodSeconds in the template
Or
Customizable probes exposed through the values (see the sketch below)
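A minimal sketch of what customizable probes could look like in values.yaml — the key names here are illustrative assumptions, not the chart's actual schema:

```yaml
# Hypothetical values.yaml excerpt: probe settings exposed to users,
# with the current chart behavior kept as the default.
probes:
  readiness:
    periodSeconds: 10
    failureThreshold: 3
```

The template could then render these with something like `{{ toYaml .Values.probes.readiness | nindent 12 }}` instead of hard-coded values, so users who need a faster scale-from-zero can tune the probe without forking.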
Describe alternatives you've considered.
Fork the chart? IDK...
Additional context.
No response
Hey, thanks for your report. I'll have a proper look at the probes this afternoon, but I think it's also worth saying you can share a GPU between containers by specifying fractional requests, provided they sum to 1. You could set high requests for Jellyfin and it will get priority, but set a lower request for your other workload and it will also be able to use the GPU.
@djjudas21 Thanks for the suggestion. My GPU is a 2060 Super and does not support any GPU virtualization techniques (MIG, vGPU, etc.), so I didn't even think of sharing it across multiple pods at the same time. I'll try what you said.
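For anyone landing here later: software time-slicing via the NVIDIA device plugin is one way to share a consumer GPU without MIG/vGPU hardware support. A minimal sketch of the plugin's sharing config follows; the format matches the device plugin's documented config, but the replica count is illustrative and should be tuned to your workloads:

```yaml
# NVIDIA k8s-device-plugin config: advertise each physical GPU as 2
# schedulable replicas via time-slicing, so two pods can each request
# nvidia.com/gpu: 1 and share the card. No MIG/vGPU support required.
version: v1
sharing:
  timeSlicing:
    resources:
      - name: nvidia.com/gpu
        replicas: 2
```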