Add suspend and resume events to tensor filter #4424
cibot: Thank you for posting issue #4424. The person in charge will reply soon.
We need to clarify the policy and the behavior of SUSPEND/RESUME. I presume the following:
E.g.,
Add suspend and resume events to tensor filter
Consider a scenario where AI services are being provided on devices with limited memory capacity.
Since user requests do not arrive continuously, keeping the model loaded in memory at all times can be inefficient.
This matters even more for larger models (e.g., Large Language Models).
Thus, rather than keeping the model loaded in memory, it is more efficient to load it only when it is needed.
For instance, if there are no requests to the model for 3 seconds, the model is removed from memory.
This can be addressed by adding SUSPEND and RESUME events to the tensor filter.
With SUSPEND and RESUME events, sub-plugins can unload a model from memory or reload it, leaving the core functions untouched.
If a sub-plugin cannot handle SUSPEND and RESUME events, it can instead be closed and opened again.
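The fallback described above could look roughly like the following. This is only a sketch in Python for illustration, not the tensor filter API: `Event`, `SubPlugin`, `EventAwarePlugin`, and `dispatch` are hypothetical names, and the assumption is that a sub-plugin reports whether it handled an event by returning a boolean.

```python
from enum import Enum, auto


class Event(Enum):
    """Hypothetical event identifiers, named after the proposal."""
    SUSPEND = auto()
    RESUME = auto()


class SubPlugin:
    """Hypothetical sub-plugin that does not understand SUSPEND/RESUME."""

    def __init__(self):
        self.log = []  # records calls so the behavior can be inspected

    def open(self):
        self.log.append("open")

    def close(self):
        self.log.append("close")

    def handle_event(self, event):
        # Assumption: returning False means "event not supported".
        return False


class EventAwarePlugin(SubPlugin):
    """Hypothetical sub-plugin that unloads/reloads only the model."""

    def handle_event(self, event):
        self.log.append("unload" if event is Event.SUSPEND else "reload")
        return True


def dispatch(plugin, event):
    """Send the event; fall back to close/open if it is not handled."""
    if plugin.handle_event(event):
        return "handled"
    if event is Event.SUSPEND:
        plugin.close()
    else:
        plugin.open()
    return "fallback"
```

With this shape, the caller does not need to know which kind of sub-plugin it is talking to: an event-aware one frees only the model, while any other one is fully closed and reopened.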
Users can manage the suspend and resume events of the tensor filter themselves, but an automatic management feature will also be needed later.
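One possible shape for such automatic management is an idle timeout, as in the 3-second example. The sketch below is in Python for illustration only; `IdleUnloader`, `load_fn`, `unload_fn`, and `tick` are hypothetical names rather than anything in the tensor filter, and the clock is injectable so the policy can be exercised without actually waiting.

```python
import time


class IdleUnloader:
    """Hypothetical helper: keeps a model in memory only while it is in use."""

    def __init__(self, load_fn, unload_fn, idle_timeout=3.0, clock=time.monotonic):
        self._load_fn = load_fn          # returns a callable model
        self._unload_fn = unload_fn      # releases the model's resources
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._model = None
        self._last_used = None

    @property
    def loaded(self):
        return self._model is not None

    def invoke(self, request):
        """Run a request, loading the model first if it was unloaded (RESUME)."""
        if self._model is None:
            self._model = self._load_fn()
        self._last_used = self._clock()
        return self._model(request)

    def tick(self):
        """Call periodically; unloads an idle model (SUSPEND).

        Returns True if the model was unloaded by this call.
        """
        if self._model is None:
            return False
        if self._clock() - self._last_used >= self._idle_timeout:
            self._unload_fn(self._model)
            self._model = None
            return True
        return False
```

A real implementation would also have to decide who calls `tick` (a timer, the pipeline, or the application) and how to guard against a request arriving while the model is being unloaded; this sketch leaves both open.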