Pre-trained model weights #2
Comments
Hi @nahidalam,
I was thinking of ImageNet or any other image dataset, since my goal is to get image embeddings. But then I realized ViS4mer is for video understanding, so I am not sure my request makes sense. Nonetheless, with respect to the LVU dataset, three tasks are most relevant for me: scene/place, relationship, and way of speaking.
Yes, ViS4mer is a video understanding model. However, technically you can use it for image modeling too. Anyway, I will try to release the pretrained weights for the scene/place, relationship, and way of speaking tasks.
Thank you for the quick reply. Yes, I understand it is technically possible to get embeddings. I was thinking more from a "meaning" perspective for my particular use case. Maybe a model trained on the COIN dataset makes more sense. I think a recipe for training ViS4mer on a custom video dataset would be good, so people like me can try it on their own instead of taking your time. Happy to collaborate if you plan to do that sometime.
Hi
Will you publish any pre-trained models, preferably on PyTorch Hub? I was thinking of using ViS4mer to extract image embeddings.