I think it would be quite useful to track dependencies between artifacts, so that if, for example, I want to deploy a new version of artifact `foo` to `prod` and `foo` depends on artifacts `bar` and `baz`, GTO warns me if `bar` and `baz` are not yet in `prod`.
Reasoning:
In a scenario with coupled data and model versions (see here for my reasons why that might be a good idea), it would be nice to be able to link them together explicitly.
I can look at the DVC pipeline of my `foo` model in the `foo_training` repository and see that it depends on `foo_data, rev:v0.1.0`. I can even add this info to the annotation of `foo_model` in GTO in the model registry (so that I don't have to keep going back to the model to check). But with this workflow, I would like to make sure I cannot deploy a model to production if it depends on a dataset that is not yet in production (i.e. one produced by a data pipeline that does not run in production, in which case the integration of the model service and the data preparation service will fail).
So a cool feature would be for GTO to warn me (or block me unless I use something like `gto assign --force`) when I assign `prod` (or any specified reserved stage name) to the model while its dataset dependency is not also in `prod`.
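To make the idea a bit more concrete, here is a rough sketch in plain Python of the check I have in mind. It does not use GTO's actual API: the `DEPENDENCIES` and `STAGES` dictionaries and the `missing_dependencies` and `assign` functions are hypothetical names made up for illustration, and how GTO would actually store dependency information is an open question.

```python
from typing import Dict, List, Set

# Hypothetical registry state, invented for this sketch (not read from GTO).
# Maps an artifact to the artifacts it depends on.
DEPENDENCIES: Dict[str, List[str]] = {"foo": ["bar", "baz"]}
# Maps a stage name to the artifacts currently assigned to it.
STAGES: Dict[str, Set[str]] = {"prod": {"bar"}}


def missing_dependencies(
    artifact: str,
    stage: str,
    dependencies: Dict[str, List[str]],
    stages: Dict[str, Set[str]],
) -> List[str]:
    """Return the dependencies of `artifact` that are not yet in `stage`."""
    in_stage = stages.get(stage, set())
    return [dep for dep in dependencies.get(artifact, []) if dep not in in_stage]


def assign(
    artifact: str,
    stage: str,
    dependencies: Dict[str, List[str]],
    stages: Dict[str, Set[str]],
    force: bool = False,
) -> List[str]:
    """Assign `artifact` to `stage`, refusing if dependencies are missing.

    With force=True the assignment goes through anyway, and the list of
    missing dependencies is returned so the caller can print a warning.
    """
    missing = missing_dependencies(artifact, stage, dependencies, stages)
    if missing and not force:
        raise RuntimeError(
            f"{artifact} depends on {missing}, which are not in {stage}"
        )
    stages.setdefault(stage, set()).add(artifact)
    return missing
```

With the example data above, `assign("foo", "prod", DEPENDENCIES, STAGES)` would be refused because `baz` is not in `prod`, while passing `force=True` would let it through and report `baz` as missing.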