Schedule retrain an existing prophet model #2556

Open
AyushBhardwaj321 opened this issue Feb 20, 2024 · 6 comments

@AyushBhardwaj321

I am using MLflow to log a Prophet model; let's call it model1. Logging, versioning, and prediction all work fine.
I now have one more week of data collected since the model was trained, so I want a model that reflects both the historical data model1 was trained on and the new data.
One way to do that is to combine all the data and train again, which is time-consuming.

Since I already have a trained Prophet model:
Is there a way to retrain it weekly with each new week of data?
With MLflow I can log and load the model, but I don't know how to retrain it with new data.

@imad24
Contributor

imad24 commented Feb 21, 2024

Are you sure it is worth logging the Prophet model in the first place?
"Training" a Prophet model actually takes less than 1 s, especially if you set interval_width=0 or mcmc_samples=0.

IMO, just retrain the model from scratch every time you have a new data point. No need to use MLflow or any model lifecycle management system.
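
A minimal sketch of the retrain-from-scratch approach, assuming the history and the new rows are pandas DataFrames with Prophet's usual `ds` and `y` columns. The helper names `combine_history` and `refit_from_scratch` are hypothetical, and stripping the timezone from `ds` is an added assumption (Prophet rejects timezone-aware timestamps).

```python
import pandas as pd


def combine_history(history: pd.DataFrame, new_rows: pd.DataFrame) -> pd.DataFrame:
    """Append new observations to the history and return one clean frame."""
    combined = pd.concat([history, new_rows], ignore_index=True)
    # Normalize to timezone-naive UTC timestamps (assumption: UTC is acceptable).
    combined["ds"] = pd.to_datetime(combined["ds"], utc=True).dt.tz_localize(None)
    # If a timestamp appears twice, keep the most recently appended value.
    combined = combined.drop_duplicates(subset="ds", keep="last")
    return combined.sort_values("ds").reset_index(drop=True)


def refit_from_scratch(history: pd.DataFrame, new_rows: pd.DataFrame):
    """Fit a fresh Prophet model on all data seen so far."""
    from prophet import Prophet  # imported lazily so the helper above works without Prophet

    model = Prophet()  # default settings; adjust to match your own configuration
    return model.fit(combine_history(history, new_rows))
```

Whether this is fast enough depends on the size and frequency of the series, so it is worth timing on real data before ruling it out.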

@AyushBhardwaj321
Author

@imad24 Thanks for replying.
The reason I want to train the new model starting from the previously trained one is that the initial dataset, which we can call the historical data, will have over a million rows, and training on that much data takes more than 1 s. That is why I want to reuse the previously trained model to get a new model that keeps what the previous one learned, which should technically take less time than training the whole model from scratch.

@imad24
Contributor

imad24 commented Feb 21, 2024

I'm not sure I understand. Prophet is not exactly like other "classical" supervised ML models.
When you train a Prophet model, you do it on a single univariate time series.
It doesn't support global modeling, where you can have one model for multiple time series.
So what do you mean by "the dataset will over a million for the first time"?

@AyushBhardwaj321
Author

@imad24 Thanks for the quick response, I really appreciate it.
What I mean is that I have a large dataset of about 2.5 million rows,
which looks like this:

                                ds    y
0        2020-10-21 12:57:47+00:00  0.0
1        2020-10-21 12:57:48+00:00  0.0
2        2020-10-21 12:57:49+00:00  0.0
3        2020-10-21 12:57:50+00:00  0.0
4        2020-10-21 12:57:51+00:00  0.0
...                            ...  ...
2591996  2020-11-20 12:57:43+00:00  7.0
2591997  2020-11-20 12:57:44+00:00  7.0
2591998  2020-11-20 12:57:45+00:00  7.0
2591999  2020-11-20 12:57:46+00:00  6.0
2592000  2020-11-20 12:57:47+00:00  6.0

[2592001 rows x 2 columns]

I want to train my first model on the historical data; the only issue is that this takes a long time. Once that model is created (let's call it Model_v1), I have a real-time data source that delivers new data, say, every minute, so I want to retrain the model on a weekly or monthly basis. If I have to train from scratch, it will again take a lot of time. So I want to use Model_v1, which has already learned the historical pattern, as a base model, fit the new values on top of it, and create a new model; let's call it Model_v2.

@imad24
Contributor

imad24 commented Feb 21, 2024

Oh, I see now: you're working with a high-frequency time series (one observation per second in this case).
In that case you're right, it does make sense to use warm-start training.

Have you tried the approach explained in the documentation?
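
For reference, a sketch of warm-start training along the lines of the "Updating fitted models" example in the Prophet documentation, as far as I recall it; whether that is the exact page linked above is an assumption. The parameters of the fitted model are extracted and passed as initial values to a new fit through `init`. The wrapper name `refit_with_warm_start` is hypothetical, and the new model must be created with the same settings as the old one (number of changepoints, seasonalities, MAP vs. MCMC) for the parameters to line up.

```python
import numpy as np


def warm_start_params(m):
    """Collect fitted parameters of model `m` in the shape expected as Stan initial values."""
    res = {}
    # Scalar parameters: trend rate, trend offset, observation noise.
    for pname in ["k", "m", "sigma_obs"]:
        if m.mcmc_samples == 0:
            res[pname] = m.params[pname][0][0]
        else:
            res[pname] = np.mean(m.params[pname])
    # Vector parameters: changepoint rate adjustments and regressor/seasonality coefficients.
    for pname in ["delta", "beta"]:
        if m.mcmc_samples == 0:
            res[pname] = m.params[pname][0]
        else:
            res[pname] = np.mean(m.params[pname], axis=0)
    return res


def refit_with_warm_start(old_model, df):
    """Fit a new model on `df`, initialized from `old_model`'s parameters."""
    from prophet import Prophet  # imported lazily so warm_start_params works without Prophet

    new_model = Prophet()  # assumption: old_model was also built with default settings
    return new_model.fit(df, init=warm_start_params(old_model))
```

Here `df` is the full updated series (history plus new rows), not only the new rows; warm starting only gives the optimizer a better starting point, and how much time it saves will vary with the data.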

@AyushBhardwaj321
Author

@imad24 Hi, thanks a lot for the quick reply.
Yes, I looked at the solution in the link above and tried it out as well. But as I mentioned earlier, I'm using Prophet with MLflow to manage the end-to-end model lifecycle. I am able to log the model with Airflow, but when I load the model, I am unable to use the function mentioned there.
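
One possible cause, stated as an assumption because the thread does not show the loading code: a model loaded with `mlflow.pyfunc.load_model` comes back as a generic pyfunc wrapper that mainly exposes `predict`, whereas `mlflow.prophet.load_model` returns the Prophet object itself. A minimal sketch of loading the native object from the Model Registry follows; the helper names and the registered model name `model1` are hypothetical.

```python
def registry_uri(name: str, version: int) -> str:
    """Build a Model Registry URI of the form models:/<name>/<version>."""
    return f"models:/{name}/{version}"


def load_native_prophet(name: str, version: int):
    """Load a logged model as a Prophet object rather than a pyfunc wrapper."""
    import mlflow.prophet  # imported lazily so the URI helper works without MLflow

    return mlflow.prophet.load_model(registry_uri(name, version))


# Example (needs a tracking server with a registered model; names are hypothetical):
# model_v1 = load_native_prophet("model1", 1)
# model_v1.params  # fitted parameters, usable as initial values for a new fit
```

The object loaded this way should still carry its fitted `params`, so it can be passed to whatever function reads the previous model's parameters.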
