[forecast]: allow for "batched" forecasting #244
Comments
Hey @gofford, thanks for the suggestion. Just to be clear, would you like to train model1 to predict 1 step ahead and use it to predict the first seven steps, then train model2 to predict 8 steps ahead and use it to predict the next seven steps, and so on?
Hey @jmoralez, pretty much! Although I'm not 100% sure that it's only a 1-step-ahead forecast. When I've built similar (tree-based) forecasters manually, I've trained and validated against the fit to the week as a whole, rather than a single point. I think we're talking about the same thing though.
Is there any update on this?
I had forgotten that I had requested this, but I would be keen to get an update too.
@gofford can you provide an example? I'm still not entirely sure how the models are trained and how the features are updated in each step.
@jmoralez I've set my models up so that the test set is the size of the desired batch, and models are trained with increasing subsets of input lags. Hopefully a diagram helps. Let's say we want a 4-week forecast at daily resolution, where each model covers 1 week of data.
At prediction time, each model is applied to the full dataset, yielding models that predict over successive sets of future points.
Combined, these give you a batched forecast, where each batch is a week of data at daily resolution.
There might be some formal concerns about instability and continuity at the batch boundaries, but in my practical experience it's not a major problem.
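The combining step can be sketched roughly as follows, for the 4-week daily example. This only shows how per-model outputs could be stitched together and where the batch boundaries fall; the placeholder predictions and the `last_observed` date are assumptions made up for illustration, not output from any real model.

```python
import numpy as np

n_batches = 4        # four models, one per week
steps_per_batch = 7  # daily resolution

# Placeholder per-model outputs (assumption): a real model would return
# its own 7 daily predictions here.
batch_preds = [np.full(steps_per_batch, float(k)) for k in range(n_batches)]

# Stitch the batches together in order to get a single 28-step forecast.
forecast = np.concatenate(batch_preds)

# Dates for the forecast, starting the day after a hypothetical last observation.
last_observed = np.datetime64("2024-01-07")
dates = last_observed + np.arange(1, forecast.size + 1)

# Positions where one model hands over to the next, which is where any
# discontinuity would show up.
boundaries = np.arange(steps_per_batch, forecast.size, steps_per_batch)
```

Comparing `forecast[boundaries - 1]` with `forecast[boundaries]` is one simple way to inspect the size of the jump at each hand-over.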
@jmoralez does this clear things up? Let me know if you need any more information.
Would it be something like this?

```python
import numpy as np
from mlforecast import MLForecast
from sklearn.base import clone
from sklearn.linear_model import LinearRegression
from utilsforecast.data import generate_series

max_horizon = 4  # total horizon
steps_per_model = 2  # number of steps each model predicts recursively
series = generate_series(2)
fcst = MLForecast(
    models=[],
    freq='D',
    lags=[1],
)
X, y = fcst.preprocess(series, max_horizon=max_horizon, return_X_y=True)
base_model = LinearRegression()

# train each model on its corresponding target
# first model is trained on the target at t, second at t+2
models = []
for i in range(0, max_horizon, steps_per_model):
    mask = ~np.isnan(y[:, i])
    models.append(clone(base_model).fit(X[mask], y[mask, i]))

# each model predicts steps_per_model recursively
# and we update the target with its predictions
preds = []
with fcst.ts._maybe_subset(None):
    fcst.ts._predict_setup()
    for model in models:
        with fcst.ts._backup():
            for _ in range(steps_per_model):
                new_x = fcst.ts._get_features_for_next_step()
                step_preds = model.predict(new_x)
                preds.append(step_preds)
                fcst.ts._update_y(step_preds)

# result
fcst.make_future_dataframe(h=max_horizon).assign(pred=np.vstack(preds).ravel(order='F'))
```
Description
mlforecast currently allows for recursive single-model forecasts, or direct multi-model forecasts where each model is trained to predict a particular data point in the horizon. A third option is a middle ground between recursive and direct: multiple models are trained, but each is responsible for predicting a "batch" of the forecast horizon recursively. This is useful for long forecasting horizons with high-resolution data.
Use case
Consider the case where I wish to predict the next 13 weeks of sales at a daily resolution. My options are either 1 recursive model, whose performance degrades with increasing horizon due to compounded error, or 91 individual forecasting models, each trained for a particular day ahead. Neither of these is ideal.
I can aggregate to a weekly level, but then I lose the daily resolution, which is important for tracking the impact of things like promotions and daily events.
An alternative model that I have used in the past would be to have 13 models, where each model predicts a sequence of 7 steps (1 week).
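As a rough illustration of the arithmetic behind the three options, here is how the model counts compare for this 13-week daily horizon. The helper names `models_needed` and `model_for_step` are hypothetical and not part of mlforecast.

```python
def models_needed(horizon: int, steps_per_model: int) -> int:
    """Number of models when each one covers `steps_per_model` consecutive steps."""
    return -(-horizon // steps_per_model)  # ceiling division


def model_for_step(step: int, steps_per_model: int) -> int:
    """Zero-based index of the model responsible for a zero-based horizon step."""
    return step // steps_per_model


horizon = 13 * 7  # 13 weeks at daily resolution

recursive = models_needed(horizon, horizon)  # one model predicts every step
direct = models_needed(horizon, 1)           # one model per day
batched = models_needed(horizon, 7)          # one model per week
```

With these numbers, `recursive` is 1, `direct` is 91 and `batched` is 13, and `model_for_step(step, 7)` tells you which weekly model owns a given day of the horizon.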