Kfold - TimeSeriesSplit? #222

jmrichardson · 2020-09-17T15:48:46Z

Hi, thank you for the great package. I have temporal data and would like to use TimeSeriesSplit cross-validation, or perhaps KFold without shuffling. Is this possible?

Menelau · 2020-09-18T17:05:22Z

Hello,

Yes, it is possible. In that case you can use TimeSeriesSplit from scikit-learn to create your training and test splits (and possibly a validation split too), then use these sets manually to fit the base models and the DS methods.
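A minimal sketch of the manual approach, using only scikit-learn. The synthetic data, the 70/30 chronological cut inside each training fold, and the bagged decision trees are assumptions made for illustration. The DS method itself is left out: the comment marks where it would be fit on the validation part, and the pool's own score stands in for it.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import TimeSeriesSplit
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, assumed to be already sorted by time (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

tscv = TimeSeriesSplit(n_splits=5)
fold_scores = []
for train_idx, test_idx in tscv.split(X):
    # Keep the chronological order inside the training fold as well:
    # the earlier part trains the pool, the later part is the validation set.
    cut = int(0.7 * len(train_idx))
    pool_idx, val_idx = train_idx[:cut], train_idx[cut:]

    pool = BaggingClassifier(
        DecisionTreeClassifier(max_depth=3), n_estimators=10, random_state=0
    )
    pool.fit(X[pool_idx], y[pool_idx])

    # A DS method would be fit here on (X[val_idx], y[val_idx]) with `pool`,
    # then evaluated on the test fold. The pool alone is scored as a stand-in.
    fold_scores.append(pool.score(X[test_idx], y[test_idx]))

print(fold_scores)
```

Each test fold lies strictly after its training fold, so no future observations leak into training.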

Another alternative is to pass the DS method to the cross_val_score function from scikit-learn, which automatically computes the result over multiple folds. That approach has a limitation, however: it requires the pool of classifiers to be generated inside the DS method, rather than reusing a pool you have already trained. This comes from the scikit-learn cloning process, which cannot clone already-trained models (see issue #89). They already plan to solve this in a future update.
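A rough sketch of the cross_val_score route with a time-ordered splitter. The BaggingClassifier here is only a stand-in: the assumption is that a DS method constructed without a pre-trained pool would be passed in its place. The data is synthetic.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# Synthetic data, assumed to be already sorted by time (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Stand-in estimator (assumption): an unfitted DS method would go here.
# cross_val_score clones the estimator and refits it on every fold.
model = BaggingClassifier(n_estimators=10, random_state=0)

# Passing a TimeSeriesSplit instance as `cv` keeps each test fold after
# the data used for training.
scores = cross_val_score(model, X, y, cv=TimeSeriesSplit(n_splits=5))
print(scores.mean())
```

Passing KFold(n_splits=5) as cv would give unshuffled contiguous folds instead, since shuffle defaults to False, but earlier folds would then be tested with models trained on later data.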

Menelau closed this as completed Nov 12, 2020
