Major update
AnotherSamWilson released this 28 Sep 20:31
This release improved a number of areas:
- Large performance improvements, especially when categorical variables are being imputed. The gains come from skipping prediction on candidate data when it isn't needed, using a much faster nearest-neighbors search, using numpy instead of pandas for internal indexing, and other optimizations.
- Ability to tune model parameters and use the best parameters for mice.
- Improvements to code layout: ImputationSchema has been removed.
- Raw data is now stored as a numpy array to save space and improve indexing.
- Numpy arrays can be imputed, if you want to avoid pandas.
- A choice of multiple built-in mean matching functions.
- Mean matching functions can handle most lightgbm objectives.
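
For readers unfamiliar with mean matching, here is a minimal sketch of the general idea in plain Python. It is not the package's implementation: the function name `mean_match`, its parameters, and the brute-force neighbor search are all assumptions made for illustration. Each prediction for a missing row is compared with the model's predictions for rows whose true values are known (the candidates), and one of the `k` closest candidates donates its observed value.

```python
import heapq
import random


def mean_match(candidate_preds, candidate_values, missing_preds, k=3, seed=0):
    """Illustrative predictive mean matching (hypothetical helper).

    candidate_preds:  model predictions for rows with observed values
    candidate_values: the observed values for those same rows
    missing_preds:    model predictions for rows with missing values
    """
    if len(candidate_preds) != len(candidate_values):
        raise ValueError("candidate_preds and candidate_values must align")
    if not 1 <= k <= len(candidate_preds):
        raise ValueError("k must be between 1 and the number of candidates")

    rng = random.Random(seed)
    indices = range(len(candidate_preds))
    imputed = []
    for pred in missing_preds:
        # Brute-force search for the k candidates with the closest prediction.
        nearest = heapq.nsmallest(
            k, indices, key=lambda i: abs(candidate_preds[i] - pred)
        )
        # Donate the observed value of one randomly chosen neighbor.
        imputed.append(candidate_values[rng.choice(nearest)])
    return imputed


if __name__ == "__main__":
    preds = [1.0, 2.0, 3.0, 4.0]
    values = [10, 20, 30, 40]
    print(mean_match(preds, values, [2.1, 3.9], k=2))
```

Because imputed values are always drawn from observed ones, the result stays within the range of the real data. With `k=1` the sketch reduces to a deterministic nearest-prediction lookup.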