Error in documentation for SpotTheDiff detector on wine quality dataset? #780
Hi @vinyasHarish95, thanks for pointing out this potential source of confusion. The sentence "it is important that the learned detectors are trained on training data which is held-out from the reference data set" is intended to give intuition for how the learned detectors work, rather than to instruct you to split the data before passing it to these detectors. For the learned detectors, the splitting is inherent to the drift detection procedure and is therefore implemented automatically inside the detectors. By contrast, data splitting is only relevant to the non-learned detectors in the special case where a preprocessing function is specified and that preprocessing function has been fit/trained using the same source of reference data. In this special case the practitioner should handle the data splitting themselves. Hope that clears things up. We'll consider whether we can make this clearer in the docs.
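To make the special case concrete, here is a minimal sketch of handling the split manually. It uses only the Python standard library and a simple standardizer as a stand-in for a fitted preprocessing step such as the `PCA` in the docs example. The helper names `split_train_ref` and `fit_standardizer`, the 50/50 split, and the toy data are all assumptions for illustration, not part of the library.

```python
import random
import statistics


def split_train_ref(rows, train_frac=0.5, seed=0):
    """Shuffle rows and split them into disjoint training and reference sets."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    n_train = int(len(rows) * train_frac)
    train = [rows[i] for i in idx[:n_train]]
    ref = [rows[i] for i in idx[n_train:]]
    return train, ref


def fit_standardizer(train):
    """Fit per-feature mean and std on the training split only.

    Hypothetical stand-in for any preprocessing step that is fit on data.
    """
    cols = list(zip(*train))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against zero std

    def preprocess(rows):
        return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

    return preprocess


# Toy two-feature data standing in for the real dataset (assumption).
rng = random.Random(42)
data = [[rng.gauss(5.0, 2.0), rng.gauss(-1.0, 0.5)] for _ in range(200)]

# Fit the preprocessing on X_train only; X_ref is held out from that fit.
X_train, X_ref = split_train_ref(data, train_frac=0.5, seed=0)
preprocess_fn = fit_standardizer(X_train)

# A non-learned detector would be instantiated on X_ref together with
# preprocess_fn; the preprocessing is applied here only to show the shape.
X_ref_pre = preprocess_fn(X_ref)
```

The point of the split is that the reference data never influences the fitted preprocessing. For the learned detectors no such manual step is needed, since they split the data internally.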
Hi Seldon team, thanks for your great work on this package! I'm using it in my PhD research to understand the impact of different dataset shifts during COVID-19 on a precision public health model.
I was taking a look at the SpotTheDiff detector and the background docs say that "[like pre-processing steps] learned detectors are trained on training data which is held-out from the reference data set".
In the example on the same page, the `PCA` is trained on `X_train` and the `MMDDrift` detector is instantiated on `X_ref`. However, in the wine quality example, the detector is instantiated on `X_ref`? So I'm confused if there should be part of the whites dataset (an `X_train`) that should've been set aside to train the detector?
Thank you for clarifying.