Perform checks for anomalous values #248

Draft: wants to merge 17 commits into base: main

rich-iannone (Member) commented Dec 19, 2020

This PR introduces the col_anomaly_check() function. Its goal is to catch anomalous values and report the number of anomalies as failing test units. It is also meant to work with all common database backends: the model is fit on the user's system, and the model predictions are then applied to the remote table via joins.
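
To make the "model locally, apply via joins" idea concrete, here is a minimal sketch of one way it could work. It is not the code in this PR: the names `tbl`, `lookup`, and `n_failed`, the per-day window, and the 4-SD band are all assumptions chosen for illustration, and a local data frame with base `merge()` stands in for a remote table and a SQL join.

```r
library(mgcv)

set.seed(23)

# Hypothetical table: one reading per hour for 30 days, with three injected spikes
n <- 24 * 30
tbl <- data.frame(hour = seq_len(n), day = rep(seq_len(30), each = 24))
tbl$value <- 10 + 3 * sin(2 * pi * tbl$hour / n) + rnorm(n, sd = 0.5)
spike_hours <- c(100, 350, 600)
tbl$value[spike_hours] <- tbl$value[spike_hours] + 8

# 1. Fit the model on the user's system (in practice, on collected or sampled data)
fit <- gam(value ~ s(hour), data = tbl)
resid_sd <- sd(residuals(fit))

# 2. Reduce the model to a small lookup table: one prediction band per time window
lookup <- data.frame(day = seq_len(30))
lookup$hour <- (lookup$day - 1) * 24 + 12.5   # midpoint of each day
lookup$fit <- as.numeric(predict(fit, newdata = lookup))
lookup$lower <- lookup$fit - 4 * resid_sd     # assumed band width
lookup$upper <- lookup$fit + 4 * resid_sd
lookup$hour <- NULL                           # keep only the join key and the band

# 3. Join the band onto the table by window key; on a database backend this
#    step would be a join against the uploaded lookup table
joined <- merge(tbl, lookup, by = "day")
joined$anomaly <- joined$value < joined$lower | joined$value > joined$upper

# 4. Each row is a test unit; rows outside the band count as failing units
n_failed <- sum(joined$anomaly)
```

Only the small `lookup` table needs to travel to the database in this scheme, so the full remote table never has to be collected just to apply the predictions.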

To be done:

  • provide rows with anomalies as a downloadable CSV in the agent report
  • generate ggplot artifacts that show how anomalies are detected (requires a further feature for providing supplemental materials)
  • add more time granularities for the time-window size beyond those currently offered
  • handle gaps in time-series data and windows that contain few data points
  • allow x data that is not date-time data
  • allow univariate anomaly detection
  • allow the user to choose a formula for mgcv::gam()

Fixes: #246

Development

Successfully merging this pull request may close these issues.

Anomaly detection in table values