Automate data updates in production #521

ivan-aksamentov · 2020-04-12T17:00:39Z

🙋 Feature Request

We want to update the data daily, but manual updates are time-consuming and error-prone.

🔦 Context

😯 Describe the feature

We need to find a way to automate the data updates in production, while also performing some basic sanity checks. It is desirable for a human to review the update before it goes live.

This flow should be untied from the general release cycle.

Data should be updated consistently across all long-living branches (master, release, production), so that everyone is on the same page.

💻 Examples

💁 Possible Solution

- Implement a single-step build for the data updates so that it can run in a CI environment.
- Trigger the data update build step daily using a GitHub Action (on the staging branch).
- The bot automatically creates a branch and opens a pull request against the staging branch, containing the new data.
- A maintainer reviews the PR, along with the results of the automatic checks and the deployed version of the application.
- The maintainer merges the PR, possibly adding more commits to it, or closes it.
- A PR is created against the master branch and merged automatically if possible.
- If not, the maintainer resolves the conflicts in the master PR and merges it.
- The maintainer releases the data changes by fast-forwarding the release branch to staging.
- Eventually, once we are confident in the reliability of the checks, the merge to staging can be automatic.
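The "single-step build" in the first step could look roughly like the sketch below: one function that fetches the data, runs every check, and reports an exit code that a CI job can act on. The names `run_update`, `fetch` and `Check` are hypothetical and not part of the existing codebase; how the data is actually fetched and stored is left open here.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# A check takes the fetched data and returns a list of human-readable problems.
# (Hypothetical type alias, introduced only for this sketch.)
Check = Callable[[Dict], List[str]]


def run_update(fetch: Callable[[], Dict], checks: Iterable[Check]) -> Tuple[int, List[str]]:
    """Fetch new data, run all checks, and return (exit_code, problems).

    exit_code is 0 when no check reported a problem and 1 otherwise,
    so a CI step can fail the build by passing it to sys.exit().
    """
    data = fetch()
    problems: List[str] = []
    for check in checks:
        problems.extend(check(data))
    return (1 if problems else 0), problems


def require_non_empty(data: Dict) -> List[str]:
    """Example check (an assumption): the update must contain at least one entry."""
    return [] if data else ["update contains no data"]
```

A CI step could call `run_update(...)`, print the returned problems into the job log or the PR description, and exit with the returned code so that the maintainer sees failed checks before reviewing.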

If a new country is added:
- Is there an appropriate header?
- Is it placed in the correct folder?
- Is it in the right format?
- Does the data roughly make sense (monotonically rising for the respective features, continuous, etc.)?

If only new data is added:
- Is the new data a 'continuation' of the old data?
  - Are cases, deaths, recovered >= previous values? (This might flag number revisions; how should those be handled?)
  - Are there huge jumps in the 'current'-type values (hospitalized, ICU)?

If old data is changed:
- Can we see why the old data was changed?
  - Was a new column added?
  - Was old data adjusted by a few values due to a correction at the source?
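
A few of the checks above can be sketched in Python as follows. This is only an illustration under assumptions: rows are taken to be dictionaries in date order, the column names (`cases`, `deaths`, `recovered`, `hospitalized`, `icu`) are guesses at the keys, and the jump thresholds (`max_ratio`, `min_abs`) are arbitrary placeholders that would need tuning against real data.

```python
from typing import Dict, List, Sequence

Row = Dict[str, float]

# Assumed column names; the real data files may use different ones.
CUMULATIVE = ("cases", "deaths", "recovered")
CURRENT = ("hospitalized", "icu")


def check_monotonic(rows: Sequence[Row], columns=CUMULATIVE) -> List[str]:
    """Flag cumulative values that are lower than the previous known value."""
    problems = []
    for col in columns:
        prev = None
        for i, row in enumerate(rows):
            value = row.get(col)
            if value is None:
                continue  # missing values are skipped, not flagged
            if prev is not None and value < prev:
                problems.append(f"{col}: row {i} value {value} < previous {prev}")
            prev = value
    return problems


def check_jumps(rows: Sequence[Row], columns=CURRENT,
                max_ratio: float = 3.0, min_abs: float = 50) -> List[str]:
    """Flag large relative jumps (up or down) in 'current'-type values."""
    problems = []
    for col in columns:
        prev = None
        for i, row in enumerate(rows):
            value = row.get(col)
            if value is None:
                continue
            if prev is not None:
                low, high = min(prev, value), max(prev, value)
                # Require both a large absolute and a large relative change.
                if high - low >= min_abs and high > max_ratio * max(low, 1):
                    problems.append(f"{col}: row {i} jumps from {prev} to {value}")
            prev = value
    return problems


def describe_changes(old: Sequence[Row], new: Sequence[Row]) -> Dict[str, list]:
    """Summarise how previously published rows differ in the new data."""
    old_cols = set().union(*(r.keys() for r in old))
    new_cols = set().union(*(r.keys() for r in new))
    changed = [
        i for i, (o, n) in enumerate(zip(old, new))
        if any(o.get(c) != n.get(c) for c in old_cols)
    ]
    return {"new_columns": sorted(new_cols - old_cols), "changed_rows": changed}
```

Note that `check_monotonic` would also flag legitimate downward revisions at the source, which is the open question raised in the list; one option is to report such rows as warnings for the reviewer rather than hard failures.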

ivan-aksamentov added this to the 1.2 milestone Apr 17, 2020

ivan-aksamentov mentioned this issue Apr 17, 2020

🚀 1.2 #581

Closed

36 tasks

noleti referenced this issue Apr 19, 2020

new france/France.tsv

1041b9b

ivan-aksamentov removed this from the 1.2 milestone May 1, 2020

noleti commented Apr 13, 2020 (edited)