Add standardised country code #43

Open
MatMoore opened this issue Apr 5, 2020 · 2 comments
Labels: enhancement (New feature or request), JH CSSE (Data from John Hopkins Centre for Systems Science and Engineering)

Comments

@MatMoore
Collaborator

MatMoore commented Apr 5, 2020

Forking this discussion from #42

Problem 1: Country identifiers are inconsistent

Currently, the country_region field is not a very reliable identifier.

One example is the UK, which has been reported in at least two different ways:

Currently, we ignore the change in column name but not the change in column value.

Problem 2: There is no way to consistently refer to the scope of the data

Originally most data was country level, except for China, which was reported in different regions.

The data is getting more detailed as time goes on, so the same API call can return different information depending on the date range you are querying.

For the UK, there are now additional rows with country_region=United Kingdom, but province_state set. To get just the UK, you need to filter out the extra rows that cover British overseas territories and crown dependencies.
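As a rough illustration of the filtering this currently forces on API users, here is a minimal client-side sketch. The row layout and the helper name `country_level_rows` are assumptions made for illustration, not part of the project; the two sample rows borrow the UK figures quoted in the commit message further down this thread.

```python
# Hypothetical sample rows, shaped like the daily report records discussed here.
rows = [
    {"country_region": "United Kingdom", "province_state": "Channel Islands", "confirmed": 262},
    {"country_region": "United Kingdom", "province_state": None, "confirmed": 41903},
]


def country_level_rows(rows, country_region):
    """Keep only rows for the country itself, dropping any with province_state set."""
    return [
        row
        for row in rows
        if row["country_region"] == country_region and not row["province_state"]
    ]


uk_only = country_level_rows(rows, "United Kingdom")
```

This only works if country_region is spelled the same way in every report, which is exactly what Problem 1 says cannot be relied on.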

For the US, they have started reporting county-level information instead of country-level.

Proposal

  1. We use this lookup table to add ISO alpha2/3 country codes to each row, and make them filterable, so that users can filter by country more reliably. We should make this a new column for backwards compatibility.

  2. We add a new column called something like "scope", with values country_region, province_state and admin2. We can infer the scope of each record by looking at which columns are filled in. The code is sort of doing this already to check uniqueness, but the result is not stored. This way, API users should be able to write a query with scope and country parameters set and always get a consistent response back.
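The scope inference in point 2 could look something like the sketch below. The function name `infer_scope` and the exact scope strings are assumptions for illustration; the real values and the existing uniqueness check in the codebase may differ.

```python
from typing import Optional


def infer_scope(
    country_region: Optional[str],
    province_state: Optional[str],
    admin2: Optional[str],
) -> str:
    """Return the most specific level that is filled in for a record.

    Empty strings and None are both treated as "not filled in".
    """
    if admin2:
        return "admin2"
    if province_state:
        return "province_state"
    return "country_region"
```

For example, a row with country_region set and nothing else would be scoped as country_region, while a US county row with all three columns filled in would be scoped as admin2.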

@MatMoore MatMoore added the JH CSSE Data from John Hopkins Centre for Systems Science and Engineering label Apr 5, 2020
@MatMoore
Collaborator Author

MatMoore commented Apr 5, 2020

I could start working on this this week, as long as there's nothing more urgent that needs doing for the MVP?

@andreagrandi
Owner

> I could start working on this this week, as long as there's nothing more urgent that needs doing for the MVP?

Looking at #20 I think the only missing thing for the MVP is to improve the documentation and add a few usage examples. I will take care of this today, so feel free to start working on #43

As a next step, I would like to start working on the "Italian Protezione Civile" data source.

@andreagrandi andreagrandi added the enhancement New feature or request label Apr 5, 2020
@MatMoore MatMoore self-assigned this Apr 6, 2020
MatMoore added a commit that referenced this issue Apr 18, 2020
This makes better region information available, for example ISO country codes,
and allows the user to filter based on them.

This addresses #43

Example:

    http://127.0.0.1:8000/v1/jh/daily-reports/?country_code_iso2=GB

      {
        "id": 100477,
        "country_region": "United Kingdom",
        "province_state": "Channel Islands",
        "fips": null,
        "admin2": null,
        "last_update": "2020-04-04T23:34:00",
        "confirmed": 262,
        "deaths": 5,
        "recovered": 13,
        "region_info": {
          "uid": 8261,
          "scope": "Province_State",
          "country_code_iso2": "GB",
          "country_code_iso3": "GBR",
          "country_region": "United Kingdom",
          "province_state": "Channel Islands",
          "fips": null,
          "admin2": null
        }
      }

This example includes province/state level reports as well as country/region level. The scope parameter
can be used to filter these out:

    http://127.0.0.1:8000/v1/jh/daily-reports/?country_code_iso2=GB&scope=Country_Region

      {
        "id": 100709,
        "country_region": "United Kingdom",
        "province_state": null,
        "fips": null,
        "admin2": null,
        "last_update": "2020-04-04T23:34:00",
        "confirmed": 41903,
        "deaths": 4313,
        "recovered": 135,
        "region_info": {
          "uid": 826,
          "scope": "Country_Region",
          "country_code_iso2": "GB",
          "country_code_iso3": "GBR",
          "country_region": "United Kingdom",
          "province_state": null,
          "fips": null,
          "admin2": null
        }
      }

The country_region, province_state, fips and admin2 fields in region_info duplicate the top-level ones, but I think we should
keep both for backwards compatibility. Then if we create a V2 of the API we can drop the top-level ones.

The ones in region info are the cleaned versions - they ensure that the same region will always be presented
the same way. The top level ones come from the reports themselves and are inconsistent from day to day.
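
For anyone scripting against this, the two filters can be combined when building the request URL. The following is a minimal sketch using only the standard library; the helper name `daily_reports_url` is hypothetical, and the base URL is the local development address from the examples above.

```python
from typing import Optional
from urllib.parse import urlencode

# Local development address taken from the examples above.
BASE_URL = "http://127.0.0.1:8000/v1/jh/daily-reports/"


def daily_reports_url(country_code_iso2: str, scope: Optional[str] = None) -> str:
    """Build a daily-reports URL filtered by ISO alpha-2 code and, optionally, scope."""
    params = {"country_code_iso2": country_code_iso2}
    if scope is not None:
        params["scope"] = scope
    return BASE_URL + "?" + urlencode(params)


uk_country_level = daily_reports_url("GB", scope="Country_Region")
```

Passing the scope explicitly is what keeps the response consistent as the upstream reports add more province-level rows.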