milanowicz/COVID-19-Dataset

COVID-19 Dataset

This COVID-19 dataset is intended for data science. The RKI and JHU data share the same column names, so both can be loaded with pandas in the same way.

Case numbers for Germany from the Robert Koch-Institut (RKI)

Description of columns:

    State: Name of the federal state (German Bundesland)
    Date: Date in %Y-%m-%d format
    Confirmed: Number of confirmed cases
    Deaths: Number of deaths
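
As a minimal sketch of loading data in this column layout with pandas, the snippet below reads a CSV and parses the Date column. The sample rows are made up for illustration (they are not real RKI figures), `load_cases` is a hypothetical helper, and a comma-separated file with a header row is an assumption.

```python
import io

import pandas as pd

# Made-up sample rows in the documented column layout (not real RKI figures).
SAMPLE_CSV = """State,Date,Confirmed,Deaths
Bayern,2020-03-01,10,0
Bayern,2020-03-02,15,1
Berlin,2020-03-01,3,0
Berlin,2020-03-02,7,0
"""


def load_cases(source):
    """Read a case-number CSV and parse Date from its %Y-%m-%d format."""
    frame = pd.read_csv(source)
    frame["Date"] = pd.to_datetime(frame["Date"], format="%Y-%m-%d")
    return frame


# Assumption: the real file is a comma-separated CSV with a header row.
rki = load_cases(io.StringIO(SAMPLE_CSV))

# Most recent row for each federal state.
latest = rki.sort_values("Date").groupby("State").tail(1)
print(latest)
```

`load_cases` also accepts a file path in place of the in-memory buffer, because `pd.read_csv` handles both.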

COVID-19-RKI

Worldwide case numbers from Johns Hopkins University (JHU)

COVID-19-JHU

Data by day

Description of columns:

Data: data/jhu/time_series_covid19_confirmed_deaths_recovered.csv
    City: Name of the city
    State: Name of the federal state
    Country: Name of the country
    Date: Date in %Y-%m-%d format
    Latitude: Latitude
    Longitude: Longitude
    Confirmed: Number of confirmed cases
    Deaths: Number of deaths
    Recovered: Number of recovered cases
    Active: Confirmed - Deaths - Recovered
    WHO Region: WHO Region
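
The relation Active = Confirmed - Deaths - Recovered can be used as a consistency check after loading. The sketch below uses a made-up frame with only the relevant columns; `active_mismatches` is a hypothetical helper, not part of this repository.

```python
import pandas as pd


def active_mismatches(frame):
    """Return rows where Active differs from Confirmed - Deaths - Recovered."""
    expected = frame["Confirmed"] - frame["Deaths"] - frame["Recovered"]
    return frame[frame["Active"] != expected]


# Made-up rows for illustration; the second row is deliberately inconsistent.
sample = pd.DataFrame(
    {
        "Country": ["X", "Y"],
        "Confirmed": [100, 50],
        "Deaths": [5, 2],
        "Recovered": [60, 20],
        "Active": [35, 30],  # row Y should be 28
    }
)

bad_rows = active_mismatches(sample)
print(bad_rows)
```

An empty result means every row satisfies the documented relation.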

Grouped by Day and Country

Data: data/jhu/time_series_covid19_grouped_day_country.csv
    Date: Date in %Y-%m-%d format
    Country: Name of the country
    Confirmed: Number of confirmed cases
    Deaths: Number of deaths
    Recovered: Number of recovered cases
    Active: Confirmed - Deaths - Recovered
    New cases: New cases per day
    New deaths: New deaths per day
    New recovered: New recovered cases per day
    WHO Region: WHO Region
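
One plausible way to derive a per-day column from cumulative counts is a day-over-day difference within each country. Whether this repository computes New cases exactly this way is an assumption; `add_daily_new` is a hypothetical helper, and the rows are made up.

```python
import pandas as pd


def add_daily_new(frame, cumulative="Confirmed", new="New cases"):
    """Add a per-day column as the difference of a cumulative column per country."""
    ordered = frame.sort_values(["Country", "Date"]).copy()
    # The first day of each country has no predecessor; it is set to 0 here.
    ordered[new] = (
        ordered.groupby("Country")[cumulative].diff().fillna(0).astype(int)
    )
    return ordered


# Made-up cumulative counts, deliberately out of order to show the sorting.
sample = pd.DataFrame(
    {
        "Date": ["2020-01-23", "2020-01-22", "2020-01-24", "2020-01-22", "2020-01-23"],
        "Country": ["A", "A", "A", "B", "B"],
        "Confirmed": [4, 1, 9, 2, 2],
    }
)

daily = add_daily_new(sample)
print(daily)
```

Dates in %Y-%m-%d format sort correctly even as plain strings. Corrections in the source data can make a cumulative count drop, in which case the difference is negative.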

Grouped by all Countries together

Data: data/jhu/time_series_covid19_grouped_by_countries.csv

Columns:

    Country
    Confirmed
    Deaths
    Recovered
    Active
    New cases
    New deaths
    New recovered
    Deaths / 100 Cases
    Recovered / 100 Cases
    Deaths / 100 Recovered
    Confirmed last week
    1 week change
    1 week % increase
    WHO Region
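
The column names suggest how the derived values relate to the base counts. The sketch below spells out one reading: rates per 100 as `100 * a / b`, and the weekly change relative to Confirmed last week. These formulas are assumptions inferred from the names, not taken from the repository's code, and the row is made up.

```python
import pandas as pd


def add_derived_columns(frame):
    """Add ratio and weekly-change columns under the assumed definitions."""
    out = frame.copy()
    # A zero denominator yields inf or NaN in pandas; that is not handled here.
    out["Deaths / 100 Cases"] = 100 * out["Deaths"] / out["Confirmed"]
    out["Recovered / 100 Cases"] = 100 * out["Recovered"] / out["Confirmed"]
    out["Deaths / 100 Recovered"] = 100 * out["Deaths"] / out["Recovered"]
    out["1 week change"] = out["Confirmed"] - out["Confirmed last week"]
    out["1 week % increase"] = 100 * out["1 week change"] / out["Confirmed last week"]
    return out


# Made-up row for illustration.
sample = pd.DataFrame(
    {
        "Country": ["X"],
        "Confirmed": [200],
        "Deaths": [10],
        "Recovered": [100],
        "Confirmed last week": [160],
    }
)

derived = add_derived_columns(sample)
print(derived.iloc[0])
```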

Grouped by all Days together

Data: data/jhu/time_series_covid19_grouped_by_days.csv

Columns:

    Date
    Confirmed
    Deaths
    Recovered
    Active
    New cases
    New deaths
    New recovered
    Deaths / 100 Cases
    Recovered / 100 Cases
    Deaths / 100 Recovered
    Country Number

Common data description

Population CSV files

The dataset contains population data for different countries/regions from 1960 to 2018. It includes both condensed and region-wise data.

Origin: https://data.worldbank.org/indicator/SP.POP.TOTL

Kaggle Competition

Install Python environment

Create the environment and install the Python libraries on a GNU/Linux operating system. urllib and shutil are part of the Python standard library and do not need to be installed:

$ . env.sh
$ pip3 install pandas wget

Update Dataset

Update data only

$ . update.sh

Update, commit and push data

$ . aupdate.sh

or manually

$ python get_data.py