- The code should run with no issues using Python versions 3.*.
- No libraries beyond those bundled with the Anaconda distribution are needed to run this project:
- numpy
- pandas
- seaborn
- glob
- os
For this project, I was interested in using Stack Overflow survey data from 2011 to 2018 to analyse the growth of the Data Science community. The project analyses the proportion and trend of that growth across countries, industries and company sizes around the globe. Here, the Data Science community is defined as the following roles: 'Database Administrator', 'Business intelligence expert', 'Data warehousing expert', 'Machine learning specialist', 'Data Scientist' and 'Developer with a statistics or mathematics background'. Below are the questions I wanted insights into:
- What is the trend in Data Science community growth from 2011 to 2018?
- In which countries has the Data Science community grown?
- What is the trend in Data Science community growth in various countries over the years?
- In which industries has the Data Science community grown, and in what proportion?
- What is the trend in Data Science community growth in various industries over the years?
- In which companies (small, medium and large) has the Data Science community grown, and in what proportion?
- What is the trend in Data Science community growth in companies of different sizes over the years?
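
The first question above comes down to a per-year share of respondents whose occupation falls in the role list from the motivation paragraph. The sketch below is only an illustration of that calculation, not the notebook's actual code. The helper names `is_data_science` and `yearly_share` are hypothetical, and it assumes that an answer may hold several roles separated by `;`, which may not be true for every survey year.

```python
from collections import Counter

# Roles treated as the Data Science community (taken from the list above),
# lower-cased so that matching ignores capitalisation differences.
DATA_SCIENCE_ROLES = {
    "database administrator",
    "business intelligence expert",
    "data warehousing expert",
    "machine learning specialist",
    "data scientist",
    "developer with a statistics or mathematics background",
}


def is_data_science(occupation):
    """Return True if any role in the answer is a Data Science role."""
    if not occupation:
        return False
    # Assumption: multiple roles in one answer are separated by ';'.
    parts = [part.strip().lower() for part in occupation.split(";")]
    return any(part in DATA_SCIENCE_ROLES for part in parts)


def yearly_share(records):
    """Map each year to the fraction of respondents in a Data Science role.

    `records` is an iterable of (year, occupation) pairs.
    """
    totals = Counter()
    hits = Counter()
    for year, occupation in records:
        totals[year] += 1
        if is_data_science(occupation):
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}
```

The same grouping can be repeated with a country, industry or company-size key added to the year to address the remaining questions.
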
- data: Folder containing the Stack Overflow developer survey data files, named "YYYY_survey_data.zip"; each archive contains a .csv file named "YYYY_survey_data.csv"
- schema: Folder containing descriptions of the survey data columns for the years 2017-2018, named "YYYY_survey_schema.csv"
- Analysis of Data Science community growth from 2011-2018.ipynb: Jupyter Notebook used for all the analysis and to showcase work related to the above questions. Markdown cells were used for explanation and to assist in walking through the thought process.
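
Given the naming convention described above, the survey files could be read roughly as follows. This is a minimal sketch using only the standard library, not the notebook's loading code. The function name `iter_survey_rows`, the default folder path and the UTF-8 decoding with replacement of undecodable bytes are all assumptions.

```python
import csv
import glob
import io
import os
import zipfile


def iter_survey_rows(data_dir="data"):
    """Yield (year, row_dict) for every row of every yearly survey file.

    Assumes each archive is named YYYY_survey_data.zip and holds a CSV
    called YYYY_survey_data.csv, as described in the file list.
    """
    pattern = os.path.join(data_dir, "*_survey_data.zip")
    for zip_path in sorted(glob.glob(pattern)):
        # The first four characters of the file name are the survey year.
        year = os.path.basename(zip_path)[:4]
        member = f"{year}_survey_data.csv"
        with zipfile.ZipFile(zip_path) as archive:
            with archive.open(member) as raw:
                # Assumption: UTF-8 text; bytes that fail to decode are replaced.
                text = io.TextIOWrapper(
                    raw, encoding="utf-8", errors="replace", newline=""
                )
                for row in csv.DictReader(text):
                    yield int(year), row
```

Because the column names differ between survey years, each yielded row keeps the header of its own file, so any cross-year analysis still needs a per-year mapping to common column names.
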
- Project Notebook: Analysis of Data Science community growth from 2011-2018
- Blog Post: Where do Data Science experts exists?
Credit goes to Stack Overflow for the data. The licensing for the data and other descriptive information are available here.