This dataset covers countries around the world from 2000 to 2015 and records mortality, economic, and social factors for each country in each year. This project focuses on determining which predictors actually affect life expectancy, using multivariate linear regression techniques together with statistical methods such as hypothesis testing and the identification of influential points or outliers that unduly affect the target variable, Life expectancy.
The methods used in this project include interpreting scatterplots and correlations with heatmap visualization, stepwise linear regression, full vs. reduced model comparison, hypothesis testing with p-value significance, normality tests, detecting and removing multicollinearity, and using established statistical diagnostics such as DFFITS and DFBETAS to remove influential points or outliers from the dataset.
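
To make the influential-point step more concrete, here is a minimal sketch of how DFFITS can be computed and thresholded. It is not the project's actual code. To stay dependency-free it assumes a single predictor (simple linear regression), whereas the project fits several predictors. The function names `dffits_simple` and `flag_influential`, the sample values, and the commonly used cutoff of 2 * sqrt(p / n) are illustrative assumptions.

```python
import math


def dffits_simple(x, y):
    """DFFITS for each observation of the fit y ~ intercept + slope * x."""
    n = len(x)
    p = 2  # number of fitted parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in residuals)

    values = []
    for xi, e in zip(x, residuals):
        # Leverage of this observation (diagonal of the hat matrix).
        h = 1 / n + (xi - x_mean) ** 2 / sxx
        # Error variance estimated with this observation left out.
        s2_loo = (sse - e * e / (1 - h)) / (n - p - 1)
        # Externally studentized residual.
        t = e / math.sqrt(s2_loo * (1 - h))
        values.append(t * math.sqrt(h / (1 - h)))
    return values


def flag_influential(x, y):
    """Indices whose |DFFITS| exceeds the rule-of-thumb cutoff 2*sqrt(p/n)."""
    n, p = len(x), 2
    cutoff = 2 * math.sqrt(p / n)
    return [i for i, d in enumerate(dffits_simple(x, y)) if abs(d) > cutoff]


# Hypothetical data (not from the dataset): a near-linear trend
# with one unusual final observation.
predictor = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
target = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 35.0]
print(flag_influential(predictor, target))
```

With several predictors the same quantities come from the full hat matrix, and a regression library's influence diagnostics would normally be used instead of hand-written formulas.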