GitHub - farhanarrafi/empirical-analysis-cardiovascular-data: This is a class project that analyzes cardiovascular data using empirical statistics.

CardioVascular Data Analysis - An Empirical Study Using Statistical Methods

Introduction

This project analyzes patient data in relation to different stages of hypertension. Specifically, we examine how a subject's weight, systolic blood pressure, and diastolic blood pressure relate to the subject's hypertension stage. Only Hypertension Stage 1 and Hypertension Stage 2 are used as target categories.

Important: To run the notebook, you need an API Token from Kaggle. Follow the instructions below:

Instructions

Create a new API Token.
Download the API Token and open the (downloaded) JSON file as a text file.
Copy the key from the JSON file and replace the following line:

{"username":"farhanarrafi","key":"get_a_key_from_kaggle_to_run_the_notebook"}

Run the notebook.
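
As a convenience, the replacement line can be generated from the downloaded file rather than copied by hand. The following is a minimal sketch, assuming the downloaded token file contains `username` and `key` fields as in the line shown above; the function name `token_line` is hypothetical and not part of the notebook.

```python
import json
from pathlib import Path


def token_line(downloaded_json: Path) -> str:
    """Return the credentials as a compact one-line JSON string.

    Assumes the downloaded file has "username" and "key" fields.
    """
    creds = json.loads(downloaded_json.read_text())
    missing = {"username", "key"} - creds.keys()
    if missing:
        raise ValueError(f"token file is missing fields: {sorted(missing)}")
    # Compact separators reproduce the no-space format of the line to replace.
    return json.dumps(
        {"username": creds["username"], "key": creds["key"]},
        separators=(",", ":"),
    )
```

Calling `token_line(Path("kaggle.json"))` on the downloaded file should produce a line in the same format as the one above, ready to paste into the notebook.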

Variables Used in this Analysis

Weight - The weight of the subject in kilograms.
Systolic Pressure - The maximum blood pressure during contraction of the ventricles.
Diastolic Pressure - The minimum blood pressure recorded just before the next contraction.
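
For orientation, the sketch below shows how the two pressure readings could map to a hypertension stage. It assumes the 2017 ACC/AHA blood pressure thresholds (in mmHg); the dataset's own labeling may follow different cut-offs, and the function name `bp_category` is hypothetical.

```python
def bp_category(systolic: float, diastolic: float) -> str:
    """Classify a reading using the 2017 ACC/AHA thresholds (an assumption).

    The higher category wins when systolic and diastolic disagree.
    """
    if systolic >= 140 or diastolic >= 90:
        return "Hypertension Stage 2"
    if systolic >= 130 or diastolic >= 80:
        return "Hypertension Stage 1"
    if systolic >= 120:
        return "Elevated"
    return "Normal"
```

Under these assumed thresholds, a reading of 135/85 falls in Stage 1 and a reading of 150/85 falls in Stage 2.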

Exploratory Data Analysis

Analysis

Regression line of Weight

Regression using 2 variables Systolic and Diastolic Blood Pressure

Regression using 3 variables Weight, Systolic, and Diastolic Blood Pressure

Results

Using only systolic BP and diastolic BP provides better predictions than using all three variables (weight, systolic BP, and diastolic BP).
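
When comparing models with different numbers of predictors, note that the in-sample $R^2$ of a least-squares fit never decreases when a variable is added, so a penalized measure is often used instead. The sketch below computes adjusted $R^2$ as one such measure; whether the notebook uses this or another criterion is not stated here, so treat it as an illustration only.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof
```

For the same raw $R^2$, the model with three predictors gets a slightly lower adjusted value than the model with two, which is how a smaller model can come out ahead.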

For more details, see the final presentation (Final_Presentation.pdf).

Dataset Source

For this project, we collected the data from the Kaggle dataset “Cardiovascular Disease by Aidan”. According to the dataset description, it consolidates data from two sources:

UCI Machine Learning Repository - Heart Disease Dataset
Kaggle - Heart Disease Dataset by YasserH

The original dataset contains about 68,000 rows. To keep our analysis simple, we kept 2,000 randomly selected rows and discarded the rest.
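
The random selection can be made reproducible by fixing a seed. The following is a minimal standard-library sketch of sampling without replacement; the function name `sample_rows` and the seed value are assumptions, and in a pandas workflow `DataFrame.sample(n=2000, random_state=...)` would play the same role.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_rows(rows: Sequence[T], k: int, seed: int = 0) -> List[T]:
    """Return k rows drawn without replacement, reproducible for a given seed."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    return rng.sample(rows, k)
```

Running it twice with the same seed returns the same subset, which keeps later statistics comparable across runs.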

Statistical Methods used:

Hypothesis Testing
Proportion, Mean, Standard Deviation, Variance Analysis
Correlation between variables
Univariate and Multivariate Regression
Coefficient of Determination ($R^2$) Analysis
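
To make the last two methods concrete, the sketch below fits a univariate least-squares line and computes its $R^2$ using only the standard library. The function names are hypothetical and the notebook may rely on a statistics library instead; this is an illustration of the definitions, not the project's code.

```python
from statistics import fmean
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x: Sequence[float], y: Sequence[float],
              slope: float, intercept: float) -> float:
    """R^2 = 1 - SS_res / SS_tot for the fitted line."""
    my = fmean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot
```

An $R^2$ close to 1 means the line explains most of the variance in the response, while a value near 0 means it explains very little.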

Contribution

Three other contributors also worked on this project, each using different combinations of variables.

Sushant Thapa - You can check their work on other variables here.
Harika Prathipati - You can check their work on other variables here.
Lokesh Mylavarpu

References

Patricia S. Abril and Robert Plant, 2007. The patent holder's dilemma: Buy, sell, or troll? Commun. ACM 50, 1 (Jan, 2007), 36-44. DOI: https://doi.org/10.1145/1188913.1188915.
Clinical Methods: The History, Physical, and Laboratory Examinations. 3rd edition, Walker HK, Hall WD, Hurst JW, editors. Boston: Butterworths; 1990, Chapter 16.

