This portfolio is a collection of notebooks which I created for data analysis and machine learning projects.
Implementing automated reporting infrastructure has enormous potential and benefits. It can be applied to all parts of a business, from marketing and sales to fulfillment and customer services. Automated reports will significantly decrease the timeline, combine data from different resources, and create reports within minutes. Some advantages associated with the automated reporting approach:
- Decreased expenses
- Reduced time
- Greater control and consistency
- Decreased risk associated with human/manual error
- Accurate and reliable results
Sales interactive dashboard help us to drill-down the sales data, quickly identify trends, monitor the changes, and make timely strategic decisions. App is available on this link
Interactive dashboards empower users to gain valuable insight into key metrics and make data-driven decisions. Interactivity helps optimize the use of dashboard space and updates visualizations automatically as the user changes inputs. Flexdashboard is an R markdown file, which can be either static or dynamic. By combining flexdashboard with Shiny, you can write dynamic web applications without any knowledge of HTML, CSS, or JavaScript, using only R and R markdown.
Detailed post and brief tutorial is avaialble on my Medium
App is available on the following link (under development)
- Choose Independent Variables
- Train the Model
- Visualize Predictions (Actual numbers vs Predicted numbers)
- Visualize Residuals
- Download Predictions
For this project I use Medicare Provider Utilization and Payment Data for more than 3,000 U.S. hospitals that receive Medicare paments. I explore and visualize data to make comparisons between the individual hospital-level charges and payments within local markets, and nationwide. Questions I want to answer
- Which Diagnostic Related Groups cost Medicare the most?
- What are the most commong hospital discharges?
- What is the trend in last 3 years?
- Which States and Hospitals charge the most? etc.
For more details code can be found here GitHub Flavored Markdown.
The data was scraped from the wikipedia. The World Happiness Report is an annual publication of the United Nations Sustainable Development Solutions Network. Happiness is explained by 6 factors. As per the 2019 Happiness Index, Finland is the happiest country in the world. The interactive version of map (Plotly) can be found here.
For more details code can be found here GitHub Flavored Markdown.
The main purpose of this project is to build supervised machine learning models with h2o and experiment with hyperparameters. (Python) For more details code can be found here GitHub Flavored Markdown.
Interactive data visualization enhances exploratory data analysis and is a great way to engage with both technical and non-technical audiences. R's leaflet package is a powerful tool to create visually compelling interactive maps. In this post I will show how to create a choropleth map with leaflet. Choropleth maps show the level of variability within a region, using color.
I will build a choropleth map using data for the 2019 Novel Coronavirus published by Johns Hopkins University Center for Systems Science and Engineering. To make it easy to follow through the steps, we will work with state-level data.
Some models are easy to interpret such as linear / logistic regression (weight on each feature, knowing the exact contribution and negative and positive interaction), single decision trees, some models are harder to interpret such as Ensemble models - it is hard to understand the role of each feature, it comes with "feature importance" but does not tell if feature affects decision positively or negatively