This repository contains an exploratory data analysis (EDA) project focused on roller coasters. The project involved organizing, cleaning, and visualizing the data to gain insights into roller coasters' characteristics and performance.
The data comes from a publicly available dataset that records each roller coaster's name, location, manufacturer, height, speed, and duration. The analysis uses the Python libraries pandas, NumPy, Matplotlib (pyplot), and seaborn to organize, clean, and visualize the data.
The project started with data cleaning: handling missing values, removing duplicates, and correcting inconsistencies in the data. Once the data was cleaned, it was visualized with several plot types, including scatter plots, bar plots, and histograms.
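The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`name`, `manufacturer`, `speed_mph`, `height_ft`), the sample rows, and the helper `clean_coasters` are all hypothetical, and the real dataset may need different rules.

```python
import pandas as pd

# Made-up sample rows; names, values, and column names are assumptions,
# not taken from the real roller coaster dataset.
raw = pd.DataFrame({
    "name": ["Coaster A", "Coaster A", "Coaster B", "Coaster C", "Coaster D"],
    "manufacturer": ["Acme Rides", "acme rides ", "Zenith Coasters", None, None],
    "speed_mph": [100.0, 100.0, 80.0, 50.0, 60.0],
    "height_ft": [300.0, 300.0, 250.0, None, 120.0],
})


def clean_coasters(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning helper: fix inconsistencies, drop duplicates, handle gaps."""
    cleaned = df.copy()
    # Correct inconsistencies: trim stray whitespace and unify capitalization.
    cleaned["manufacturer"] = cleaned["manufacturer"].str.strip().str.title()
    # Remove rows that are exact duplicates once the text is normalized.
    cleaned = cleaned.drop_duplicates()
    # Handle missing values: drop rows without a height, label unknown manufacturers.
    cleaned = cleaned.dropna(subset=["height_ft"])
    cleaned["manufacturer"] = cleaned["manufacturer"].fillna("Unknown")
    return cleaned.reset_index(drop=True)


cleaned = clean_coasters(raw)
print(cleaned)
```

Normalizing the text before calling `drop_duplicates` matters here: otherwise the two spellings of the same manufacturer would keep both copies of the row.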
The project also included asking and answering questions about the data, such as which roller coaster has the highest drop height, which manufacturer produces the fastest roller coasters, and which countries have the most roller coasters.
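Questions like these map onto a few pandas operations. The sketch below uses made-up rows and assumed column names (`drop_ft`, `speed_mph`, `country`), and it reads "fastest manufacturer" as the highest mean speed, which is only one possible interpretation; the project itself may define it differently.

```python
import pandas as pd

# Made-up sample rows; column names and values are assumptions for illustration.
coasters = pd.DataFrame({
    "name": ["Coaster A", "Coaster B", "Coaster C", "Coaster D", "Coaster E"],
    "manufacturer": ["Acme Rides", "Acme Rides", "Zenith Coasters",
                     "Zenith Coasters", "Zenith Coasters"],
    "country": ["USA", "Japan", "USA", "Germany", "USA"],
    "speed_mph": [100.0, 80.0, 95.0, 70.0, 60.0],
    "drop_ft": [300.0, 250.0, 310.0, 150.0, 120.0],
})

# Which roller coaster has the highest drop?
tallest_drop = coasters.loc[coasters["drop_ft"].idxmax(), "name"]

# Which manufacturer produces the fastest coasters (by mean speed, an assumption)?
fastest_by_maker = (
    coasters.groupby("manufacturer")["speed_mph"]
    .mean()
    .sort_values(ascending=False)
)

# Which countries have the most roller coasters?
coasters_per_country = coasters["country"].value_counts()

print(tallest_drop)
print(fastest_by_maker)
print(coasters_per_country)
```

Swapping `.mean()` for `.max()` would instead rank manufacturers by their single fastest coaster, which can give a different answer on real data.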
Overall, this project is a worked example for anyone learning EDA and data visualization. The roller coaster dataset is an engaging subject for practicing data analysis techniques, and the Python libraries used here are widely used in data science, so the skills carry over to other analysis work.