This is a regression problem: predicting bulldozer sale prices from the given features using Scikit-learn. The data is downloaded from a Kaggle dataset. The RMSLE (Root Mean Squared Log Error) achieved on the training dataset is 0.12935673426211658.

jinalgoyani/Bulldozer_Regression_Project

Bulldozer_Regression_Project

Predicting the sales price of bulldozers using Machine Learning.

1. Problem Definition

How well can we predict the future sale price of a bulldozer, given its characteristics and records of previous sales? (Regression problem.)

2. Data

The data comes from Kaggle's Blue Book for Bulldozers competition: https://www.kaggle.com/c/bluebook-for-bulldozers/data

There are three main datasets:

Train.csv is the training set, which contains data through the end of 2011.

Valid.csv is the validation set, which contains data from January 1, 2012 - April 30, 2012. You make predictions on this set throughout the majority of the competition. Your score on this set is used to create the public leaderboard.

Test.csv is the test set, which won't be released until the last week of the competition. It contains data from May 1, 2012 - November 2012. Your score on the test set determines your final rank for the competition.
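Because the three files are separated by sale date, any further split made for local experiments should probably respect time order as well. The following is a minimal sketch of a date-based split with pandas. The helper `split_by_date`, the column names `saledate` and `SalePrice`, and the tiny data frame are assumptions made for illustration and are not taken from this repository.

```python
import pandas as pd


def split_by_date(df, date_col, cutoff):
    """Return (rows dated before cutoff, rows dated on or after cutoff)."""
    dates = pd.to_datetime(df[date_col])
    before = dates < pd.Timestamp(cutoff)
    return df[before], df[~before]


# Made-up rows standing in for the real sales records (assumed column names).
sales = pd.DataFrame({
    "saledate": ["2011-11-15", "2011-12-31", "2012-01-01", "2012-04-30"],
    "SalePrice": [21000, 35000, 18000, 42000],
})

# Everything through the end of 2011 is used for training, the rest for validation.
train_part, valid_part = split_by_date(sales, "saledate", "2012-01-01")
```

Splitting on the date rather than at random keeps later sales out of the training portion, which mirrors how the competition files are laid out.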

3. Evaluation

The evaluation metric for this competition is the RMSLE (root mean squared log error) between the actual and predicted auction prices. For more on the evaluation, see: https://www.kaggle.com/c/bluebook-for-bulldozers/overview/evaluation

Note: Our goal for this project will be to build a machine learning model which minimises the RMSLE.
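As a rough illustration of the metric, the sketch below computes RMSLE with NumPy using the common log(1 + x) form. The function name `rmsle` and the example prices are made up for illustration and are not this project's code or results.

```python
import numpy as np


def rmsle(y_true, y_pred):
    """Root mean squared log error, using log(1 + x) on both inputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    log_diff = np.log1p(y_pred) - np.log1p(y_true)
    return float(np.sqrt(np.mean(log_diff ** 2)))


# Made-up auction prices, for illustration only.
actual = np.array([10000.0, 25000.0, 60000.0])
predicted = np.array([12000.0, 24000.0, 55000.0])
score = rmsle(actual, predicted)
```

Because the error is measured on logarithms, the metric penalises relative rather than absolute differences: being off by 10% costs about the same on a cheap machine as on an expensive one. Scikit-learn's `mean_squared_log_error` returns the squared version of this quantity, so taking its square root should give the same value.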

4. Features

Kaggle provides a data dictionary describing all the features of the dataset. You can view it here: https://www.kaggle.com/competitions/bluebook-for-bulldozers/data
