Kaggle Competition: 'Santander Customer Transaction Prediction'

My Final Submission for the 'Santander Customer Transaction Prediction':

In this repo, I assemble some of the work I did during an interesting (and very tough) Kaggle competition.

Here is the official link to the competition on Kaggle.

Participating in the challenge was a true learning experience for me. What made it a special competition was the number of talented and smart participants from all over the world. I would particularly like to mention the Kaggle masters and grandmasters who drove the challenge to higher levels and provided ideas and hints all along the way.

Here is a part of the description provided by the competition hosts:

"... In this challenge, we invite Kagglers to help us identify which customers will make a specific transaction in the future, irrespective of the amount of money transacted. The data provided for this competition has the same structure as the real data we have available to solve this problem."

1-Exploratory Data Analysis Notebook

In this notebook, I tried to go through the data and see whether I could spot a pattern or an interesting trend. It was one of the most interesting phases of this competition because:

The data was 'clean' : no heavy work was required to put the variables into shape.
The data was synthetic : the data set was not real-world production data; it was generated by an algorithm to simulate the behavior of customers and to be as close as possible to the actual 'Santander' customer data.
Almost every participant got stuck at a certain performance threshold : it was very hard to improve a model beyond that point.
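
The kind of first-pass checks behind the "clean data" observation can be sketched as follows. This is only an illustration, not the code from the notebook: the helper name quick_checks, the column names (target, var_0, ...) and the toy data frame are all assumptions, since the real competition data cannot be shared here.

```python
import numpy as np
import pandas as pd


def quick_checks(df: pd.DataFrame, target_col: str = "target") -> dict:
    """Return a few basic data-quality figures for a training frame.

    `target_col` is an assumed column name, used for illustration only.
    """
    features = df.drop(columns=[target_col])
    return {
        "n_rows": len(df),
        # Total number of missing cells across the whole frame.
        "n_missing": int(df.isna().sum().sum()),
        # Feature columns that would need encoding before modelling.
        "n_non_numeric": features.select_dtypes(exclude="number").shape[1],
        # Share of positive labels, to see how imbalanced the target is.
        "positive_rate": float(df[target_col].mean()),
    }


# Toy stand-in for the real training file (hypothetical data).
rng = np.random.default_rng(0)
toy = pd.DataFrame(
    rng.normal(size=(1000, 5)),
    columns=[f"var_{i}" for i in range(5)],
)
toy["target"] = (rng.random(1000) < 0.1).astype(int)

report = quick_checks(toy)
print(report)
```

On the real training file, zero missing cells and zero non-numeric feature columns would match the description of the data as 'clean'.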

2-LightGBM model with Data Augmentation

I experimented with various models and techniques, but the model that reached the highest performance was LightGBM. Stacking and blending were also a large part of the top 1% solutions. However, I tried to keep things simple, to work through the problem step by step, and to understand how the data behaves after passing through each model.

One of the 'magic' ideas discussed in the competition forum was feature engineering, and especially data augmentation. Other feature engineering ideas were also applied, such as creating hundreds of new variables by combining existing variables in every imaginable way.
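
As an illustration of data augmentation for this kind of tabular problem, here is a minimal sketch of one scheme that was shared publicly during the competition: shuffling each feature column independently among rows of the same class. Whether this matches the augmentation used in the notebook is an assumption, and the function name augment_by_class_shuffle and its parameters are hypothetical. The scheme only makes sense if the features are roughly independent of each other within a class.

```python
import numpy as np


def augment_by_class_shuffle(X, y, n_copies=2, seed=0):
    """Append synthetic rows built by shuffling columns within each class.

    For every class label, each extra copy permutes every feature column
    independently among the rows of that class. Per-class marginal
    distributions are preserved; within-class feature interactions are not.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    parts_X, parts_y = [X], [y]
    for label in np.unique(y):
        block = X[y == label]
        for _ in range(n_copies):
            shuffled = block.copy()
            for col in range(shuffled.shape[1]):
                # Independent permutation for each column.
                shuffled[:, col] = rng.permutation(shuffled[:, col])
            parts_X.append(shuffled)
            parts_y.append(np.full(len(shuffled), label))
    return np.vstack(parts_X), np.concatenate(parts_y)


# Tiny hypothetical example: 10 rows, 2 features, 3 positives.
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.array([0] * 7 + [1] * 3)
X_aug, y_aug = augment_by_class_shuffle(X_toy, y_toy, n_copies=2)
print(X_aug.shape, y_aug.sum())
```

The augmented arrays can then be passed to any classifier, for example lightgbm.LGBMClassifier().fit(X_aug, y_aug). To keep validation scores honest, the augmentation should be applied to the training folds only, never to the validation or test rows.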

3-Other ideas that did not work (Work in progress)

Here, I will try to assemble all (or most) of the ideas that I tried but that did not work.

These were mostly different models (XGBoost, regressions, basic neural network models, etc.).

Info:

I cannot share the competition data because of the competition rules. The competition host requires users to explicitly accept the rules before they can access the data set. To get the competition data, you need a Kaggle account; go to the competition page and agree to the competition rules.
