Jointly predicting probability of default and finding missing values:

This is the code of our team for the first KU Leuven Datathon, where we were the 3rd best Master's team.

Our approach was mainly to directly model the default probability of companies.

Jointly predicting probability of default and finding missing values:

We do not use the categorical country columns that do not seem needed given the continuous ones.

We derived and implemented an EM-kind of algorithm: REPEAT: 1) In one step, we train our neural network given all data.
2) In the other step, we fill in our data given our model. The columns we fill are the POLICY ones, as well as the women percentages.

This gave us a nice insight on the missing policy columns: In comparison to the other missing data procedures, it filled most missing rows as 'not having a policy'; => This is more in line with our intuition that if the company doesn't specifically have a policy, that they don't publish it.

Our best neural network is a two-layer model, that has around half the error of logistic regression on the test set.

The predictions seem sensible, with the residuals not showing any specific pattern.

Name		Name	Last commit message	Last commit date
Latest commit History 67 Commits
Credit		Credit
Cycling		Cycling
__MACOSX		__MACOSX
extra_data		extra_data
EM.ipynb		EM.ipynb
EM_data.csv		EM_data.csv
EM_imputation_analysis.R		EM_imputation_analysis.R
Model_Trees.R		Model_Trees.R
Python starter pack.ipynb		Python starter pack.ipynb
README.md		README.md
anomalydata.csv		anomalydata.csv
exploratory.R		exploratory.R
extra_data.R		extra_data.R
final.pptx		final.pptx

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Jointly predicting probability of default and finding missing values:

cluster analysis

About

Releases

Packages

Contributors 5

Languages

Bird-cult/datathon

Folders and files

Latest commit

History

Repository files navigation

Jointly predicting probability of default and finding missing values:

cluster analysis

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 5

Languages

Packages