Collection of tabular datasets for benchmarking ML models

anwielts/tabular-data-benchmark

tabular-data-benchmark

A single place to find tabular datasets for training and benchmarking Machine Learning (ML) models.

This repository consists of two parts:

  1. Literature: Links to and summaries of books, papers, blog posts, etc. in which tabular datasets are described.
  2. Data: Links to the actual datasets, plus utility functions for conveniently loading these datasets.
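
Since the loading utilities are still work in progress, the following is only a minimal sketch of what the Data part could look like, not the repository's actual API. `DatasetEntry`, `REGISTRY`, `parse_table`, `load_dataset`, and the example URL are all hypothetical names chosen for illustration. The idea is a small registry that maps a dataset name to its link and target column, plus a loader that splits the downloaded CSV into features and targets.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """Hypothetical registry record; not part of this repository's actual API."""
    name: str
    url: str     # placeholder link, not a real dataset location
    target: str  # name of the column holding the label


# Hypothetical registry; the name and URL are illustrative placeholders only.
REGISTRY = {
    "toy": DatasetEntry(name="toy", url="https://example.com/toy.csv", target="label"),
}


def parse_table(csv_text, target):
    """Split CSV text into feature rows (list of dicts) and a list of target values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    features, targets = [], []
    for row in reader:
        targets.append(row.pop(target))  # remove the label from the feature row
        features.append(dict(row))
    return features, targets


def load_dataset(name, fetch):
    """Look up `name` in the registry, get its CSV text via `fetch(url)`, and parse it."""
    entry = REGISTRY[name]
    return parse_table(fetch(entry.url), entry.target)
```

The `fetch` argument is any callable that turns a URL into CSV text, for example a thin wrapper around `urllib.request.urlopen`. Passing it in keeps the parsing logic testable without network access. All values are returned as strings; type conversion is left to the caller in this sketch.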

Motivation

Tabular data is a domain where surprisingly simple ML models, such as those based on decision trees, outperform complex deep learning models most of the time. Nevertheless, every now and then a new deep-learning-inspired idea comes up in research. Such an idea often yields better results on the datasets selected for the paper but falls short performance-wise on other datasets not used in the paper. This repository tries to give easy access to benchmarking datasets so that new ideas can be tested against a wealth of datasets.

How to use it

WIP

Installation

WIP

Tests

WIP
