
ricoms/gpam_stats


gpam_stats

Implementation of the statistics used in the paper: OKIMOTO, L. C.; SAVII, R. M.; LORENA, A. C. Complexity Measures Effectiveness in Feature Selection. In: Brazilian Conference on Intelligent Systems (BRACIS), 2017, Uberlândia. IEEE Proceedings of the 2017 Brazilian Conference on Intelligent Systems (BRACIS), 2017. v. 1. p. 1-6.

This work was done in collaboration with Lucas Chesini Okimoto and Ana Carolina Lorena from ICT-UNIFESP, São José dos Campos, SP - Brazil.

Project organization

This project is organized as follows:

ArtificialDataset
├── dataFelipe1.csv
└── dataFelipe2.csv
logs
└── ArtificialDataset.log
outputs
└── run_ArtificialDataset.csv
scripts
├── __initi__.py
├── run_all.py
└── stats.py
LICENSE
README.md
requirements.txt


The main code lives in scripts/stats.py, which implements all the statistics mentioned in the article; it is intended to be turned into a Python library later. scripts/run_all.py is a helper script that runs every statistic on the datasets kept in a given folder; you can edit it to test your own datasets.

The dataset used in this example is artificial and available for public use. The logs and outputs folders hold the results of running the run_all.py script over both datasets in the ArtificialDataset folder. The log records the time it took to calculate each statistic for each dataset, and outputs/ contains a .csv file with the value of each statistic for each dataset.
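
The workflow described above (loop over the datasets in a folder, time each statistic, log the timings, and write one row per dataset to a .csv file) can be sketched roughly as follows. This is an illustration only, not the actual contents of scripts/run_all.py: the function `run_all`, the `STATISTICS` table, and the two placeholder statistics `n_rows` and `n_columns` are hypothetical names, since the real functions in scripts/stats.py are not listed in this README.

```python
import csv
import logging
import time
from pathlib import Path


# Hypothetical stand-ins for the statistics in scripts/stats.py, whose
# real names and signatures are not shown in this README.
def n_rows(rows):
    # Counts every line of the file, including a header line if present.
    return len(rows)


def n_columns(rows):
    return len(rows[0]) if rows else 0


STATISTICS = {"n_rows": n_rows, "n_columns": n_columns}


def run_all(dataset_dir, output_csv, log_file):
    """Apply every statistic to every .csv in dataset_dir, timing each call."""
    logger = logging.getLogger("run_all_sketch")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(str(log_file))
    logger.addHandler(handler)

    results = []
    try:
        for path in sorted(Path(dataset_dir).glob("*.csv")):
            with open(str(path), newline="") as fh:
                rows = list(csv.reader(fh))
            record = {"dataset": path.name}
            for name, func in STATISTICS.items():
                start = time.perf_counter()
                record[name] = func(rows)
                elapsed = time.perf_counter() - start
                # One log line per (dataset, statistic) pair.
                logger.info("%s %s took %.6f s", path.name, name, elapsed)
            results.append(record)
    finally:
        logger.removeHandler(handler)
        handler.close()

    # One output row per dataset, one column per statistic.
    with open(str(output_csv), "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["dataset", *STATISTICS])
        writer.writeheader()
        writer.writerows(results)
    return results
```

With the layout shown earlier, a call such as `run_all("ArtificialDataset", "outputs/run_ArtificialDataset.csv", "logs/ArtificialDataset.log")` would mirror the paths in this repository, assuming the logs and outputs folders already exist.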

We also provide a requirements.txt file for quick installation of the packages required to run this project. The code was implemented and run with Python 3.5.2.