Combination Robust Cut Forests

Isolation Forests [Liu+2008] and Robust Random Cut Trees [Guha+2016] are closely related, as outlined in the supporting overview. Most notably, their outlier scores are the two extremes of a single scoring function:

$$\theta \textrm{Depth} + (1 - \theta) \textrm{[Co]Disp}$$

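Setting $\theta = 1$ leaves only the Depth term, and setting $\theta = 0$ leaves only the [Co]Disp term. The combination robust cut forest lets you blend both scores by choosing a $\theta$ strictly between 0 and 1.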
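As a rough illustration of the scoring function above, the sketch below computes the weighted sum for a single point. It is not the `crcf` API: the function name `combined_score` and the depth and co-displacement values are hypothetical, chosen only to show how the weight shifts the score between the two extremes.

```python
def combined_score(depth, codisp, theta):
    """Return theta * depth + (1 - theta) * codisp for one point.

    Hypothetical helper for illustration; not part of the crcf package.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    return theta * depth + (1.0 - theta) * codisp


# Made-up per-point values, for illustration only.
depth, codisp = 4.0, 10.0

for theta in (0.0, 0.5, 1.0):
    # theta = 0 keeps only the [Co]Disp term, theta = 1 keeps only Depth.
    print(theta, combined_score(depth, codisp, theta))
```

With these made-up inputs, the weight 0.5 gives a score halfway between the depth and co-displacement values, while 0 and 1 recover each value on its own.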

Install

You can install the package with `pip install crcf`. Alternatively, you can download the repository and run `python3 setup.py install` or `pip3 install .` from its root. Please note that this package uses features from Python 3.7+ and is not compatible with earlier Python versions.

Tasks

  • complete basic implementation
  • provide clear documentation and usage instructions
  • ensure interface allows for fitting and scoring on multiple points at the same time
  • implement a better saving method than pickling
  • use randomized tests with Hypothesis
  • implement the tree in Cython
  • accelerate forests with multi-threading
  • incorporate categorical variable support, including categorical rules
  • complete the write-up document with a benchmarking of performance

References