
Commit

Changed some default parameters
andrewdalpino committed Nov 29, 2018
1 parent 4675a2c commit 0943e37
Showing 26 changed files with 92 additions and 79 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -85,7 +85,7 @@
- Added Texture Histogram descriptor
- Added Average Color descriptor
- Removed parameters from Dropout and Alpha Dropout layers
- Added option to remove biases in Dense and Placeholder layers
- Added option to remove biases in Dense and Placeholder1D layers
- Optimized Dataset objects
- Optimized matrix and vector operations
- Added grid params to Param helper
46 changes: 25 additions & 21 deletions README.md
@@ -4,7 +4,7 @@
Rubix ML is a high-level machine learning library that lets you build programs that learn from data using the [PHP](https://php.net) language.

- Fast and easy prototyping with user-friendly API
- 40+ modern *supervised* and *unsupervised* learners
- 40+ modern Supervised and Unsupervised learners
- Modular architecture combines power and flexibility
- Open source and free to use commercially

@@ -31,13 +31,17 @@ MIT

### Table of Contents

- [Basic Introduction](#basic-introduction)
- [Obtaining Data](#obtaining-data)
- [Choosing an Estimator](#choosing-an-estimator)
- [Training and Prediction](#training-and-prediction)
- [Evaluation](#evaluating-model-performance)
- [Visualization](#visualization)
- [Next Steps](#next-steps)
- [Basic Introduction](#basic-introduction)
- [Obtaining Data](#obtaining-data)
- [Choosing an Estimator](#choosing-an-estimator)
- [Training and Prediction](#training-and-prediction)
- [Evaluation](#evaluating-model-performance)
- [Visualization](#visualization)
- [Next Steps](#next-steps)
- Example Projects
- [Credit Card Default Predictor](https://github.com/RubixML/Credit)
- [Human Activity Recognizer](https://github.com/RubixML/HAR)
- [Housing Price Predictor](https://github.com/RubixML/Housing)
- [API Reference](#api-reference)
- [Dataset Objects](#dataset-objects)
- [Labeled](#labeled)
@@ -140,7 +144,7 @@ MIT
- [Xavier 2](#xavier-2)
- [Layers](#layers)
- [Input Layers](#input-layers)
- [Placeholder](#placeholder)
- [Placeholder 1D](#placeholder-1d)
- [Hidden Layers](#hidden-layers)
- [Activation](#activation)
- [Alpha Dropout](#alpha-dropout)
@@ -1429,7 +1433,7 @@ A type of linear classifier that uses the logistic (*sigmoid*) function to disti
#### Parameters:
| # | Param | Default | Type | Description |
|--|--|--|--|--|
| 1 | batch size | 50 | int | The number of training samples to process at a time. |
| 1 | batch size | 100 | int | The number of training samples to process at a time. |
| 2 | optimizer | Adam | object | The gradient descent optimizer used to train the underlying network. |
| 3 | alpha | 1e-4 | float | The amount of L2 regularization to apply to the weights of the network. |
| 4 | epochs | 1000 | int | The maximum number of training epochs to execute. |
@@ -1583,7 +1587,7 @@ A generalization of [Logistic Regression](#logistic-regression) for multiclass p
#### Parameters:
| # | Param | Default | Type | Description |
|--|--|--|--|--|
| 1 | batch size | 50 | int | The number of training samples to process at a time. |
| 1 | batch size | 100 | int | The number of training samples to process at a time. |
| 2 | optimizer | Adam | object | The gradient descent optimizer used to train the underlying network. |
| 3 | alpha | 1e-4 | float | The amount of L2 regularization to apply to the weights of the network. |
| 4 | epochs | 1000 | int | The maximum number of training epochs to execute. |
@@ -1810,14 +1814,14 @@ T-distributed Stochastic Neighbor Embedding is a two-stage non-linear manifold l
#### Parameters:
| # | Param | Default | Type | Description |
|--|--|--|--|--|
| 1 | dimensions | 2 | int | The number of dimensions to embed the data into. |
| 1 | dimensions | 2 | int | The number of dimensions of the target embedding. |
| 2 | perplexity | 30 | int | The number of effective nearest neighbors to refer to when computing the variance of the Gaussian over that sample. |
| 3 | exaggeration | 12. | float | The factor to exaggerate the distances between samples during the early stage of fitting. |
| 4 | rate | 100. | float | The learning rate that controls the step size. |
| 5 | epochs | 1000 | int | The number of times to iterate over the embedding. |
| 6 | min gradient | 1e-7 | float | The minimum gradient necessary to continue embedding. |
| 7 | window | 5 | int | The training window to consider during early stop checking i.e. the last n epochs. |
| 8 | kernel | Euclidean | object | The distance kernel to use when measuring distances between samples. |
| 6 | min gradient | 1e-8 | float | The minimum gradient necessary to continue embedding. |
| 7 | window | 3 | int | The number of most recent epochs to consider when determining an early stop. |
| 8 | kernel | Euclidean | object | The distance kernel to use when measuring distances between samples. |
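The `window` parameter (reduced to 3 in this commit) governs early stopping: fitting halts once the loss has stopped improving over the last `window` epochs, subject to the minimum-gradient threshold. The library is PHP; the sketch below is an illustrative, hypothetical helper (not Rubix ML's actual code) showing how such a window check can work:

```python
def should_stop(losses, window=3, min_change=1e-4):
    """Return True when the loss has not improved appreciably over
    the last `window` epochs. Illustrative only -- the function
    name, signature, and tolerance are assumptions."""
    if len(losses) < window + 1:
        return False

    recent = losses[-window:]

    # Stop when the best loss inside the window is no better than
    # the loss recorded just before the window began.
    return min(recent) > losses[-(window + 1)] - min_change
```

A smaller window trades a little robustness to noisy loss curves for earlier termination.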

#### Additional Methods:

@@ -1846,7 +1850,7 @@ Adaptive Linear Neuron or (*Adaline*) is a type of single layer [neural network]
#### Parameters:
| # | Param | Default | Type | Description |
|--|--|--|--|--|
| 1 | batch size | 50 | int | The number of training samples to process at a time. |
| 1 | batch size | 100 | int | The number of training samples to process at a time. |
| 2 | optimizer | Adam | object | The gradient descent optimizer used to train the underlying network. |
| 3 | alpha | 1e-4 | float | The amount of L2 regularization to apply to the weights of the network. |
| 4 | epochs | 100 | int | The maximum number of training epochs to execute. |
@@ -3122,19 +3126,19 @@ There are three types of layers that form a network, **Input**, **Hidden**, and
### Input Layers
The entry point for data into a neural network is the input layer which is the first layer in the network. Input layers do not have any learnable parameters.

### Placeholder
The Placeholder input layer serves to represent the *future* input values of a mini batch to the network.
### Placeholder 1D
The Placeholder 1D input layer represents the *future* input values of a mini batch (matrix) of single dimensional tensors (vectors) to the neural network.

#### Parameters:
| # | Param | Default | Type | Description |
|--|--|--|--|--|
| 1 | inputs | None | int | The number of inputs to the network. |
| 1 | inputs | None | int | The number of inputs to the neural network. |

#### Example:
```php
use Rubix\ML\NeuralNet\Layers\Placeholder;
use Rubix\ML\NeuralNet\Layers\Placeholder1D;

$layer = new Placeholder(100);
$layer = new Placeholder1D(100);
```

### Hidden Layers
7 changes: 4 additions & 3 deletions composer.json
@@ -1,13 +1,14 @@
{
"name": "rubix/ml",
"type": "library",
"description": "Rubix ML is a machine learning library that lets you build programs that learn from data in PHP.",
"description": "Rubix ML is a high-level machine learning library that lets you build programs that learn from data using the PHP language.",
"homepage": "https://github.com/RubixML/RubixML",
"license": "MIT",
"keywords": [
"php", "machine-learning", "data-science", "data-mining", "predictive-modeling", "classification",
"regression", "clustering", "anomaly-detection", "neural-network", "manifold-learning",
"dimensionality-reduction", "artificial-intelligence", "ai", "cross-validation", "feature-extraction"
"dimensionality-reduction", "artificial-intelligence", "ai", "cross-validation", "feature-extraction",
"deep-learning", "rubix", "ml"
],
"authors": [
{
@@ -21,7 +22,7 @@
"php": ">=7.1.3",
"intervention/image": "^2.4",
"psr/log": "^1.0",
"rubix/tensor": "dev-master"
"rubix/tensor": "^1.0.2"
},
"require-dev": {
"ext-gd": "*",
2 changes: 1 addition & 1 deletion src/AnomalyDetectors/IsolationForest.php
@@ -132,7 +132,7 @@ public function train(Dataset $dataset) : void

$this->forest = [];

for ($epoch = 0; $epoch < $this->estimators; $epoch++) {
for ($epoch = 1; $epoch <= $this->estimators; $epoch++) {
$tree = new ITree($maxDepth);

$subset = $dataset->randomize()->head($p);
6 changes: 3 additions & 3 deletions src/Classifiers/LogisticRegression.php
@@ -16,7 +16,7 @@
use Rubix\ML\NeuralNet\Layers\Binary;
use Rubix\ML\Other\Traits\LoggerAware;
use Rubix\ML\NeuralNet\Optimizers\Adam;
use Rubix\ML\NeuralNet\Layers\Placeholder;
use Rubix\ML\NeuralNet\Layers\Placeholder1D;
use Rubix\ML\NeuralNet\Optimizers\Optimizer;
use Rubix\ML\NeuralNet\CostFunctions\CrossEntropy;
use Rubix\ML\NeuralNet\CostFunctions\CostFunction;
@@ -116,7 +116,7 @@ class LogisticRegression implements Online, Probabilistic, Verbose, Persistable
* @throws \InvalidArgumentException
* @return void
*/
public function __construct(int $batchSize = 50, ?Optimizer $optimizer = null, float $alpha = 1e-4,
public function __construct(int $batchSize = 100, ?Optimizer $optimizer = null, float $alpha = 1e-4,
int $epochs = 1000, float $minChange = 1e-4, ?CostFunction $costFn = null)
{
if ($batchSize < 1) {
@@ -200,7 +200,7 @@ public function train(Dataset $dataset) : void
$this->classes = $dataset->possibleOutcomes();

$this->network = new FeedForward(
new Placeholder($dataset->numColumns()),
new Placeholder1D($dataset->numColumns()),
[],
new Binary($this->classes, $this->alpha, $this->costFn),
$this->optimizer
4 changes: 2 additions & 2 deletions src/Classifiers/MultiLayerPerceptron.php
@@ -18,7 +18,7 @@
use Rubix\ML\Other\Traits\LoggerAware;
use Rubix\ML\NeuralNet\Optimizers\Adam;
use Rubix\ML\NeuralNet\Layers\Multiclass;
use Rubix\ML\NeuralNet\Layers\Placeholder;
use Rubix\ML\NeuralNet\Layers\Placeholder1D;
use Rubix\ML\NeuralNet\Optimizers\Optimizer;
use Rubix\ML\CrossValidation\Metrics\Metric;
use Rubix\ML\CrossValidation\Metrics\Accuracy;
@@ -287,7 +287,7 @@ public function train(Dataset $dataset) : void
$this->classes = $dataset->possibleOutcomes();

$this->network = new FeedForward(
new Placeholder($dataset->numColumns()),
new Placeholder1D($dataset->numColumns()),
$this->hidden,
new Multiclass($this->classes, $this->alpha, $this->costFn),
$this->optimizer
6 changes: 3 additions & 3 deletions src/Classifiers/SoftmaxClassifier.php
@@ -16,7 +16,7 @@
use Rubix\ML\Other\Traits\LoggerAware;
use Rubix\ML\NeuralNet\Optimizers\Adam;
use Rubix\ML\NeuralNet\Layers\Multiclass;
use Rubix\ML\NeuralNet\Layers\Placeholder;
use Rubix\ML\NeuralNet\Layers\Placeholder1D;
use Rubix\ML\NeuralNet\Optimizers\Optimizer;
use Rubix\ML\NeuralNet\CostFunctions\CostFunction;
use Rubix\ML\NeuralNet\CostFunctions\CrossEntropy;
@@ -117,7 +117,7 @@ class SoftmaxClassifier implements Online, Probabilistic, Verbose, Persistable
* @throws \InvalidArgumentException
* @return void
*/
public function __construct(int $batchSize = 50, ?Optimizer $optimizer = null, float $alpha = 1e-4,
public function __construct(int $batchSize = 100, ?Optimizer $optimizer = null, float $alpha = 1e-4,
int $epochs = 1000, float $minChange = 1e-4, ?CostFunction $costFn = null)
{
if ($batchSize < 1) {
@@ -201,7 +201,7 @@ public function train(Dataset $dataset) : void
$this->classes = $dataset->possibleOutcomes();

$this->network = new FeedForward(
new Placeholder($dataset->numColumns()),
new Placeholder1D($dataset->numColumns()),
[],
new Multiclass($this->classes, $this->alpha, $this->costFn),
$this->optimizer
2 changes: 1 addition & 1 deletion src/Clusterers/DBSCAN.php
@@ -20,7 +20,7 @@
* > **Note**: Noise samples are assigned the cluster number *-1*.
*
* References:
* [1] M. Ester et al. (1996). A Densty-Based Algorithmfor Discovering Clusters.
* [1] M. Ester et al. (1996). A Density-Based Algorithm for Discovering Clusters.
*
* @category Machine Learning
* @package Rubix/ML
8 changes: 5 additions & 3 deletions src/Kernels/Distance/Canberra.php
@@ -25,9 +25,11 @@ public function compute(array $a, array $b) : float
{
$distance = 0.;

foreach ($a as $i => $coordinate) {
$distance += abs($coordinate - $b[$i])
/ ((abs($coordinate) + abs($b[$i])) ?: self::EPSILON);
foreach ($a as $i => $valueA) {
$valueB = $b[$i];

$distance += abs($valueA - $valueB)
/ ((abs($valueA) + abs($valueB)) ?: self::EPSILON);
}

return $distance;
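The refactored Canberra loop above sums `|a_i - b_i| / (|a_i| + |b_i|)` per dimension, substituting a small epsilon for the denominator when both components are zero. As a hedged cross-check of the same arithmetic (the library is PHP; the `canberra` helper and the `EPSILON` value below are illustrative assumptions, not Rubix ML's code):

```python
EPSILON = 1e-8  # stand-in for the library's self::EPSILON constant


def canberra(a, b):
    """Canberra distance: sum of |a_i - b_i| / (|a_i| + |b_i|),
    with an epsilon guard when both components are zero."""
    distance = 0.0

    for x, y in zip(a, b):
        # `or EPSILON` mirrors PHP's `?: self::EPSILON` fallback.
        distance += abs(x - y) / ((abs(x) + abs(y)) or EPSILON)

    return distance
```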
8 changes: 4 additions & 4 deletions src/Kernels/Distance/Diagonal.php
@@ -24,12 +24,12 @@ class Diagonal implements Distance
*/
public function compute(array $a, array $b) : float
{
$distances = [];
$deltas = [];

foreach ($a as $i => $coordinate) {
$distances[] = abs($coordinate - $b[$i]);
foreach ($a as $i => $value) {
$deltas[] = abs($value - $b[$i]);
}

return max($distances);
return max($deltas);
}
}
4 changes: 2 additions & 2 deletions src/Kernels/Distance/Euclidean.php
@@ -26,8 +26,8 @@ public function compute(array $a, array $b) : float
{
$distance = 0.;

foreach ($a as $i => $coordinate) {
$distance += ($coordinate - $b[$i]) ** 2;
foreach ($a as $i => $value) {
$distance += ($value - $b[$i]) ** 2;
}

return sqrt($distance);
4 changes: 2 additions & 2 deletions src/Kernels/Distance/Hamming.php
@@ -29,8 +29,8 @@ public function compute(array $a, array $b) : float

$distance = 0;

foreach ($a as $i => $coordinate) {
if ($coordinate !== $b[$i]) {
foreach ($a as $i => $value) {
if ($value !== $b[$i]) {
$distance++;
}
}
8 changes: 5 additions & 3 deletions src/Kernels/Distance/Jaccard.php
@@ -26,9 +26,11 @@ public function compute(array $a, array $b) : float
{
$distance = $mins = $maxs = 0.;

foreach ($a as $i => $coordinate) {
$mins += min($coordinate, $b[$i]);
$maxs += max($coordinate, $b[$i]);
foreach ($a as $i => $valueA) {
$valueB = $b[$i];

$mins += min($valueA, $valueB);
$maxs += max($valueA, $valueB);
}

return 1. - ($mins / ($maxs ?: self::EPSILON));
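The Jaccard kernel shown above computes the weighted Jaccard distance, `1 - (sum of element-wise mins / sum of element-wise maxes)`, again with an epsilon guard against a zero denominator. A hedged Python sketch of the same computation (helper name and `EPSILON` value are assumptions):

```python
EPSILON = 1e-8  # stand-in for the library's self::EPSILON constant


def jaccard(a, b):
    """Weighted Jaccard distance: 1 - (sum of mins / sum of maxes)."""
    mins = sum(min(x, y) for x, y in zip(a, b))
    maxs = sum(max(x, y) for x, y in zip(a, b))

    return 1.0 - mins / (maxs or EPSILON)
```

Identical vectors give 0, and non-overlapping vectors give 1.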
4 changes: 2 additions & 2 deletions src/Kernels/Distance/Manhattan.php
@@ -26,8 +26,8 @@ public function compute(array $a, array $b) : float
{
$distance = 0.;

foreach ($a as $i => $coordinate) {
$distance += abs($coordinate - $b[$i]);
foreach ($a as $i => $value) {
$distance += abs($value - $b[$i]);
}

return $distance;
4 changes: 2 additions & 2 deletions src/Kernels/Distance/Minkowski.php
@@ -61,8 +61,8 @@ public function compute(array $a, array $b) : float
{
$distance = 0.;

foreach ($a as $i => $coordinate) {
$distance += abs($coordinate - $b[$i]) ** $this->lambda;
foreach ($a as $i => $value) {
$distance += abs($value - $b[$i]) ** $this->lambda;
}

return $distance ** $this->inverse;
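The Minkowski loop raises each absolute difference to the power `lambda`, then raises the sum to `1/lambda` (the `$this->inverse` factor), generalizing Manhattan (lambda = 1) and Euclidean (lambda = 2). A hedged Python sketch of the formula (the helper and its default are assumptions, not the library's API):

```python
def minkowski(a, b, lmbda=3.0):
    """Minkowski distance: (sum |a_i - b_i|^lambda)^(1/lambda).
    lambda=1 reduces to Manhattan, lambda=2 to Euclidean."""
    total = sum(abs(x - y) ** lmbda for x, y in zip(a, b))

    return total ** (1.0 / lmbda)
```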
6 changes: 3 additions & 3 deletions src/Manifold/TSNE.php
@@ -115,8 +115,8 @@ class TSNE implements Estimator, Verbose
protected $minGradient;

/**
* The training window to consider during early stop checking i.e. the last
* n epochs.
* The number of most recent epochs to consider when determining an early
* stop.
*
* @var int
*/
@@ -152,7 +152,7 @@
* @return void
*/
public function __construct(int $dimensions = 2, int $perplexity = 30, float $exaggeration = 12.,
float $rate = 100., int $epochs = 1000, float $minGradient = 1e-7, int $window = 5,
float $rate = 100., int $epochs = 1000, float $minGradient = 1e-8, int $window = 3,
?Distance $kernel = null)
{
if ($dimensions < 1) {
4 changes: 2 additions & 2 deletions src/NeuralNet/FeedForward.php
@@ -73,10 +73,10 @@ public function __construct(Input $input, array $hidden, Output $output, Optimiz

$this->layers[] = $output;

$this->initialize();

$this->optimizer = $optimizer;
$this->backPass = array_reverse($this->hidden());

$this->initialize();
}

/**
2 changes: 1 addition & 1 deletion src/NeuralNet/Layers/Binary.php
@@ -122,7 +122,7 @@ public function __construct(array $classes, float $alpha = 1e-4, ?CostFunction $
$costFunction = new CrossEntropy();
}

$this->classes = [$classes[0] => 0, $classes[1] => 1];
$this->classes = array_flip(array_values($classes));
$this->alpha = $alpha;
$this->costFunction = $costFunction;
$this->initializer = new Xavier1();
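The change in the Binary layer replaces hard-coded `[0]`/`[1]` indexing with `array_flip(array_values($classes))`, which maps the two class labels to outputs 0 and 1 regardless of the keys on the input array. The Python analogue of that idiom (function name is illustrative):

```python
def class_map(classes):
    """Map each class label to its positional index (0 and 1 for a
    binary output layer) -- the analogue of PHP's
    array_flip(array_values($classes)), which discards input keys."""
    return {label: index for index, label in enumerate(classes)}
```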
6 changes: 5 additions & 1 deletion src/NeuralNet/Layers/Multiclass.php
@@ -222,9 +222,13 @@ public function back(array $labels, Optimizer $optimizer) : array
$expected = [];

foreach ($this->classes as $i => $class) {
$joint = [];

foreach ($labels as $label) {
$expected[$i][] = $class === $label ? 1. : 0.;
$joint[] = $class === $label ? 1. : 0.;
}

$expected[] = $joint;
}

$expected = Matrix::quick($expected);
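The rewritten Multiclass backprop loop builds the target matrix one complete row (`$joint`) per class, where each row is a one-hot indicator of which mini-batch labels belong to that class, and appends the row rather than writing through the class's array key. A hedged Python sketch of the construction (helper name is an assumption):

```python
def expected_matrix(classes, labels):
    """Build the one-hot target matrix: one row per class, one
    column per sample, 1.0 where the sample's label matches the
    row's class."""
    expected = []

    for cls in classes:
        joint = [1.0 if cls == label else 0.0 for label in labels]
        expected.append(joint)

    return expected
```

Appending whole rows keeps the matrix well-formed even when `classes` has non-sequential keys, which matters before handing the rows to `Matrix::quick()`.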
