Logistic Regression Classifier

The goal is to build a spam e-mail classifier based on logistic regression. The second part adds L2 regularization to the objective function. Refer to this for the complete problem statement.
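The idea above can be sketched in a few lines. This is a minimal illustration and not the repository's code: the function names (`train_logistic_l2`, `predict`), the hyperparameters (`lam`, `lr`, `epochs`), and the toy data are all assumptions made for the example. It minimizes the mean log-loss plus `(lam / 2) * ||w||^2` with batch gradient descent and leaves the bias unregularized.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train_logistic_l2(X, y, lam=0.0, lr=0.1, epochs=1000):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    Hypothetical helper for illustration; the bias b is not regularized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one example: p(y=1|x) - y.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The L2 term contributes lam * w to the gradient of the weights.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    # Label 1 (spam) when the predicted probability is at least 0.5.
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

# Toy data made up for illustration (not taken from spambase).
X = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
y = [0, 0, 1, 1]

w_plain, b_plain = train_logistic_l2(X, y, lam=0.0)
w_reg, b_reg = train_logistic_l2(X, y, lam=1.0)
```

Comparing `w_plain` with `w_reg` shows the effect of the penalty: the regularized weights stay closer to zero, which is the usual motivation for adding an L2 term.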

Dataset:

Spambase Dataset. The Spambase dataset consists of continuous features extracted from e-mail data, such as the frequency of certain words and characters.

Dataset files included in the folder: spambase-train.csv and spambase-test.csv

See this for the results and analysis.

Naive Bayes Classification

The goal is to classify SMS messages as either SPAM or HAM. Refer to this for the complete problem statement.
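As a rough sketch of the approach, the snippet below implements a multinomial Naive Bayes classifier with Laplace (add-one) smoothing. Everything in it is an assumption made for illustration: the function names (`train_nb`, `classify_nb`), the whitespace tokenizer, the smoothing constant `alpha`, and the four toy messages, none of which come from the SMS Spam Collection.

```python
import math
from collections import Counter, defaultdict

def train_nb(messages, labels, alpha=1.0):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(messages, labels):
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab, alpha

def classify_nb(model, text):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocab, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy messages made up for illustration.
messages = [
    "win a free prize now",
    "free entry win cash",
    "are we meeting for lunch",
    "see you at home tonight",
]
labels = ["spam", "spam", "ham", "ham"]

model = train_nb(messages, labels)
print(classify_nb(model, "free cash prize"))
print(classify_nb(model, "lunch at home"))
```

Working in log space avoids underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that is missing from one class from driving that class's probability to zero.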

Dataset:

SMS Spam Collection Dataset

Dataset files included in the folder: SMSSpamCollection

See this for the results and analysis.

K Nearest Neighbors

Refer to this for the complete problem statement.

Optical Character Recognition:

The goal is to implement the 1-Nearest Neighbor algorithm and k-fold cross-validation, and to analyze how the classification error varies with the number of training examples.

Fetch the MNIST dataset using the script here.
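The two pieces named above can be sketched as follows. This is an illustrative outline under stated assumptions, not the repository's implementation: the helper names (`nearest_neighbor_predict`, `cross_val_error`), the interleaved fold assignment, and the toy 2-D points (standing in for MNIST images) are all invented for the example.

```python
def squared_distance(a, b):
    # Squared Euclidean distance; the square root is not needed for ranking.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def nearest_neighbor_predict(train_X, train_y, x):
    """1-NN: return the label of the closest training example."""
    best = min(range(len(train_X)), key=lambda i: squared_distance(train_X[i], x))
    return train_y[best]

def cross_val_error(X, y, n_folds=5):
    """Fraction of misclassified examples over n_folds held-out folds."""
    n = len(X)
    errors = 0
    for fold in range(n_folds):
        # Interleaved split: example i belongs to fold i % n_folds.
        test_idx = set(range(fold, n, n_folds))
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            if nearest_neighbor_predict(train_X, train_y, X[i]) != y[i]:
                errors += 1
    return errors / n

# Toy data made up for illustration: two well-separated clusters.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
     (5, 5), (5, 6), (6, 5), (6, 6), (5.5, 5.5)]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

print(cross_val_error(X, y, n_folds=5))
```

Calling `cross_val_error` on subsets of increasing size is one simple way to trace how the error changes with the number of training examples.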

Iris Plant Recognition

The goal is to analyze how robust the classifier is to varying outliers.
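One way to see why outliers matter for nearest-neighbor classifiers is the small sketch below. It is an assumption-laden illustration and not the analysis from this repository: the function `knn_predict`, the majority-vote rule, the labels "a" and "b", and the toy points (not taken from iris.csv) are all hypothetical.

```python
from collections import Counter

def knn_predict(train_X, train_y, x, k=1):
    """Majority vote among the k training points closest to x."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data made up for illustration. The last point sits inside the "a"
# cluster but carries the label "b", so it acts as an outlier.
train_X = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (1.05, 1.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

query = (1.06, 1.0)
print(knn_predict(train_X, train_y, query, k=1))  # the outlier is the nearest point
print(knn_predict(train_X, train_y, query, k=3))  # two "a" neighbors outvote it
```

With k=1 a single outlier next to the query decides the prediction, while a larger k lets the surrounding points outvote it. Repeating such an experiment with more injected outliers is one possible way to measure robustness.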

Dataset file included in the folder: iris.csv

See this for the results and analysis.

Acknowledgements

The base code was provided by the instructor.
