The goal is to build a spam e-mail classifier based on logistic regression. The second part adds L2 regularization to the objective function. Refer to this for the complete problem statement.
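
As a rough illustration of the two parts described above, here is a minimal sketch of logistic regression trained by batch gradient descent with an optional L2 penalty. It is not the repository's implementation: the function names, the hyperparameters (`lam`, `lr`, `epochs`), the choice to leave the bias unpenalized, and the toy data are all assumptions made for the example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_l2(X, y, lam=0.1, lr=0.1, epochs=500):
    """Batch gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    The bias term b is not penalized (an assumption of this sketch).
    Setting lam=0 gives plain, unregularized logistic regression.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one example: p(spam | x) - label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            # Data gradient plus the L2 term lam * w_j.
            w[j] -= lr * (grad_w[j] / n + lam * w[j])
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    # Label 1 (spam) when the predicted probability is at least 0.5.
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

# Hypothetical toy data standing in for Spambase feature rows.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.8, 1.1], [1.2, 1.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = train_logistic_l2(X, y, lam=0.01, lr=0.5, epochs=2000)
preds = predict(X, w, b)
```

Increasing `lam` shrinks the weights toward zero, which is the effect the second part of the assignment studies.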

Dataset:

Spambase Dataset: the Spambase dataset consists of continuous features extracted from e-mail data, such as the frequency of certain words and characters.

Dataset files included in the folder: spambase-train.csv, spambase-test.csv

See this for the results and analysis.

The goal is to classify SMS messages as either spam or ham. Refer to this for the complete problem statement.

Dataset:

SMS Spam Collection Dataset

Dataset files included in the folder: SMSSpamCollection

See this for the results and analysis.

Refer to this for the complete problem statement.

Optical Character Recognition:

The goal is to implement the 1-nearest-neighbor algorithm and k-fold cross-validation, and to analyze how the classification error varies with the number of training examples.
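
The two pieces named above can be sketched in a few lines. This is a hedged illustration, not the repository's code: the function names, the squared Euclidean distance, the interleaved fold assignment, and the toy points (standing in for MNIST image vectors) are assumptions of the example.

```python
def nearest_neighbor_predict(train_X, train_y, x):
    # 1-NN: return the label of the training point closest to x
    # (squared Euclidean distance; ties go to the lowest index).
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]

def k_fold_error(X, y, k=5):
    """Average 1-NN classification error over k folds.

    Fold f holds out indices f, f + k, f + 2k, ... (interleaved split);
    every example is used as a test point exactly once.
    """
    n = len(X)
    errors = 0
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_X = [X[i] for i in range(n) if i not in test_idx]
        train_y = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            if nearest_neighbor_predict(train_X, train_y, X[i]) != y[i]:
                errors += 1
    return errors / n

# Hypothetical toy data: two well-separated clusters.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
cv_error = k_fold_error(X, y, k=4)
```

Calling `k_fold_error` on random subsets of increasing size is one way to trace how the error changes with the number of training examples.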

Fetch the MNIST dataset using the script here.

Iris Plant Recognition

The goal is to analyze how robust the classifier is to varying outliers.

Dataset file included in the folder: iris.csv

See this for the results and analysis.

Acknowledgements

The base code was provided by the instructor.