The goal is to build a spam email classifier based on logistic regression. The second part adds L2 regularization to the objective function. Refer to this for the complete problem statement.
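As a rough illustration of the two parts, here is a minimal sketch of logistic regression trained by batch gradient descent, with an optional L2 penalty on the weights. It is not the assignment's code: the function names, the hyperparameters (`lam`, `lr`, `epochs`), the choice to leave the bias unpenalized, and the toy data are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logreg(X, y, lam=0.0, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent.

    Minimizes mean cross-entropy + (lam / 2) * ||w||^2.
    The bias is not penalized (an assumption of this sketch).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Average the data gradient, then add the L2 penalty gradient lam * w.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Label 1 (spam) when the predicted probability is at least 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]


# Toy, made-up data standing in for the spambase features.
random.seed(0)
X = [[random.gauss(3.0 * label, 1.0), random.gauss(-2.0 * label, 1.0)]
     for label in (0, 1) for _ in range(50)]
y = [label for label in (0, 1) for _ in range(50)]

w_plain, b_plain = train_logreg(X, y, lam=0.0)
w_reg, b_reg = train_logreg(X, y, lam=1.0)
accuracy = sum(p == t for p, t in zip(predict(X, w_plain, b_plain), y)) / len(y)
```

With `lam > 0` the learned weights are pulled toward zero, which is the usual motivation for adding the penalty when features are noisy or correlated.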
Spambase Dataset
The spambase dataset consists of continuous features extracted from email data, such as the frequency of certain words and characters.
Dataset files included in the folder: spambase-train.csv, spambase-test.csv
See this for the results and analysis.
The goal is to classify SMS messages as either SPAM or HAM. Refer to this for the complete problem statement.
Dataset files included in the folder: SMSSpamCollection
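The sketch below shows one way the messages might be loaded before classification. It assumes the file uses the layout in which the SMS Spam Collection is commonly distributed, one `label<TAB>message` record per line; that layout, the helper name `parse_sms_lines`, and the sample lines are assumptions, not something this README specifies.

```python
from collections import Counter


def parse_sms_lines(lines):
    """Split each 'label<TAB>message' line into a (label, message) pair."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        label, _, message = line.partition("\t")
        records.append((label.lower(), message))
    return records


# Made-up sample lines in the assumed format. With the real file one might use:
#   with open("SMSSpamCollection", encoding="utf-8") as f:
#       records = parse_sms_lines(f)
sample = [
    "ham\tSee you at lunch?\n",
    "spam\tYou have won a free prize, call now!\n",
    "ham\tRunning late, sorry.\n",
]
records = parse_sms_lines(sample)
label_counts = Counter(label for label, _ in records)
```

Counting the labels first is a quick check on class balance, which matters when interpreting accuracy for a SPAM/HAM classifier.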
See this for the results and analysis.
Refer to this for the complete problem statement.
The goal is to implement the 1-Nearest Neighbor algorithm and Cross-Fold Validation, and to analyze how the classification error varies with the number of training examples.
Fetch the MNIST dataset using the script here.
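To make the experiment concrete, here is a minimal sketch of 1-nearest-neighbor classification evaluated with k-fold cross-validation at several training-set sizes. It reads "Cross-Fold Validation" as ordinary k-fold cross-validation, uses squared Euclidean distance, and runs on made-up 2-D clusters instead of MNIST; the function names, fold count, and sizes are assumptions for illustration only.

```python
import random


def nearest_neighbor_predict(train_X, train_y, x):
    """Return the label of the training point closest to x (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for xi, yi in zip(train_X, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(xi, x))
        if dist < best_dist:
            best_label, best_dist = yi, dist
    return best_label


def cross_val_error(X, y, n_folds=5, seed=0):
    """Average 1-NN misclassification rate over n_folds held-out folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    fold_errors = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        wrong = sum(nearest_neighbor_predict(train_X, train_y, X[i]) != y[i]
                    for i in held_out)
        fold_errors.append(wrong / len(held_out))
    return sum(fold_errors) / n_folds


# Made-up 2-D clusters standing in for MNIST pixel vectors.
rng = random.Random(1)
centers = {0: (0.0, 0.0), 1: (6.0, 6.0)}
X = [[rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)]
     for label, (cx, cy) in centers.items() for _ in range(40)]
y = [label for label in centers for _ in range(40)]

# Error as a function of training-set size (balanced subsets of the toy data).
errors_by_size = {}
for n in (10, 40, 80):
    subset = list(range(n // 2)) + list(range(40, 40 + n // 2))
    errors_by_size[n] = cross_val_error([X[i] for i in subset],
                                        [y[i] for i in subset])
```

On real MNIST vectors the same loop applies, though a vectorized distance computation would be needed for it to finish in reasonable time.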
The goal is to analyze how robust the classifier is to varying outliers.
Dataset file included in the folder: iris.csv
See this for the results and analysis.
The base code was provided by the instructor.