Unintended Bias in Toxicity Classification

In this project, I trained a model to predict toxicity in a comment. The dataset comes from the Kaggle competition Jigsaw Unintended Bias in Toxicity Classification.

The model was built with engineered features such as the count of capitals in a sentence, the number of unique words, and the number of question and exclamation marks, among others. The data was also cleaned by lowercasing, tokenizing, lemmatizing, mapping contractions, and removing stop words and special characters.
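
As a rough illustration of the feature engineering and cleaning steps described above, here is a minimal sketch using only the Python standard library. The function names, the tiny contraction map, and the stop-word list are hypothetical placeholders rather than the ones used in this project. Counting capitals per character is also an assumption, and lemmatizing is left out because it needs an external library.

```python
import re

# Hypothetical, deliberately tiny lookup tables for illustration only.
CONTRACTIONS = {"don't": "do not", "can't": "cannot", "it's": "it is"}
STOP_WORDS = {"a", "an", "the", "is", "it", "to", "and", "of"}


def extract_features(comment: str) -> dict:
    """Compute simple count features on the raw (uncleaned) comment."""
    words = comment.split()
    return {
        # Assumption: "capitalization" is counted per uppercase character.
        "n_capitals": sum(1 for ch in comment if ch.isupper()),
        "n_unique_words": len({w.lower() for w in words}),
        "n_question_marks": comment.count("?"),
        "n_exclamation_marks": comment.count("!"),
    }


def clean_comment(comment: str) -> list:
    """Lowercase, expand contractions, drop special characters and stop words."""
    text = comment.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    # Replace anything that is not a lowercase letter, digit, or whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    tokens = text.split()  # whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]


features = extract_features("WHY is it SO bad?!")
tokens = clean_comment("Don't SHOUT, it's rude!")
```

The count features are computed before cleaning, since lowercasing and punctuation removal would erase the very signals (capitals, question and exclamation marks) they measure.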

Initially, I built scikit-learn models on count and TF-IDF vectors; later, I used GloVe and fastText word vectors with a neural network. Their relative scores are below:
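
For orientation, a TF-IDF baseline of the kind mentioned above might look like the following sketch. It assumes scikit-learn is installed. The choice of `LogisticRegression`, the n-gram range, and the toy comments and labels are assumptions made for illustration, not the classifier, settings, or data used in this project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, made-up training data: 1 = toxic, 0 = non-toxic.
comments = [
    "you are an idiot",
    "what a stupid idea",
    "thanks for the helpful answer",
    "have a great day everyone",
]
labels = [1, 1, 0, 0]

# TF-IDF over unigrams and bigrams, followed by a linear classifier.
# Swapping TfidfVectorizer for CountVectorizer gives the count-vector variant.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(comments, labels)

# Probability of the toxic class for two unseen comments.
probs = model.predict_proba(["stupid idiot", "great answer, thanks"])[:, 1]
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary fitted on training data only, which matters once the data is split for validation.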
