German Hatespeech Recognition

Description

In this project we build a pipeline of existing, pretrained neural networks (NN) to detect hatespeech in German text. We describe both how we combine these networks and how we evaluate the resulting model.

For hatespeech detection we use a pipeline of:

googletrans to translate German comments into English
detoxify for English hatespeech detection, which predicts, as a value between 0 and 1, the probability that the comment belongs to each of seven categories: toxicity, severe toxicity, obscene, identity attack, insult, threat and sexual explicit.
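
The two steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `score_german_comment` and `build_default_steps` are hypothetical, the choice of the Detoxify "unbiased" checkpoint is an assumption (it is one of the checkpoints that reports these seven labels), and the googletrans call assumes a release with the synchronous `Translator.translate` API (newer releases expose it as a coroutine).

```python
from typing import Callable, Dict, Mapping, Tuple

# Label names as reported by Detoxify models that include "sexual_explicit".
CATEGORIES = (
    "toxicity", "severe_toxicity", "obscene", "identity_attack",
    "insult", "threat", "sexual_explicit",
)


def score_german_comment(
    text: str,
    translate: Callable[[str], str],
    score_english: Callable[[str], Mapping[str, float]],
) -> Dict[str, float]:
    """Translate a German comment to English, then score it per category."""
    english = translate(text)
    scores = score_english(english)
    return {name: float(scores[name]) for name in CATEGORIES}


def build_default_steps() -> Tuple[Callable[[str], str], Callable]:
    """Wire up googletrans and detoxify (imported lazily, both must be installed)."""
    from googletrans import Translator
    from detoxify import Detoxify

    translator = Translator()
    model = Detoxify("unbiased")  # assumption: checkpoint choice is illustrative

    def translate(text: str) -> str:
        # Assumes the synchronous googletrans API.
        return translator.translate(text, src="de", dest="en").text

    return translate, model.predict


# Usage (requires network access and a model download):
# translate, score_english = build_default_steps()
# print(score_german_comment("Das ist ein Kommentar.", translate, score_english))
```

Passing the translation and scoring steps in as callables keeps the combining logic independent of either library, so each step can be swapped or stubbed out.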

Furthermore, we address language-specific hatespeech by adding a function that checks whether the text sequence contains an entry from a list of German swear words (Schimpfwortliste).
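
A word-list check of this kind might look like the sketch below. The function name `contains_listed_word`, the whole-word, case-insensitive matching, and the two-entry example list are assumptions made for illustration; the project's own function and list may differ.

```python
import re
from typing import Iterable


def contains_listed_word(text: str, swear_words: Iterable[str]) -> bool:
    """Return True if any word in `text` matches an entry of `swear_words`."""
    listed = {word.casefold() for word in swear_words}
    # Split into word tokens so punctuation does not block a match.
    tokens = re.findall(r"\w+", text.casefold())
    return any(token in listed for token in tokens)


# Hypothetical mini list, standing in for the real Schimpfwortliste.
example_list = ["Depp", "Idiot"]
print(contains_listed_word("Du bist ein Depp!", example_list))
print(contains_listed_word("Guten Morgen", example_list))
```

Matching whole tokens avoids false hits on harmless words that merely contain a listed word as a substring, at the cost of missing compounds and inflected forms.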

How to use

Use Is_hatespeech.is_hatespeech(query) from the folder "HateSpeech-Erkennung" to apply the hatespeech detection model to a text sequence (in German). Use Is_Hatespeech.is_hatespeech.is_HardRules(query) to check the query against the list of German swear words. The Is_Hatespeech file includes an example for a test run of the model.

Evaluation

We evaluated our model on a test set consisting of 30% of the entire data. To compare our results with the original binary labels (hatespeech: yes/no), we trained a classifier that maps the vector of seven probabilities to a binary label. We trained the classifier on a dataset sampled from existing German hatespeech datasets. Details of how we chose the classifier model, Linear Discriminant Analysis (LDA), can be found in the evaluation folder. Additionally, we provide a diagram of sensitivity and specificity as a function of the threshold.
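
The evaluation steps described above can be sketched roughly as follows. All names here (`fit_lda_and_score`, `sensitivity_specificity`, `sweep_thresholds`) are hypothetical, and the use of scikit-learn is an assumption; the code in the evaluation folder is the authoritative version. The threshold sweep itself uses only the standard library.

```python
import math
from typing import List, Sequence, Tuple


def fit_lda_and_score(features, labels, seed: int = 0):
    """Fit LDA on 70% of the data and return hatespeech probabilities
    and true labels for the held-out 30% (requires scikit-learn)."""
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import train_test_split

    x_train, x_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.3, random_state=seed
    )
    lda = LinearDiscriminantAnalysis().fit(x_train, y_train)
    # Column 1 holds the probability of the positive class (label 1).
    return lda.predict_proba(x_test)[:, 1], y_test


def sensitivity_specificity(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when predicting 1 for prob >= threshold."""
    tp = fn = fp = tn = 0
    for prob, label in zip(probs, labels):
        predicted = prob >= threshold
        if label == 1:
            tp += predicted
            fn += not predicted
        else:
            fp += predicted
            tn += not predicted
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


def sweep_thresholds(
    probs: Sequence[float], labels: Sequence[int], thresholds: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, sensitivity, specificity) rows, ready for plotting."""
    return [(t, *sensitivity_specificity(probs, labels, t)) for t in thresholds]
```

Plotting the rows returned by `sweep_thresholds` against the threshold gives a sensitivity/specificity diagram of the kind mentioned above, which helps in picking a threshold that balances missed hatespeech against false alarms.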

Josiphina/AIceberg
