Selected scripts from my Master's thesis on computationally finding the syntactical differences between standard German and the urban vernacular Kiezdeutsch.

Reem-Alatrash/German-Dialect-Variation

Logistic Regression Analysis of Kiezdeutsch

Selected scripts from my Master's thesis, a large-scale logistic regression analysis of Kiezdeutsch syntax. The thesis aims to computationally identify the syntactic differences between standard German and the urban vernacular Kiezdeutsch.

About

The thesis uses generalized linear models (GLMs) to learn which syntactic constructions found in Kiezdeutsch are characteristic of it in comparison to standard German. This is done at both the word and the phrase level using part-of-speech (POS) n-grams. The thesis identifies several POS n-gram types that support the following phenomena: bare NPs, ADV SVO, and V1. It also identifies significant associations between Kiezdeutsch and POS trigrams containing negation, and finds limited evidence that a lack of relative clauses is linked to Kiezdeutsch.
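To illustrate the general idea of scoring POS n-grams with a logistic regression model, here is a minimal, self-contained sketch. It is not the thesis code. The function names, the hyperparameters, and the four toy tag sequences (written with STTS-style tags) are invented for illustration and are not taken from KiDKo or GRAIN. The model is a plain binary logistic regression trained with batch gradient descent and L2 regularization, which stands in for whatever GLM setup the thesis actually uses.

```python
import math


def pos_ngrams(tags, n):
    """Return all contiguous POS n-grams of length n as tuples."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def featurize(tags, max_n=3):
    """Count POS n-grams of length 1..max_n in one tagged utterance."""
    feats = {}
    for n in range(1, max_n + 1):
        for gram in pos_ngrams(tags, n):
            feats[gram] = feats.get(gram, 0) + 1
    return feats


def predict(feats, weights, bias):
    """Estimated probability that an utterance belongs to class 1."""
    z = bias + sum(weights.get(g, 0.0) * c for g, c in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=1000, lr=0.1, l2=0.01):
    """Fit binary logistic regression by batch gradient descent (L2 on weights)."""
    weights, bias = {}, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad, grad_bias = {}, 0.0
        for feats, y in zip(samples, labels):
            err = predict(feats, weights, bias) - y
            grad_bias += err
            for g, c in feats.items():
                grad[g] = grad.get(g, 0.0) + err * c
        for g, value in grad.items():
            w = weights.get(g, 0.0)
            weights[g] = w - lr * (value / n + l2 * w)
        bias -= lr * grad_bias / n
    return weights, bias


# Invented toy data: label 1 = Kiezdeutsch-like, label 0 = standard-like.
toy_corpus = [
    (["PPER", "VVFIN", "NN"], 1),                     # bare NP
    (["ADV", "PPER", "VVFIN", "NN"], 1),              # ADV SVO order
    (["PPER", "VVFIN", "APPR", "ART", "NN"], 0),      # full PP with article
    (["ADV", "VVFIN", "PPER", "APPRART", "NN"], 0),   # verb-second after ADV
]

samples = [featurize(tags) for tags, _ in toy_corpus]
labels = [label for _, label in toy_corpus]
weights, bias = train(samples, labels)

# N-grams with the largest positive weights are most associated with class 1.
for gram, w in sorted(weights.items(), key=lambda kv: -kv[1])[:5]:
    print(" ".join(gram), round(w, 3))
```

In this sketch, n-grams with positive weights lean towards the Kiezdeutsch-like class and n-grams with negative weights lean towards the standard-like class. On real data, the weights would need to be checked for statistical significance before drawing any conclusions.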

Data

The thesis uses two comparable corpora of German. The first contains spoken dialogs in Kiezdeutsch (KiDKo), while the second contains dialogs in mostly standard German (GRAIN).

The Kiezdeutsch corpus, or KiDKo (from the German KiezDeutsch Korpus), is not publicly available. Therefore, this repository does not include data samples from it.

The GRAIN corpus (German RAdio INterviews) can be found here.
