Code for the thesis "A Corpus-Based Case Analysis on Syntactic Complexity in Russian ESL Learners’ Writing".

Aniezka/syntactic-complexity

Thesis, bachelor's programme Fundamental and Computational Linguistics

The complexity of a learner text is considered a significant factor in assessing foreign language proficiency. We study syntactic complexity (SC), which is usually interpreted as the variety and degree of complexity of the syntactic structures present in a text.

The research is based on 984 learner texts written in English by Russian speakers and collected in the REALEC corpus (Kuzmenko & Kutuzov, 2014). Each text has a grade assigned by independent experts and counts for seven types of syntactic errors identified by annotators.

This study examines methods of evaluating SC with automated analysis tools: TAASSC (Kyle, 2016), L2SCA (Lu, 2010), and Inspector (Lyashevskaya et al., 2021). It has not yet been established which SC constructions, or which errors in their use, are common among Russian learners of English. We hypothesize that the level of language proficiency correlates with both the number of syntactic errors and the values of SC parameters. The study therefore addresses three research questions: Which SC parameters most accurately reflect the level of English proficiency among Russian speakers? How can the results of SC evaluation be explained? Is there a correlation between the level of language proficiency and the number of syntactic errors and SC? We used rank correlation coefficients for the analysis.

As a result, we identified the SC parameters of learner texts that correlate most strongly with the essay grade or with the number of syntactic errors. No strong correlation was found: the maximum value of Spearman’s correlation coefficient is 0.439. The correlation between the SC parameters and the number of syntactic errors turned out to be much weaker than the correlation between the same parameters and the grade.
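As an illustration of the rank correlation used in the analysis, here is a minimal sketch of Spearman's coefficient in plain Python. It is not taken from the thesis code: the function names (`average_ranks`, `pearson`, `spearman`) and the sample values are assumptions made up for this example, and in practice a library routine such as `scipy.stats.spearmanr` would probably be the more convenient choice.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the current run of tied values.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data, invented for illustration only (not from REALEC):
# one SC parameter value and one grade per essay.
sc_parameter = [10.2, 12.5, 9.8, 14.1, 11.0]
grades = [55, 70, 60, 75, 65]
rho = spearman(sc_parameter, grades)
```

Ranking before correlating means the coefficient captures monotonic relationships rather than strictly linear ones, which suits ordinal data such as essay grades.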