Discourse Analysis of Argumentative Essays of English Learners based on their CEFR Level

Hanel, Blaise (2023) Discourse Analysis of Argumentative Essays of English Learners based on their CEFR Level. Masters thesis, Concordia University.

Text (application/pdf)
Hanel_MCompSc_F2023.pdf - Accepted Version
Available under License Spectrum Terms of Access.
1MB

Abstract

This thesis explores the relationship between discourse information and CEFR (Common European Framework of Reference for Languages) level in argumentative essays written by English learners. The study draws on two prominent frameworks, Rhetorical Structure Theory (RST) and the Penn Discourse TreeBank (PDTB), to analyze essays from the International Corpus Network of Asian Learners (ICNALE) and the Corpus and Repository of Writing (CROW). It investigates how the use of different discourse relations and connectives relates to the writers' language proficiency level, and whether discourse information can serve as additional features for automated CEFR-level determination.
The analysis of the collected essays reveals significant differences in how English learners use discourse relations. Notably, the RST relations EXPLANATION and BACKGROUND are used statistically significantly more often by writers with a CEFR level below fluency. In addition, as the CEFR level increases, the use of the PDTB relation CONTINGENCY decreases. These results provide empirical evidence of a relationship between discourse relations and language proficiency, with usage patterns differing among learners at different CEFR levels.
To validate these findings computationally, discourse relations and connectives are used as supplementary features for machine learning models. The experimental results indicate that incorporating discourse information into automated CEFR-level determination yields a mild increase in performance over lexical and grammatical features alone. However, the proposed approach does not outperform large language models such as RoBERTa, which have demonstrated superior performance in various natural language processing tasks. Nevertheless, the study contributes insights into the relationship between discourse relations and argumentative essays by English learners, and suggests avenues for further research and development in language assessment methodologies.

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Hanel, Blaise
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: 20 July 2023
Thesis Supervisor(s): Kosseim, Leila
ID Code: 992554
Deposited By: Blaise Hanel
Deposited On: 20 Nov 2023 15:38
Last Modified: 20 Nov 2023 15:38
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
