Harmonizing Divergence in Computational Discourse Analysis

Costa, Nelson Filipe (2025) Harmonizing Divergence in Computational Discourse Analysis. PhD thesis, Concordia University.

Text (application/pdf)
Costa_PhD_S2026.pdf - Accepted Version
Available under License Spectrum Terms of Access.
705kB

Abstract

Understanding discourse is essential for advancing computational models from surface-level text processing to deeper language reasoning, as it captures the logical flow of ideas that shapes meaning into a coherent text. However, progress in computational discourse analysis is hindered by divergent theoretical frameworks, ambiguity in implicit discourse relations and a myopic focus on the English language.

This thesis addresses these challenges through three research objectives. First, it proposes an empirical mapping between the two most widely used discourse frameworks, Rhetorical Structure Theory and the Penn Discourse Treebank, for explicit and implicit discourse relations. It successfully maps 80.0% of the overlapping annotations between the most prominent corpora following each framework, laying the groundwork for cross-framework interoperability. Second, the thesis introduces a novel multi-task classification model, MTask, for Implicit Discourse Relation Recognition (IDRR). The model captures ambiguity in implicit relations by jointly learning multi-label representations of their senses. It establishes the first benchmark on multi-label IDRR and is also evaluated on traditional single-label IDRR. Third, the thesis extends the multi-label approach to other languages and presents a hierarchical classification model. This model outperforms MTask on English and establishes the first benchmark on multilingual, multi-label IDRR. The thesis further explores prompting strategies with recent large language models and shows that fine-tuning strategies still perform better on this task.

Together, these contributions advance the goal of harmonizing divergence in computational discourse analysis, offering more generalizable and inclusive methods for discourse modeling across frameworks, ambiguity and languages.

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (PhD)
Authors: Costa, Nelson Filipe
Institution: Concordia University
Degree Name: Ph. D.
Program: Computer Science
Date: 21 August 2025
Thesis Supervisor(s): Kosseim, Leila
ID Code: 996415
Deposited By: Nelson Filipe Ferreira De Almeida Costa
Deposited On: 29 Jun 2026 15:33
Last Modified: 29 Jun 2026 15:33
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
