Panagiotopoulos, Alexandra (2015) An investigation of tense, aspect and other verb group features for English proficiency assessment on different Asian learner corpora. Masters thesis, Concordia University.
Text (application/pdf), 476kB: Panagiotopoulos_MCompSc_S2015.pdf - Accepted Version
Abstract
Recent interest in second language acquisition has led to studies of the relationship between
linguistic indices and writing proficiency in English. This thesis investigates the influence of basic
linguistic notions, introduced early in English grammar, on automatic proficiency evaluation tasks.
We discuss the predictive potential of verb features (tense, aspect, voice, type and degree of embedding)
and compare them to word-level n-grams (unigrams, bigrams, trigrams) for proficiency
assessment. We conducted four experiments using standard language corpora that differ in authors’
cultural backgrounds and essay topic variety. Tense showed little variation across proficiency levels
or language of origin, making it a poor predictor for our corpora, but tense and aspect in combination
showed promise, especially for more natural and varied datasets. Overall, our experiments illustrated that
verb features, when examined individually, form a baseline for writing proficiency prediction.
Feature combinations, however, perform better for these verb features, which are grammatically not
independent. Finally, we investigate how language homogeneity due to corpus design influences the
performance of our features. We find that the majority of the essays we examined use present tense,
indefinite aspect and active voice, thus greatly limiting the discriminative power of tense, aspect,
and voice features. Linguistic features therefore have to be tested for their interoperability together with
their effectiveness on the corpora used. We conclude that all corpus-based research should include
an early validation step that investigates feature independence, feature interoperability, and feature
value distribution in a reference corpus to anticipate potentially spurious data sparsity effects.
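
The abstract recommends an early validation step but this record does not say how it was carried out. The following is only a minimal sketch of one possible check on feature value distribution, assuming each essay has already been reduced to a single dominant value per verb feature. The function names, the 0.9 threshold, and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import log2


def value_distribution(values):
    """Return (majority_share, normalized_entropy) for categorical values."""
    counts = Counter(values)
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    majority_share = max(probs)
    entropy = sum(-p * log2(p) for p in probs)
    # Normalize by the maximum entropy possible for the observed value set;
    # a feature with a single observed value gets entropy 0.
    max_entropy = log2(len(counts)) if len(counts) > 1 else 1.0
    return majority_share, entropy / max_entropy


def flag_low_variation(features, threshold=0.9):
    """Names of features whose most frequent value covers >= threshold of items."""
    return [
        name
        for name, values in features.items()
        if value_distribution(values)[0] >= threshold
    ]


# Toy data invented for illustration: one dominant value per essay.
toy = {
    "tense": ["present"] * 19 + ["past"],
    "aspect": ["indefinite"] * 12 + ["perfect"] * 5 + ["progressive"] * 3,
}
print(flag_low_variation(toy))
```

A feature flagged this way takes nearly the same value in every essay, so a classifier has little to separate proficiency levels with. This is the kind of effect the abstract attributes to corpus homogeneity. Checking feature independence would need a separate test, which this sketch does not attempt.
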
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Panagiotopoulos, Alexandra |
Institution: | Concordia University |
Degree Name: | M. Comp. Sc. |
Program: | Computer Science |
Date: | April 2015 |
Thesis Supervisor(s): | Bergler, Sabine |
ID Code: | 979857 |
Deposited By: | ALEXANDRA PANAGIOTOPOULOS |
Deposited On: | 13 Jul 2015 15:53 |
Last Modified: | 18 Jul 2019 15:09 |