Shareghi Nojehdeh, Ehsan (2013) Feature Combination for Measuring Sentence Similarity. Masters thesis, Concordia University.
Text (application/pdf), 446 kB: Shareghi-Nojehdeh_MCompSc_S2013.pdf (Accepted Version). Available under License Spectrum Terms of Access.
Abstract
Sentence similarity is one of the core elements of
Natural Language Processing (NLP) tasks such as
Recognizing Textual Entailment and Paraphrase Recognition.
Over the years, different systems have been proposed to
measure similarity between fragments of text. In this
research, we propose a new two-phase supervised learning
method which uses a combination of lexical features to
train a model for predicting similarity between sentences.
Each of these features covers an aspect of the text at an
implicit or explicit level. The two-phase method uses all
combinations of the features in the feature space and trains
separate models based on each combination. Then it creates a
meta-feature space and trains a final model based on that.
The thesis contrasts with existing approaches that use feature
selection, because it does not aim to find the best subset of
the possible features. We show that this two-step process
significantly improves the results achieved by single-layer
standard learning methodology, and achieves a level of
performance that is comparable to the existing state-of-the-art
methods.
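
The two-phase idea described in the abstract can be sketched roughly as follows: train one model per non-empty combination of features, use the predictions of those models as a meta-feature space, and train a final model on it. This is only an illustration under stated assumptions. The three toy lexical features, the ridge-regularized linear learner, the helper names, and the sentence pairs with their scores are all hypothetical; the abstract does not specify the features, learning algorithm, or data used in the thesis.

```python
from itertools import combinations

# --- Toy lexical features (hypothetical; not the thesis's feature set) ---
def jaccard(a, b):
    """Word overlap: |A & B| / |A | B| over lowercase tokens."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def length_ratio(a, b):
    """Shorter token count divided by longer token count."""
    la, lb = len(a.split()), len(b.split())
    return min(la, lb) / max(la, lb) if max(la, lb) else 0.0

def char_bigram_overlap(a, b):
    """Jaccard overlap of character bigrams."""
    grams = lambda s: {s[i:i + 2] for i in range(len(s) - 1)}
    ga, gb = grams(a.lower()), grams(b.lower())
    return len(ga & gb) / len(ga | gb) if ga | gb else 0.0

FEATURES = [jaccard, length_ratio, char_bigram_overlap]

# --- Stand-in learner: ridge-regularized least squares (an assumption) ---
def fit_linear(X, y, ridge=1e-3):
    rows = [list(x) + [1.0] for x in X]          # append a bias column
    d = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]  # X^T X + ridge * I
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    for col in range(d):                         # Gaussian elimination
        piv = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * d
    for i in reversed(range(d)):                 # back substitution
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]
    return w

def predict(w, X):
    return [sum(wi * xi for wi, xi in zip(w, list(x) + [1.0])) for x in X]

# --- Two-phase training over all feature combinations ---
def all_subsets(n):
    """Every non-empty combination of feature indices (2**n - 1 of them)."""
    return [c for k in range(1, n + 1) for c in combinations(range(n), k)]

def select(X, cols):
    return [[row[c] for c in cols] for row in X]

def meta_features(X, subsets, base_models):
    """One column per phase-one model: its prediction for each example."""
    cols = [predict(w, select(X, s)) for s, w in zip(subsets, base_models)]
    return [list(row) for row in zip(*cols)]

def fit_two_phase(X, y):
    subsets = all_subsets(len(X[0]))
    base_models = [fit_linear(select(X, s), y) for s in subsets]      # phase 1
    final = fit_linear(meta_features(X, subsets, base_models), y)     # phase 2
    return subsets, base_models, final

def predict_two_phase(model, X):
    subsets, base_models, final = model
    return predict(final, meta_features(X, subsets, base_models))

# Made-up sentence pairs with made-up similarity scores on a 0-5 scale.
pairs = [("a cat sat on the mat", "a cat sat on a mat"),
         ("the dog barked loudly", "a dog was barking"),
         ("stocks fell sharply today", "the recipe needs two eggs"),
         ("he plays the guitar", "he is playing a guitar"),
         ("rain is expected tomorrow", "tomorrow it may rain"),
         ("she reads a book", "the train left the station")]
scores = [4.5, 3.5, 0.0, 4.0, 3.5, 0.5]

X = [[f(a, b) for f in FEATURES] for a, b in pairs]
model = fit_two_phase(X, scores)
predictions = predict_two_phase(model, X)
```

With three features, phase one trains seven models, so each example is represented by seven meta-features in phase two. For brevity the sketch builds the meta-features from predictions on the training data itself; stacking setups commonly use held-out predictions instead to limit overfitting.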
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Shareghi Nojehdeh, Ehsan |
Institution: | Concordia University |
Degree Name: | M. Comp. Sc. |
Program: | Computer Science |
Date: | 10 April 2013 |
ID Code: | 977146 |
Deposited By: | EHSAN SHAREGHI NOJEHDEH |
Deposited On: | 13 Jun 2013 20:23 |
Last Modified: | 18 Jan 2018 17:43 |