Smoothed Probabilistic-based Algorithms for Sparse Data with application to Emotion Recognition and Sentiment Analysis

Najar, Fatma (2022) Smoothed Probabilistic-based Algorithms for Sparse Data with application to Emotion Recognition and Sentiment Analysis. PhD thesis, Concordia University.

Text (application/pdf)
Najar_PhD_S2023.pdf - Accepted Version
Restricted to Repository staff only until 30 April 2024.
Available under License Spectrum Terms of Access.
28MB

Abstract

Humans can produce more than 10,000 expressions using 43 facial muscles, which makes reading faces a significant human skill and a challenging task for Artificial Intelligence (AI) algorithms. Although much research has been devoted to sentiment analysis and emotion recognition, the field continues to present considerable challenges. In our research, we focus on providing novel emotion recognition and sentiment analysis solutions that address data challenges occurring in different modalities: texts, images, and videos. Across these multimedia contents, the analysis considers the co-occurrence of words in a collection of documents, and visual words or proportional feature vectors in the case of images and videos. This type of data involves several challenges, including sparseness, burstiness, correlated features, and high dimensionality.
In this dissertation, we propose smoothed probabilistic-based approaches to deal with the aforementioned data challenges. First, we introduce the calculation of the exact Fisher information matrix of the generalized Dirichlet multinomial. Our proposed approach has been adopted for detecting depression in tweets, dialogue-based emotion recognition, and image-based sentiment analysis. Second, we develop different smoothed solutions for handling sparsity, high dimensionality, and burstiness, such as the smoothed Dirichlet multinomial, smoothed Generalized Dirichlet, smoothed Generalized Dirichlet multinomial (SGDM), a Taylor approximation to the SGDM, Latent-based smoothed Beta-Liouville, the Smoothed Beta-Liouville Emotion Term model, and the Smoothed Scaled Dirichlet Relevance Model. These models are based on smoothing count vectors in a smoothed subset of the whole simplex to deal with the problem of sparseness. Moreover, we incorporate a hierarchical generalized Dirichlet prior for sparse multinomial distributions and a Beta-Liouville Naive Bayes with vocabulary knowledge. These two techniques build on Bayesian vocabulary knowledge over large discrete domains represented by subsets of feasible outcomes: “observed” and “unobserved” words. In another research work, we consider a sparse topic model for non-exchangeable correlated data over time and present a new interactive distance-dependent IBP compound Dirichlet process. We derive a Markov chain Monte Carlo sampler combined with the Metropolis-Hastings algorithm and study its performance on sentiment analysis data.

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Thesis (PhD)
Authors: Najar, Fatma
Institution: Concordia University
Degree Name: Ph. D.
Program: Information and Systems Engineering
Date: 19 August 2022
Thesis Supervisor(s): Bouguila, Nizar
ID Code: 991235
Deposited By: Fatma Najar
Deposited On: 21 Jun 2023 14:24
Last Modified: 21 Jun 2023 14:24
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
