Fully Bayesian Inference for Finite and Infinite Discrete Exponential Mixture Models

Su, Xuanbo (2021) Fully Bayesian Inference for Finite and Infinite Discrete Exponential Mixture Models. Masters thesis, Concordia University.

Text (application/pdf)
Xuanbo_MASc_F2021.pdf - Accepted Version
704kB

Abstract

Count data appear frequently in natural language processing and computer vision applications. For example, in image and text document clustering, each image or document can be described by a histogram of visual words or text words. In real applications, these frequency vectors are often high-dimensional and sparse. In this setting, hierarchical Bayesian modeling frameworks can capture the dependence among repeated occurrences of a word, a phenomenon known as "burstiness". Moreover, approximating these models as members of the exponential family helps improve computational efficiency, especially for high-dimensional count data and large data sets. However, classical deterministic approaches such as expectation-maximization (EM) do not achieve good results in complex real-life applications. This thesis explores fully Bayesian inference for finite discrete exponential mixture models of the Multinomial Generalized Dirichlet (EMGD), Multinomial Beta-Liouville (EMBL), Multinomial Scaled Dirichlet (EMSD), and Multinomial Shifted Scaled Dirichlet (EMSSD). Finite mixtures fitted with the EM approach have already shown superior performance in clustering real data sets. The approaches proposed in this thesis are based on Monte Carlo simulation, specifically Gibbs sampling combined with a Metropolis-Hastings step, and use exponential-family conjugate priors to construct the required posteriors. Furthermore, we present infinite models based on Dirichlet processes, which yield clustering algorithms that do not require the number of mixture components to be specified in advance. The Bayesian approaches were tested on challenging real-world applications: text sentiment analysis, fake news detection, and human face gender recognition.
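
The abstract's point that Dirichlet process models do not need the number of components fixed in advance can be illustrated with the Chinese restaurant process view of the Dirichlet process prior. The following is a minimal sketch under stated assumptions, not the thesis's sampler: it draws cluster labels from the prior only, with no likelihood or Metropolis-Hastings step, and the function name `crp_assignments` and the concentration parameter `alpha` are hypothetical choices made for illustration.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Draw cluster labels for n_items from a Chinese restaurant process prior.

    alpha (> 0) is an assumed concentration parameter: larger values
    make opening a new cluster more likely.
    """
    rng = random.Random(seed)
    labels = []  # labels[i] = cluster index assigned to item i
    counts = []  # counts[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


labels, counts = crp_assignments(n_items=200, alpha=1.0)
print("number of clusters:", len(counts))
print("cluster sizes:", counts)
```

The number of clusters is an outcome of the draw, not an input. In a full sampler of the kind the abstract describes, these prior weights would be combined with each component's likelihood for the observed count vector.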

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Thesis (Masters)
Authors: Su, Xuanbo
Institution: Concordia University
Degree Name: M.A.Sc.
Program: Information and Systems Engineering
Date: 5 November 2021
Thesis Supervisor(s): Bouguila, Nizar and Zamzami, Nuha
ID Code: 990003
Deposited By: Xuanbo Su
Deposited On: 16 Jun 2022 15:15
Last Modified: 16 Jun 2022 15:15
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
