Daghyani, Masoud (2019) Efficient Computation of Log-likelihood Function in Clustering Overdispersed Count Data. Masters thesis, Concordia University.
Text (application/pdf): Daghyani_MASc_F2019.pdf (4MB) - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
In this work, we present a clustering algorithm for overdispersed count data that uses the mesh method to compute the log-likelihood functions of the multinomial Dirichlet, multinomial generalized Dirichlet, and multinomial Beta-Liouville distributions. Count data arise in many areas, such as information retrieval, data mining, and computer vision. The multinomial Dirichlet distribution (MDD) is one of the most widely used models for multi-categorical count data with overdispersion. Recent works have proposed the mesh algorithm, which approximates the MDD log-likelihood function using Bernoulli polynomials, as a replacement for the traditional numerical computation of the log-likelihood, which either is unstable or has run times long enough to make it infeasible for large-scale data. We therefore extend the mesh algorithm to compute the log-likelihood functions of two more flexible distributions, namely the multinomial generalized Dirichlet (MGD) and the multinomial Beta-Liouville (MBL). A finite mixture model based on these distributions is optimized by expectation maximization, with the aim of achieving high accuracy in count data clustering. Through a set of experiments, the proposed approach shows its merits in two real-world clustering problems: natural scene categorization and facial expression recognition.
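For orientation, the quantity at the centre of the abstract can be sketched as follows. This is a minimal illustration of the direct, log-gamma-based evaluation of the MDD log-likelihood for a single count vector, that is, the kind of traditional computation the abstract contrasts with the mesh approach. It is not the mesh approximation and not code from the thesis. The function name `mdd_log_likelihood` and the parameter name `alpha` are assumptions made for this example.

```python
import math


def mdd_log_likelihood(counts, alpha):
    """Log-probability of one count vector under a multinomial Dirichlet
    (Dirichlet-multinomial) distribution, evaluated directly with lgamma.

    counts: non-negative integer counts, one per category.
    alpha:  positive Dirichlet parameters, one per category.
    Hypothetical helper for illustration; not the thesis implementation.
    """
    n = sum(counts)      # total number of observed events
    a = sum(alpha)       # sum of the Dirichlet parameters

    # log of the multinomial coefficient n! / (x_1! ... x_K!)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)

    # log of Gamma(a) / Gamma(n + a)
    log_norm = math.lgamma(a) - math.lgamma(n + a)

    # sum over categories of log Gamma(x_k + alpha_k) - log Gamma(alpha_k)
    log_terms = sum(
        math.lgamma(x + al) - math.lgamma(al) for x, al in zip(counts, alpha)
    )
    return log_coef + log_norm + log_terms


# Example call: two categories with a uniform Dirichlet prior.
value = mdd_log_likelihood([1, 2], [1.0, 1.0])
```

With two categories and `alpha = [1.0, 1.0]`, the distribution reduces to a uniform distribution over the possible splits of the total count, which gives a simple sanity check for a sketch like this one.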
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Daghyani, Masoud |
Institution: | Concordia University |
Degree Name: | M.A.Sc. |
Program: | Electrical and Computer Engineering |
Date: | 14 August 2019 |
Thesis Supervisor(s): | Bouguila, Nizar |
ID Code: | 985696 |
Deposited By: | Masoud Daghyani |
Deposited On: | 05 Feb 2020 14:18 |
Last Modified: | 05 Feb 2020 14:18 |