Tahsin, Faiza (2025) Novel Probabilistic Frameworks for Author-Level Topic Modeling. Masters thesis, Concordia University.
Text (application/pdf), 698kB: Tahsin_MASc_S2025.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
The increasing complexity of textual data in modern applications, such as social media and academic literature analysis, calls for improved topic modeling techniques that capture sparsity, variability, and nuanced author-topic relationships. Because of their rigid assumptions and limited flexibility in representing diverse data, traditional models generally fail to meet these needs. We present two novel probabilistic models, Author Dirichlet Multinomial Allocation with Generalized Distribution (ADMAGD) and Author Beta-Liouville Multinomial Allocation (ABLiMA), to overcome these drawbacks and advance the state of author-specific topic modeling. To represent complex author-topic relationships, ADMAGD incorporates the Generalized Dirichlet distribution. For datasets with uneven or absent topic representations, ABLiMA uses the Beta-Liouville distribution to account for variability and sparsity in topic distributions. By evaluating these models on common benchmark datasets such as NIPS and 20 Newsgroups, the research presented here demonstrates how well they manage sparsity, capture complex topic preferences, and generate coherent topics. The results show that the models can be applied in many settings. Coherence measures and author-topic relationship visualizations further validate their interpretability and usefulness.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Tahsin, Faiza |
Institution: | Concordia University |
Degree Name: | M.A. Sc. |
Program: | Information Systems Security |
Date: | 20 January 2025 |
Thesis Supervisor(s): | Bouguila, Nizar |
ID Code: | 995066 |
Deposited By: | Faiza Tahsin |
Deposited On: | 17 Jun 2025 17:28 |
Last Modified: | 17 Jun 2025 17:28 |