Guo, Jiaxun (2026) Structured Probabilistic Latent Modeling for Representation Learning and 3D Perception. PhD thesis, Concordia University.
Text (application/pdf): Guo_PhD_S2026.pdf (19MB) - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Traditional representation learning and 3D perception methods commonly rely on Gaussian priors in latent spaces, which are inadequate for modeling complex structural asymmetries, directional uncertainty, and rotational symmetries. This thesis introduces structured probabilistic latent models with non-Gaussian distributions to address these challenges, enabling uncertainty-aware representation learning and robust 3D geometric perception.

The research first investigates foundational probabilistic modeling for latent representations. We propose GamMM-VAE, a deep generative clustering framework that employs an asymmetric Gamma mixture prior together with a novel reparameterization to learn high-quality latent embeddings for non-negative vectors. To further model directional and axially symmetric data, we develop VIBinMM, a scalable variational inference framework for Bingham mixture models. By exploiting the mathematical relationship between Bingham and Gaussian distributions, the proposed approach bypasses intractable normalization constants and enables efficient inference in high-dimensional directional spaces.

Building upon these foundations, the thesis extends structured latent priors to rotation-invariant 3D perception. We introduce the Shadow-informed Pose Feature (SiPF) and the Ga4DPF framework, which leverage Bingham distributions over unit quaternions to dynamically model global pose references. These mechanisms preserve strict rotation invariance while resolving local geometric ambiguities and mitigating feature collapse in symmetric structures. In addition, we present BRVSNet, a self-supervised framework for probabilistic rotation estimation and consistent multi-view generation.

Extensive experiments on multiple benchmarks demonstrate that incorporating structured, non-Gaussian priors, particularly Gamma and Bingham distributions, leads to more expressive latent representations and robust, uncertainty-aware perception. The proposed models consistently outperform state-of-the-art methods across tasks ranging from unsupervised learning to fine-grained 3D spatial discrimination.
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering |
|---|---|
| Item Type: | Thesis (PhD) |
| Authors: | Guo, Jiaxun |
| Institution: | Concordia University |
| Degree Name: | Ph. D. |
| Program: | Information and Systems Engineering |
| Date: | 10 February 2026 |
| Thesis Supervisor(s): | Bouguila, Nizar and Fan, Wentao and Amayri, Manar |
| ID Code: | 997071 |
| Deposited By: | Jiaxun Guo |
| Deposited On: | 29 Jun 2026 17:53 |
| Last Modified: | 29 Jun 2026 17:53 |