Structured Probabilistic Latent Modeling for Representation Learning and 3D Perception

Guo, Jiaxun (2026) Structured Probabilistic Latent Modeling for Representation Learning and 3D Perception. PhD thesis, Concordia University.

Text (application/pdf)
Guo_PhD_S2026.pdf - Accepted Version
Available under License Spectrum Terms of Access.
19MB

Abstract

Traditional representation learning and 3D perception methods commonly rely on Gaussian priors in latent spaces, which are inadequate for modeling complex structural asymmetries, directional uncertainty, and rotational symmetries. This thesis introduces structured probabilistic latent models with non-Gaussian distributions to address these challenges, enabling uncertainty-aware representation learning and robust 3D geometric perception. The research first investigates foundational probabilistic modeling for latent representations. We propose GamMM-VAE, a deep generative clustering framework that employs an asymmetric Gamma mixture prior together with a novel reparameterization to learn high-quality latent embeddings for non-negative vectors. To further model directional and axially symmetric data, we develop VIBinMM, a scalable variational inference framework for Bingham mixture models. By exploiting the mathematical relationship between Bingham and Gaussian distributions, the proposed approach bypasses intractable normalization constants and enables efficient inference in high-dimensional directional spaces. Building upon these foundations, the thesis extends structured latent priors to rotation-invariant 3D perception. We introduce the Shadow-informed Pose Feature (SiPF) and the Ga4DPF framework, which leverage Bingham distributions over unit quaternions to dynamically model global pose references. These mechanisms preserve strict rotation invariance while resolving local geometric ambiguities and mitigating feature collapse in symmetric structures. In addition, we present BRVSNet, a self-supervised framework for probabilistic rotation estimation and consistent multi-view generation. Extensive experiments on multiple benchmarks demonstrate that incorporating structured, non-Gaussian priors, particularly Gamma and Bingham distributions, leads to more expressive latent representations and robust, uncertainty-aware perception. The proposed models consistently outperform state-of-the-art methods across tasks ranging from unsupervised learning to fine-grained 3D spatial discrimination.
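
For readers unfamiliar with the Bingham distribution mentioned in the abstract, its standard textbook form and its link to a zero-mean Gaussian can be sketched as follows. The symbols A, c(A), and Σ are notation chosen here for illustration and are not taken from the thesis; the abstract does not say which form of this relationship VIBinMM actually uses.

```latex
% Bingham density on the unit sphere; A is a symmetric d x d parameter matrix
% and c(A) is the normalization constant, which has no simple closed form.
p(x \mid A) = \frac{1}{c(A)} \exp\!\left( x^{\top} A x \right),
\qquad x \in S^{d-1}, \qquad p(x \mid A) = p(-x \mid A).

% Since x^T x = 1 on the sphere, A can be shifted by a multiple of the identity
% without changing the density, so A may be assumed negative definite.
% A zero-mean Gaussian conditioned to unit norm then has the same form:
y \sim \mathcal{N}(0, \Sigma), \quad \Sigma^{-1} = -2A
\;\Longrightarrow\;
p\!\left( y \,\middle|\, \lVert y \rVert = 1 \right)
\propto \exp\!\left( -\tfrac{1}{2}\, y^{\top} \Sigma^{-1} y \right)
= \exp\!\left( y^{\top} A y \right).
```

The antipodal symmetry p(x) = p(-x) is what makes this family a natural fit for unit quaternions, where q and -q represent the same 3D rotation.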

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Thesis (PhD)
Authors: Guo, Jiaxun
Institution: Concordia University
Degree Name: Ph. D.
Program: Information and Systems Engineering
Date: 10 February 2026
Thesis Supervisor(s): Bouguila, Nizar and Fan, Wentao and Amayri, Manar
ID Code: 997071
Deposited By: Jiaxun Guo
Deposited On: 29 Jun 2026 17:53
Last Modified: 29 Jun 2026 17:53
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
