Shahjahan, Abu Taib Mohammed (2025) Learning Flexible Graph Representations for 3D Human Pose Estimation. Masters thesis, Concordia University.
Text (application/pdf)
Shahjahan_MA_F2025.pdf (3MB) - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Accurate 3D human pose estimation remains a significant challenge in computer vision, especially under occlusions, complex joint articulation, and depth ambiguities. Graph Convolutional Network (GCN)-based methods have proven effective by modeling the human skeleton as a graph of joints and bones. However, standard GCNs are limited by one-hop neighbor aggregation and by spectral bias, which emphasizes low-frequency features while overlooking fine-grained motion. This thesis addresses these issues by introducing the flexible graph convolutional network (Flex-GCN), a novel architecture that enhances spatial awareness through multi-hop aggregation controlled by a scaling parameter. Flex-GCN integrates residual graph convolutional blocks and a global response normalization layer to improve feature selectivity and contextual understanding. Moreover, adjacency modulation enables dynamic graph restructuring, allowing better representation of distant joint relationships.

Building upon these findings, the second part of this thesis introduces the Flexible Graph Kolmogorov-Arnold Network (FG-KAN), a more expressive framework that integrates the Kolmogorov-Arnold Network (KAN) with graph-based learning. FG-KAN replaces the fixed activation functions in standard GCNs with learnable univariate functions applied directly to graph edges, enhancing both interpretability and adaptability. This design not only mitigates spectral bias but also enables the model to capture the fine-grained joint dynamics crucial for accurately estimating complex and fast body movements. FG-KAN incorporates residual connections, scalable multi-hop feature aggregation, and symmetric adjacency modulation, ensuring both computational efficiency and improved generalization. Comprehensive experimental evaluations on benchmark datasets such as Human3.6M and MPI-INF-3DHP demonstrate that both Flex-GCN and FG-KAN outperform competing baseline methods in terms of Mean Per Joint Position Error, Procrustes Aligned Mean Per Joint Position Error, and Percentage of Correct Keypoints. Notably, while both models demonstrate strong robustness, interpretability is a distinct advantage of FG-KAN, as evidenced by qualitative visualizations and ablation studies.
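For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal NumPy sketch of their standard definitions for a single pose. It is not taken from the thesis: the function names, the `(J, 3)` array layout, millimetre units, and the 150 mm PCK threshold (a value commonly used with MPI-INF-3DHP) are all assumptions made for illustration, and the thesis's exact evaluation protocol may differ.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean Per Joint Position Error: mean Euclidean distance over joints.

    pred, gt: arrays of shape (J, 3), assumed to be in the same units.
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Align pred to gt with the best similarity transform
    (scale, rotation, translation) in the least-squares sense."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # SVD of the cross-covariance matrix yields the optimal rotation.
    u, s, vt = np.linalg.svd(p.T @ g)
    d = np.ones(3)
    if np.linalg.det(u @ vt) < 0:
        d[-1] = -1.0  # flip one axis so the result is a rotation, not a reflection
    rot = (u * d) @ vt                      # equals u @ diag(d) @ vt
    scale = (s * d).sum() / (p ** 2).sum()  # optimal isotropic scale
    return scale * p @ rot + mu_g


def pa_mpjpe(pred, gt):
    """Procrustes Aligned MPJPE: MPJPE after rigid alignment with scale."""
    return mpjpe(procrustes_align(pred, gt), gt)


def pck(pred, gt, threshold=150.0):
    """Percentage of Correct Keypoints: share of joints whose error is
    within `threshold` (assumed 150 mm here; an illustrative default)."""
    errors = np.linalg.norm(pred - gt, axis=-1)
    return float((errors <= threshold).mean() * 100.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt_pose = rng.normal(size=(17, 3)) * 100.0           # hypothetical 17-joint pose
    noisy = gt_pose + rng.normal(size=(17, 3)) * 20.0    # hypothetical prediction
    print(mpjpe(noisy, gt_pose), pa_mpjpe(noisy, gt_pose), pck(noisy, gt_pose))
```

Lower values are better for the two error metrics and higher is better for PCK. Because PA-MPJPE removes global rotation, translation, and scale before measuring error, it isolates the accuracy of the predicted articulation from errors in global placement.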
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering |
|---|---|
| Item Type: | Thesis (Masters) |
| Authors: | Shahjahan, Abu Taib Mohammed |
| Institution: | Concordia University |
| Degree Name: | M.A. Sc. |
| Program: | Quality Systems Engineering |
| Date: | 11 July 2025 |
| Thesis Supervisor(s): | Ben Hamza, Abdessamad |
| ID Code: | 995731 |
| Deposited By: | Abu Taib Mohammed Shahjahan |
| Deposited On: | 04 Nov 2025 17:41 |
| Last Modified: | 04 Nov 2025 17:41 |