Learning Flexible Graph Representations for 3D Human Pose Estimation

Shahjahan, Abu Taib Mohammed (2025) Learning Flexible Graph Representations for 3D Human Pose Estimation. Masters thesis, Concordia University.

Text (application/pdf)
Shahjahan_MA_F2025.pdf - Accepted Version
Available under License Spectrum Terms of Access.

3MB

Abstract

Accurate 3D human pose estimation remains a significant challenge in computer vision, especially under occlusions, complex joint articulation, and depth ambiguities. Graph Convolutional Network (GCN)-based methods have proven effective by modeling the human skeleton as a graph of joints and bones. However, standard GCNs are limited by one-hop neighbor aggregation and by spectral bias, which emphasizes low-frequency features while overlooking fine-grained motion. This thesis addresses these issues by introducing the flexible graph convolutional network (Flex-GCN), a novel architecture that enhances spatial awareness through multi-hop aggregation controlled by a scaling parameter. Flex-GCN integrates residual graph convolutional blocks and a global response normalization layer to improve feature selectivity and contextual understanding. Moreover, adjacency modulation enables dynamic graph restructuring, allowing better representation of distant joint relationships. Building upon these findings, the second part of this thesis introduces the Flexible Graph Kolmogorov-Arnold Network (FG-KAN), a more expressive framework that integrates the Kolmogorov-Arnold Network (KAN) with graph-based learning. FG-KAN replaces the fixed activation functions in standard GCNs with learnable univariate functions applied directly to graph edges, enhancing both interpretability and adaptability. This design not only mitigates spectral bias but also enables the model to capture fine-grained joint dynamics crucial for accurately estimating complex and fast body movements. FG-KAN incorporates residual connections, scalable multi-hop feature aggregation, and symmetric adjacency modulation, ensuring both computational efficiency and improved generalization. Comprehensive experimental evaluations on benchmark datasets such as Human3.6M and MPI-INF-3DHP demonstrate that both Flex-GCN and FG-KAN outperform competing baseline methods in terms of Mean Per Joint Position Error (MPJPE), Procrustes-Aligned Mean Per Joint Position Error (PA-MPJPE), and Percentage of Correct Keypoints (PCK). Notably, while both models demonstrate strong robustness, interpretability is a distinct advantage of FG-KAN, as evidenced by qualitative visualizations and ablation studies.
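
For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal pure-Python sketch of two of them, Mean Per Joint Position Error and Percentage of Correct Keypoints. It is illustrative only and is not taken from the thesis: the function names `mpjpe` and `pck` are hypothetical, joints are assumed to be (x, y, z) tuples in a common unit such as millimetres, and the PCK threshold is left as a parameter because the abstract does not state one. The Procrustes-aligned variant is omitted here since it additionally requires rigidly aligning the prediction to the ground truth before measuring the error.

```python
import math


def mpjpe(pred, gt):
    """Mean Per Joint Position Error for one pose.

    pred, gt: equal-length sequences of (x, y, z) joint coordinates.
    Returns the average Euclidean distance between corresponding joints,
    in the same unit as the inputs.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def pck(pred, gt, threshold):
    """Percentage of Correct Keypoints for one pose.

    A joint counts as correct when its Euclidean error is at most
    `threshold` (an assumed, caller-supplied value in the input unit).
    Returns a percentage in [0, 100].
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")
    correct = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return 100.0 * correct / len(pred)
```

Lower values of the position error indicate better predictions, while higher values of the keypoint percentage do. In practice both are averaged over all poses in a test set rather than computed for a single pose as in this sketch.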

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type:	Thesis (Masters)
Authors:	Shahjahan, Abu Taib Mohammed
Institution:	Concordia University
Degree Name:	M.A. Sc.
Program:	Quality Systems Engineering
Date:	11 July 2025
Thesis Supervisor(s):	Ben Hamza, Abdessamad
ID Code:	995731
Deposited By:	Abu Taib Mohammed Shahjahan
Deposited On:	04 Nov 2025 17:41
Last Modified:	04 Nov 2025 17:41
