Graph Neural Networks For 3D Human Pose Estimation


Hassan, Md. Tanvir (2023) Graph Neural Networks For 3D Human Pose Estimation. Masters thesis, Concordia University.

Text (application/pdf)
Hassan_MASc_S2023.pdf - Accepted Version
Available under License Spectrum Terms of Access.


In human pose estimation methods based on graph convolutional architectures, the human skeleton is usually modeled as a graph whose nodes are body joints and whose edges are connections between neighboring joints. However, most of these methods learn relationships between body joints using only first-order neighbors, ignoring higher-order neighbors and hence limiting their ability to exploit relationships between distant joints. In this thesis, we introduce a higher-order regular splitting graph network (RS-Net) for 2D-to-3D human pose estimation using matrix splitting in conjunction with weight and adjacency modulation. The core idea is to capture long-range dependencies between body joints using multi-hop neighborhoods, and also to learn different modulation vectors for different body joints as well as a modulation matrix added to the adjacency matrix associated with the skeleton. This learnable modulation matrix adjusts the graph structure by adding extra graph edges in an effort to learn additional connections between body joints. Instead of using a shared weight matrix for all neighboring body joints, the proposed RS-Net model applies weight unsharing before aggregating the feature vectors associated with the joints in order to capture the different relations between them. Experiments and ablation studies performed on two benchmark datasets demonstrate the effectiveness of our model, achieving superior performance over strong baselines for 3D human pose estimation.
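
As a rough illustration of the ideas named in the paragraph above (multi-hop neighborhoods, per-joint modulation vectors, and a modulation matrix added to the adjacency matrix), the following NumPy sketch may help. It is not the RS-Net formulation from the thesis: the function names `k_hop_adjacency` and `modulated_layer`, the ReLU activation, the lack of adjacency normalization, and the exact way the pieces are combined are all assumptions made for illustration. Matrix splitting and weight unsharing are not shown.

```python
import numpy as np

def k_hop_adjacency(A, k):
    """Mark joint pairs connected by a path of at most k edges (self included).

    A is a symmetric 0/1 adjacency matrix of the skeleton, shape (n, n).
    """
    n = A.shape[0]
    reach = np.eye(n)
    step = np.eye(n)
    for _ in range(k):
        step = step @ A          # walks that are one edge longer
        reach = reach + step     # accumulate walks of length 0..k
    return (reach > 0).astype(float)

def modulated_layer(X, A_k, W, M, Q):
    """One illustrative layer (hypothetical, not the thesis's exact update).

    X: (n, d_in) joint features, W: (d_in, d_out) shared weights,
    M: (n, d_out) one learnable modulation vector per joint,
    Q: (n, n) learnable matrix added to the adjacency matrix.
    """
    modulated = (X @ W) * M              # per-joint rescaling of shared weights
    aggregated = (A_k + Q) @ modulated   # multi-hop aggregation with extra edges
    return np.maximum(aggregated, 0.0)   # ReLU (an assumption)
```

In this sketch, nonzero entries of `Q` play the role of extra learned edges between joints that are not adjacent in the skeleton, and setting `k` greater than 1 lets a joint aggregate features from joints several bones away.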

The other contribution of this thesis is a spatio-temporal 3D human pose estimation model built from multilayer perceptrons and graph neural networks. Despite the success of graph convolutional networks and their variants in 3D human pose estimation tasks, most of these methods only consider spatial correlations between body joints and do not take into account temporal correlations, thereby limiting their ability to capture relationships in the presence of occlusions and inherent ambiguity. To address this issue, we propose a spatio-temporal network architecture composed of a joints-mixing multilayer perceptron block that facilitates communication among different joints and a graph weighted Jacobi network block that enables communication among various feature channels. Extensive experiments on two benchmark datasets demonstrate the competitive performance of our model, outperforming recent state-of-the-art methods for 3D human pose estimation. In addition, we perform a runtime analysis and conduct a comprehensive ablation study to show the effect of the key components of our model.
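
To make the joints-mixing idea in the paragraph above more concrete, here is a minimal NumPy sketch of an MLP applied along the joint axis, in the spirit of token-mixing MLPs. It is an assumption-laden illustration rather than the block used in the thesis: the name `joints_mixing_mlp`, the hidden width, the ReLU activation, and the residual connection are all hypothetical choices. The graph weighted Jacobi block is not sketched, since the abstract does not give its formulation.

```python
import numpy as np

def joints_mixing_mlp(X, W1, W2):
    """Mix information across joints with a two-layer MLP (illustrative only).

    X:  (n_joints, channels) per-joint features
    W1: (hidden, n_joints)   first layer, acts on the joint axis
    W2: (n_joints, hidden)   second layer, maps back to n_joints
    The same weights are applied to every feature channel independently.
    """
    H = np.maximum(W1 @ X, 0.0)  # ReLU activation (an assumption)
    return X + W2 @ H            # residual connection (an assumption)
```

Because the weights multiply `X` from the left, each output joint is a function of all input joints within a channel, while different channels never interact in this block; mixing across channels would be left to a separate block.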

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Thesis (Masters)
Authors: Hassan, Md. Tanvir
Institution: Concordia University
Degree Name: M.A. Sc.
Program: Quality Systems Engineering
Date: 5 April 2023
Thesis Supervisor(s): Ben Hamza, Abdessamad
ID Code: 992048
Deposited By: Md. Tanvir Hassan
Deposited On: 21 Jun 2023 14:34
Last Modified: 21 Jun 2023 14:34
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
