Pose Estimation and Object Detection using Deep Convolutional Networks


Quan, Jianning (2021) Pose Estimation and Object Detection using Deep Convolutional Networks. Masters thesis, Concordia University.

Quan_MASc_F2021.pdf - Accepted Version


Human pose estimation and object detection are fundamental problems in computer vision and autonomous systems, with applications ranging from healthcare and sports to surveillance, autonomous driving, and traffic monitoring. The task of 3D human pose estimation is to predict the 3D positions of a person’s joints, while the goal of object detection is to identify the category of every known object in an image or video and to locate it with a bounding box. The contributions of this thesis are two-fold. The first is to tackle the 3D human pose estimation problem in a graph-theoretic setting. More specifically, we introduce a higher-order graph convolutional framework with initial residual connections for 2D-to-3D pose estimation. The proposed approach is derived from implicit fairing on graphs using a scale-dependent graph Laplacian filtering scheme. By using multi-hop neighborhoods for node feature aggregation, our model is able to capture long-range dependencies between body joints. Moreover, our approach alleviates the oversmoothing problem caused by repeated graph convolutions: residual connections with the first layer of the network help prevent the learned feature representations from converging to similar values. These residual connections are integrated by design in our network architecture and help ensure that the learned feature representations retain important information from the initial features of the input layer as the network depth increases. Experiments and ablation studies conducted on a standard benchmark demonstrate the effectiveness of our model, which outperforms strong baseline methods for 3D human pose estimation.
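The oversmoothing effect and the role of an initial residual connection described above can be illustrated with a small, weight-free sketch. This is not the layer defined in the thesis, which has learnable parameters and a filter derived from implicit fairing. The 5-joint chain graph, the equal averaging over hops, and the names `propagate`, `hops`, and `alpha` are assumptions made for illustration only.

```python
def row_normalize(adj):
    """Add self-loops and row-normalize so that each row sums to 1."""
    n = len(adj)
    looped = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
              for i in range(n)]
    return [[v / sum(row) for v in row] for row in looped]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def propagate(p, x, layers, hops=2, alpha=0.0):
    """Stack `layers` aggregation steps over 1..hops-hop neighborhoods.

    After each layer, the input features x are mixed back in with
    weight alpha (the initial residual). alpha=0 disables the residual.
    """
    h = list(x)
    for _ in range(layers):
        hop_feat = h
        agg = [0.0] * len(h)
        for _ in range(hops):
            hop_feat = matvec(p, hop_feat)  # one more hop of neighbors
            agg = [a + f / hops for a, f in zip(agg, hop_feat)]
        h = [(1.0 - alpha) * a + alpha * x0 for a, x0 in zip(agg, x)]
    return h


def spread(values):
    """Gap between the largest and smallest node feature."""
    return max(values) - min(values)


# Hypothetical skeleton fragment: five joints connected in a chain.
chain = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
p = row_normalize(chain)
x = [0.0, 1.0, 2.0, 3.0, 4.0]  # one scalar feature per joint

plain = propagate(p, x, layers=50)                 # no residual
residual = propagate(p, x, layers=50, alpha=0.2)   # initial residual

print("spread without residual:", spread(plain))
print("spread with residual:   ", spread(residual))
```

Without the residual term, the per-joint features collapse to nearly the same value after many layers. With it, the features stay distinguishable at depth, which is the behavior the abstract attributes to the initial residual connections.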
The second contribution is a single-stage object detection model for aerial imagery that uses a class-balanced focal loss function in conjunction with a feature pyramid network to mitigate the data imbalance problem without relying on data augmentation. The key benefit of the class-balanced focal loss is the ability to adjust the contributions of minority classes to the loss function, allowing our model to detect different classes more evenly. The performance of the proposed object detection model is demonstrated through extensive experiments on a standard aerial image benchmark, where it achieves comparable or better detection results than competing baselines.
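As a rough sketch of how a class-balanced focal loss can raise the contribution of minority classes, the snippet below combines the commonly used "effective number of samples" weighting with the standard focal term. The abstract does not state the exact formulation or hyperparameters used in the thesis, so `beta`, `gamma`, the weight normalization, and the sample counts are assumptions made for illustration.

```python
import math


def class_balanced_weights(samples_per_class, beta=0.999):
    """Weight each class by the inverse of its effective number of samples.

    Effective number for a class with n samples: (1 - beta**n) / (1 - beta).
    Weights are rescaled to sum to the number of classes.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in samples_per_class]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def focal_loss(prob_true_class, gamma=2.0):
    """Focal term: down-weights examples that are already well classified."""
    return -((1.0 - prob_true_class) ** gamma) * math.log(prob_true_class)


def class_balanced_focal_loss(prob_true_class, label, weights, gamma=2.0):
    """Focal loss for one example, scaled by the weight of its true class."""
    return weights[label] * focal_loss(prob_true_class, gamma)


# Hypothetical counts: class 0 is frequent, class 1 is rare.
counts = [10000, 100]
weights = class_balanced_weights(counts)

# Same predicted probability for the true class, different class frequency.
common = class_balanced_focal_loss(0.6, label=0, weights=weights)
rare = class_balanced_focal_loss(0.6, label=1, weights=weights)

print("class weights:", weights)
print("loss, frequent class:", common)
print("loss, rare class:    ", rare)
```

For the same predicted probability, an example from the rare class incurs a larger loss than one from the frequent class, so gradients from minority classes are not drowned out by the majority.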

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering
Item Type: Thesis (Masters)
Authors: Quan, Jianning
Institution: Concordia University
Degree Name: M.A. Sc.
Program: Quality Systems Engineering
Date: 30 July 2021
Thesis Supervisor(s): Ben Hamza, Abdessamad
ID Code: 988613
Deposited By: Jianning Quan
Deposited On: 29 Nov 2021 17:03
Last Modified: 29 Nov 2021 17:03
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
