Novel Deep Learning Models for Radar-Based Human Activity Recognition

Pan, Keyu (2025) Novel Deep Learning Models for Radar-Based Human Activity Recognition. PhD thesis, Concordia University.

Text (application/pdf)
Pan_PhD_F2025.pdf - Accepted Version
Restricted to Repository staff only until 31 August 2027.
Available under License Spectrum Terms of Access.
74MB

Abstract

This dissertation presents a comprehensive deep learning pipeline addressing key challenges in radar-based human activity recognition (HAR). Radar-based radio-frequency sensing technology offers distinct advantages such as privacy preservation and robustness to illumination changes. However, its practical deployment is hindered by a number of interrelated barriers including constrained computational resources, degradation of feature resolution due to traditional convolution and pooling, class imbalance affecting rare yet critical actions, catastrophic forgetting in incremental learning, and strict frame-level segmentation requirements under latency constraints. To address these challenges, the proposed work progresses coherently across five dimensions as described below.

To manage stringent on-device computational and memory constraints, the first contribution is a compact spatial-channel attention convolutional neural network (CNN) trained on a newly collected ultra-wideband (UWB) “similar-motion” dataset. Despite its lightweight design with only 90K parameters and 2.7 million FLOPs, the network achieves 94% accuracy, outperforming classical baseline models by 4% to 20%. This establishes a foundational framework that balances computational efficiency with discriminative power.

To mitigate the loss of feature resolution from traditional convolution and pooling operations, a dual-branch StarNet/HWDLD-ConvFormer architecture is introduced. StarNet employs a multiscale attention mechanism to improve UWB time-domain envelope recognition accuracy to 95.8%, while the HWDLD-ConvFormer, leveraging Haar-wavelet downsampling, linear deformable convolutions, and sparse self-attention, pushes performance to 99.7% and reduces computational cost by 20%. Applying this architecture to normalized FMCW micro-Doppler spectrograms yields similarly impressive results, with 99.1% accuracy and 99.6% recall, without increasing computational overhead.
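The Haar-wavelet downsampling idea mentioned above can be illustrated with a minimal sketch. This is not the thesis's HWDLD-ConvFormer code: the function names `haar_downsample_2d` and `energy` and the toy input are hypothetical, and the sketch assumes a single-level orthonormal 2D Haar transform on an even-sized grid. It shows why this kind of downsampling halves spatial resolution without discarding information: the detail that pooling would throw away is kept in three extra sub-bands.

```python
def haar_downsample_2d(x):
    """Single-level orthonormal 2D Haar transform of a list-of-lists.

    Returns four half-resolution sub-bands: the local average (approx)
    and three detail bands (differences across columns, across rows,
    and along the diagonal). Hypothetical helper, for illustration only.
    """
    rows, cols = len(x), len(x[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")
    approx, col_detail, row_detail, diag_detail = [], [], [], []
    for i in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for j in range(0, cols, 2):
            # One 2x2 block: a b / c d
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            a_row.append((a + b + c + d) / 2)  # low-pass in both directions
            c_row.append((a - b + c - d) / 2)  # difference across columns
            r_row.append((a + b - c - d) / 2)  # difference across rows
            d_row.append((a - b - c + d) / 2)  # diagonal difference
        approx.append(a_row)
        col_detail.append(c_row)
        row_detail.append(r_row)
        diag_detail.append(d_row)
    return approx, col_detail, row_detail, diag_detail


def energy(*bands):
    """Sum of squared values over one or more 2D bands."""
    return sum(v * v for band in bands for row in band for v in row)


if __name__ == "__main__":
    patch = [[1, 2], [3, 4]]  # toy input, not radar data
    bands = haar_downsample_2d(patch)
    # The transform is orthonormal, so total energy is preserved.
    print(energy(patch), energy(*bands))
```

In a network, the four sub-bands would typically be stacked along the channel axis so that later layers can still access the fine detail; that stacking step is an assumption about common practice, not a statement about the thesis architecture.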

With high-resolution features secured, a novel three-stage few-shot learning (FSL) framework is proposed to address class imbalance. This approach integrates meta-learned prototypes, decomposition-guided encoders, and entropy-aware self-distillation to significantly enhance performance on underrepresented but safety-critical actions. On the UoG benchmark dataset, the model improves five-shot classification accuracy from 70.5% to 82.5%, and up to 92.2% with 20-shot settings. It also consistently delivers 10% to 16% gains across four additional datasets, effectively reducing annotation burdens for rare activity classes.
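To make the prototype idea concrete, here is a minimal sketch of nearest-prototype classification, the basic mechanism behind many few-shot methods. It assumes each class prototype is the mean of that class's support embeddings and that queries are assigned by Euclidean distance; the thesis's meta-learned prototypes, decomposition-guided encoders, and self-distillation are not reproduced here. The function names, the labels `walk` and `fall`, and the 2D embeddings are all hypothetical.

```python
import math


def class_prototypes(support):
    """Average the support embeddings of each class.

    support: dict mapping a class label to a list of embedding vectors
    (lists of floats of equal length). Hypothetical helper.
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(v[k] for v in vectors) / len(vectors) for k in range(dim)
        ]
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


if __name__ == "__main__":
    # Two-shot toy support set with made-up 2D embeddings.
    support = {
        "walk": [[0.0, 0.0], [0.2, 0.0]],
        "fall": [[1.0, 1.0], [1.0, 0.8]],
    }
    prototypes = class_prototypes(support)
    print(classify([0.9, 1.0], prototypes))
    print(classify([0.0, 0.1], prototypes))
```

Because a prototype needs only a handful of labeled examples per class, a rare action can be added with a small support set, which is the annotation saving the abstract refers to.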

Building on these FSL advancements, the dissertation also tackles catastrophic forgetting in incremental learning through an adaptive prompt-driven few-shot class-incremental learning (FSCIL) framework. By integrating both task-invariant and task-specific prompts into a hybrid attention backbone and enforcing representational consistency via self-distillation, the proposed FSCIL model enables batch-wise incremental learning across multiple radar datasets. It achieves a top-1 accuracy of 86.7% after six incremental sessions, representing only a 10.8% drop from the initial state, and outperforms existing radar-based methods by over 11%.

Finally, synthesizing these contributions, the dissertation presents an end-to-end framework for continuous radar-based HAR. Powered by an adaptive temporal convolutional network enhanced with exponential moving average (EMA)-gated attention and multi-stage refinement, the system achieves 96.09% frame-level accuracy and an 87.12% segmental F1 score at 0.5 IoU, all while maintaining latency under 25 milliseconds per frame. This performance significantly exceeds that of contemporary transformer- and detection-based methods.
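The two reported metrics measure different things, and a short sketch may help separate them. The code below assumes the segmental F1 definition commonly used in temporal action segmentation: predicted and ground-truth frame labels are collapsed into segments, and a predicted segment counts as a true positive if its IoU with an unmatched ground-truth segment of the same class reaches the threshold. The abstract does not spell out the exact protocol, so this is an assumption, and the function names and toy label sequences are hypothetical.

```python
def to_segments(labels):
    """Collapse per-frame labels into (label, start, end) runs, end exclusive."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


def frame_accuracy(pred, truth):
    """Fraction of frames whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def segmental_f1(pred, truth, iou_threshold=0.5):
    """Segment-level F1: each ground-truth segment can be matched at most once."""
    pred_segs, true_segs = to_segments(pred), to_segments(truth)
    matched = [False] * len(true_segs)
    tp = fp = 0
    for label, ps, pe in pred_segs:
        best_iou, best_idx = 0.0, -1
        for idx, (t_label, ts, te) in enumerate(true_segs):
            if t_label != label:
                continue
            inter = max(0, min(pe, te) - max(ps, ts))
            union = max(pe, te) - min(ps, ts)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True
        else:
            fp += 1
    fn = matched.count(False)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    truth = [0, 0, 0, 0, 1, 1, 1, 1]
    shifted = [0, 0, 0, 1, 1, 1, 1, 1]     # boundary off by one frame
    fragmented = [0, 1, 0, 0, 1, 1, 1, 1]  # one spurious one-frame segment
    for pred in (shifted, fragmented):
        print(frame_accuracy(pred, truth), segmental_f1(pred, truth))
```

Both toy predictions get the same frame-level accuracy (7 of 8 frames), but the fragmented one scores lower on segmental F1 because the spurious segment and the leftover fragment count as false positives. This is why the two numbers are reported side by side for continuous recognition.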

Together, these innovations form a unified, computationally efficient, resolution-preserving, and incrementally robust HAR pipeline. The proposed techniques offer a promising and ethically aligned solution for privacy-sensitive applications in healthcare, elder care, and smart home environments, where traditional camera-based systems are often impractical or intrusive.

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering
Item Type: Thesis (PhD)
Authors: Pan, Keyu
Institution: Concordia University
Degree Name: Ph. D.
Program: Electrical and Computer Engineering
Date: July 2025
Thesis Supervisor(s): Zhu, Wei-Ping
ID Code: 996279
Deposited By: Keyu Pan
Deposited On: 29 Jun 2026 17:32
Last Modified: 29 Jun 2026 17:32
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
