Pan, Keyu (2025) Novel Deep Learning Models for Radar-Based Human Activity Recognition. PhD thesis, Concordia University.
Text (application/pdf)
Pan_PhD_F2025.pdf (74MB) - Accepted Version. Restricted to Repository staff only until 31 August 2027. Available under License Spectrum Terms of Access.
Abstract
This dissertation presents a comprehensive deep learning pipeline addressing key challenges in radar-based human activity recognition (HAR). Radar-based radio-frequency sensing technology offers distinct advantages such as privacy preservation and robustness to illumination changes. However, its practical deployment is hindered by a number of interrelated barriers including constrained computational resources, degradation of feature resolution due to traditional convolution and pooling, class imbalance affecting rare yet critical actions, catastrophic forgetting in incremental learning, and strict frame-level segmentation requirements under latency constraints. To address these challenges, the proposed work progresses coherently across five dimensions as described below.
To manage stringent on-device computational and memory constraints, the first contribution is a compact spatial-channel attention convolutional neural network (CNN) trained on a newly collected ultra-wideband (UWB) “similar-motion” dataset. Despite its lightweight design with only 90K parameters and 2.7 million FLOPs, the network achieves 94% accuracy, outperforming classical baseline models by 4% to 20%. This establishes a foundational framework that balances computational efficiency with discriminative power.
To mitigate the loss of feature resolution from traditional convolution and pooling operations, a dual-branch StarNet/HWDLD-ConvFormer architecture is introduced. StarNet employs a multiscale attention mechanism to improve UWB time-domain envelope recognition accuracy to 95.8%, while the HWDLD-ConvFormer, leveraging Haar-wavelet downsampling, linear deformable convolutions, and sparse self-attention, pushes performance to 99.7% and reduces computational cost by 20%. Applying this architecture to normalized FMCW micro-Doppler spectrograms yields similarly impressive results, with 99.1% accuracy and 99.6% recall, without increasing computational overhead.
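As a rough illustration of why Haar-wavelet downsampling can preserve information that pooling discards, the sketch below applies a one-level 2D orthonormal Haar transform to a small array. It is a minimal, generic example and not the thesis's HWDLD-ConvFormer layer; the function name `haar_downsample` and the sub-band names `avg`, `dx`, `dy`, `dxy` are assumptions made for this illustration.

```python
def haar_downsample(x):
    """One-level 2D orthonormal Haar transform of an even-sized 2D list.

    Returns four half-resolution sub-bands (avg, dx, dy, dxy).
    Hypothetical helper for illustration only.
    """
    rows, cols = len(x), len(x[0])
    if rows % 2 or cols % 2:
        raise ValueError("both dimensions must be even")
    avg, dx, dy, dxy = [], [], [], []
    for i in range(0, rows, 2):
        r_avg, r_dx, r_dy, r_dxy = [], [], [], []
        for j in range(0, cols, 2):
            # The four samples of one 2x2 block.
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r_avg.append((a + b + c + d) / 2)  # low-pass (scaled block sum)
            r_dx.append((a - b + c - d) / 2)   # difference along columns
            r_dy.append((a + b - c - d) / 2)   # difference along rows
            r_dxy.append((a - b - c + d) / 2)  # diagonal difference
        avg.append(r_avg)
        dx.append(r_dx)
        dy.append(r_dy)
        dxy.append(r_dxy)
    return avg, dx, dy, dxy


def energy(*bands):
    """Sum of squared values over any number of 2D lists."""
    return sum(v * v for band in bands for row in band for v in row)


block = [[1, 2], [3, 4]]
avg, dx, dy, dxy = haar_downsample(block)
```

Each output band has half the spatial resolution of the input, yet the four bands together carry the same energy as the original block, so the detail removed from the spatial grid is moved into extra channels rather than thrown away as it would be with max or average pooling.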
With high-resolution features secured, a novel three-stage few-shot learning (FSL) framework is proposed to address class imbalance. This approach integrates meta-learned prototypes, decomposition-guided encoders, and entropy-aware self-distillation to significantly enhance performance on underrepresented but safety-critical actions. On the UoG benchmark dataset, the model improves five-shot classification accuracy from 70.5% to 82.5%, and up to 92.2% with 20-shot settings. It also consistently delivers 10% to 16% gains across four additional datasets, effectively reducing annotation burdens for rare activity classes.
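The prototype idea underlying many few-shot classifiers can be sketched in a few lines: each class is summarized by the mean of its few labelled support embeddings, and a query is assigned to the nearest prototype. This is a generic nearest-prototype sketch, not the three-stage framework of the thesis; the function names, the two-dimensional embeddings, and the labels `"fall"` and `"walk"` are all hypothetical.

```python
import math


def class_prototypes(support):
    """Map each label to the mean of its support embeddings.

    `support` is a dict: label -> list of equal-length embedding vectors.
    """
    protos = {}
    for label, vecs in support.items():
        dim = len(vecs[0])
        protos[label] = [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]
    return protos


def classify(query, protos):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))


# Two-shot toy example with made-up 2-D embeddings.
support = {
    "fall": [[0.0, 0.0], [0.2, 0.0]],
    "walk": [[1.0, 1.0], [1.0, 0.8]],
}
protos = class_prototypes(support)
predicted = classify([0.2, 0.1], protos)
```

Because a new class only requires averaging a handful of embeddings, this style of classifier needs very few labelled examples per class, which is the property that makes prototype methods attractive for rare activities.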
Building on these FSL advancements, the dissertation also tackles catastrophic forgetting in incremental learning through an adaptive prompt-driven few-shot class-incremental learning (FSCIL) framework. By integrating both task-invariant and task-specific prompts into a hybrid attention backbone and enforcing representational consistency via self-distillation, the proposed FSCIL model enables batch-wise incremental learning across multiple radar datasets. It achieves a top-1 accuracy of 86.7% after six incremental sessions, representing only a 10.8% drop from the initial state, and outperforms existing radar-based methods by over 11%.
Finally, synthesizing these contributions, the dissertation presents an end-to-end framework for continuous radar-based HAR. Powered by an adaptive temporal convolutional network enhanced with exponential moving average (EMA)-gated attention and multi-stage refinement, the system achieves 96.09% frame-level accuracy and an 87.12% segmental F1 score at 0.5 IoU, all while maintaining latency under 25 milliseconds per frame. This performance significantly exceeds that of contemporary transformer- and detection-based methods.
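Frame-level accuracy and segmental F1 measure different things, which the following sketch tries to make concrete. It follows the commonly used definition of segmental F1 at an IoU threshold (a predicted segment counts as a true positive if it overlaps an unmatched ground-truth segment of the same label with IoU at or above the threshold). The exact evaluation protocol of the thesis is not given in the abstract, so details such as the treatment of background frames are assumptions here, and the function names are hypothetical.

```python
def segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    runs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((labels[start], start, i))
            start = i
    return runs


def segmental_f1(pred, truth, iou_threshold=0.5):
    """Segmental F1 score at a given IoU threshold (all classes counted)."""
    pred_segs, true_segs = segments(pred), segments(truth)
    matched = [False] * len(true_segs)
    tp = 0
    for label, ps, pe in pred_segs:
        best_iou, best_idx = 0.0, -1
        for idx, (tl, ts, te) in enumerate(true_segs):
            if tl != label:
                continue
            inter = max(0, min(pe, te) - max(ps, ts))
            union = max(pe, te) - min(ps, ts)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        # Each ground-truth segment may be matched at most once.
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[best_idx]:
            matched[best_idx] = True
            tp += 1
    fp = len(pred_segs) - tp
    fn = len(true_segs) - tp
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def frame_accuracy(pred, truth):
    """Fraction of frames whose predicted label equals the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


truth = [0, 0, 0, 0, 1, 1, 1, 1]
shifted = [0, 0, 0, 1, 1, 1, 1, 1]      # boundary off by one frame
fragmented = [0, 0, 1, 0, 1, 1, 1, 1]   # one spurious frame splits a segment
```

In this toy case both predictions get 7 of 8 frames right, but the fragmented one is penalized by the segmental metric because the single wrong frame creates extra, unmatched segments. This is why continuous HAR results are usually reported with both numbers.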
Together, these innovations form a unified, computationally efficient, resolution-preserving, and incrementally robust HAR pipeline. The proposed techniques offer a promising and ethically aligned solution for privacy-sensitive applications in healthcare, elder care, and smart home environments, where traditional camera-based systems are often impractical or intrusive.
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
|---|---|
| Item Type: | Thesis (PhD) |
| Authors: | Pan, Keyu |
| Institution: | Concordia University |
| Degree Name: | Ph. D. |
| Program: | Electrical and Computer Engineering |
| Date: | July 2025 |
| Thesis Supervisor(s): | Zhu, Wei-Ping |
| ID Code: | 996279 |
| Deposited By: | Keyu Pan |
| Deposited On: | 29 Jun 2026 17:32 |
| Last Modified: | 29 Jun 2026 17:32 |