Zgaren, Ahmed
ORCID: https://orcid.org/0000-0001-5777-3440
(2025)
Context Mining for Visual Object Counting.
PhD thesis, Concordia University.
Text (application/pdf)
Zgaren_PhD_F2025.pdf (33MB) - Accepted Version. Restricted to Registered users only until 1 May 2026. Available under License Spectrum Terms of Access.
Abstract
Visual object counting is a fundamental task in computer vision that aims to accurately estimate the number of objects of interest within an image. This task has widespread applications across various domains, including environmental monitoring, surveillance, retail analytics, and medical imaging. Traditional counting methods often face challenges such as object occlusion, variation in scale and appearance, and complex scene backgrounds. Although deep learning has significantly advanced this field, there are still limitations, particularly regarding the accurate capture of contextual information.
This thesis focuses on developing novel approaches to enhance visual object counting, targeting key research problems related to accuracy, efficiency, and robustness in both class-specific and class-agnostic counting scenarios. To address these challenges, this thesis makes several key contributions. First, we propose a novel hybrid counting method that combines local detection with global estimation to accurately count objects in aerial imagery. This approach efficiently exploits both local and global information, enhancing counting accuracy in high-density situations. Second, we introduce a self-attention-based model for class-agnostic counting, which effectively encodes repetitive object patterns, allowing for precise counting even in the presence of object variations and background clutter. This method improves feature representation and matching, leading to enhanced robustness and generalization capabilities. Finally, we present a novel box-free counting model requiring only one annotation point per object, significantly reducing the annotation burden. This method employs contextual transformers and a position-aware attention encoder to achieve accurate object counts with minimal annotation effort. The effectiveness of our proposed methods is rigorously demonstrated through extensive experiments conducted on both public and private datasets. By comparing our results to those achieved by state-of-the-art methods, we showcase the superior performance of our approaches in addressing several challenges in visual object counting. These contributions collectively advance the field of visual object counting by providing more accurate, efficient, and robust counting methods, opening new possibilities for automated object counting in various applications.
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Concordia Institute for Information Systems Engineering |
|---|---|
| Item Type: | Thesis (PhD) |
| Authors: | Zgaren, Ahmed |
| Institution: | Concordia University |
| Degree Name: | Ph. D. |
| Program: | Information and Systems Engineering |
| Date: | 1 April 2025 |
| Thesis Supervisor(s): | Bouguila, Nizar and Bouachir, Wassim |
| ID Code: | 995586 |
| Deposited By: | Ahmed Zgaren |
| Deposited On: | 04 Nov 2025 16:48 |
| Last Modified: | 04 Nov 2025 16:48 |