Hassan, Yousef (2026) Patient-Level Representation Learning for Computational Pathology. Masters thesis, Concordia University.
Text (application/pdf), 8MB: Hassan_MCompSc_F2026.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Computational pathology requires whole-slide image (WSI) foundation models that transfer across diverse clinical tasks, yet current approaches remain largely slide-centric, often depend on proprietary data and expensive supervision from paired textual reports that are not publicly available, and do not explicitly model relationships among multiple slides from the same patient. This thesis presents MOOZY, a patient-first pathology foundation model in which the patient case, not the individual slide, serves as the fundamental unit of representation. Unlike existing methods that encode slides independently and merge their embeddings post-hoc, MOOZY explicitly models dependencies across all slides from the same patient via a dedicated case transformer during pretraining.
MOOZY follows a two-stage design that combines multi-stage open self-supervision with scaled low-cost task supervision. In Stage 1, a vision-only slide encoder is pretrained on 77,134 public slide feature grids using masked self-distillation to learn robust, context-aware representations. In Stage 2, these representations are aligned with clinical semantics through a case transformer and multi-task supervision over 333 tasks from 56 public datasets, spanning 205 classification tasks and 128 survival tasks that cover overall survival, disease-specific survival, disease-free interval, and progression-free interval across 23 anatomical sites.
Across eight held-out tasks evaluated with five-fold frozen-feature probing, MOOZY achieves best or tied-best performance on the majority of metrics, improving macro averages over TITAN by +7.37%, +5.50%, and +7.83% and over PRISM by +8.83%, +10.70%, and +9.78% for weighted F1, weighted ROC-AUC, and balanced accuracy, respectively. MOOZY is also parameter-efficient, with 85.77M total parameters, 14.23 times smaller than GigaPath, while maintaining strong performance. These results demonstrate that open and reproducible patient-level pretraining is sufficient to produce transferable, generalizable, and parameter-efficient embeddings, providing a practical path toward scalable patient-first histopathology foundation models.
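As a quick consistency check on the figures quoted in the abstract, the short Python sketch below adds up the two task families and works out the parameter count that the "14.23 times smaller" ratio would imply for the larger model. The variable names are illustrative, and the implied size is an inference from the ratio alone, not a number reported in the abstract.

```python
# Figures quoted in the abstract (values copied from the text above).
classification_tasks = 205
survival_tasks = 128
total_tasks_reported = 333

moozy_params_m = 85.77  # total parameters, in millions
size_ratio = 14.23      # "14.23 times smaller than GigaPath"

# The two task families should add up to the reported total.
total_tasks = classification_tasks + survival_tasks

# Parameter count of the larger model implied by the ratio. This is an
# inference from the abstract's ratio, not a figure the abstract states.
implied_larger_model_params_m = moozy_params_m * size_ratio

print(f"tasks: {total_tasks} (reported {total_tasks_reported})")
print(f"implied larger-model size: {implied_larger_model_params_m:.1f}M parameters")
```

Under these numbers the task counts agree with the reported total, and the ratio corresponds to a comparison model on the order of 1.2 billion parameters.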
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
|---|---|
| Item Type: | Thesis (Masters) |
| Authors: | Hassan, Yousef |
| Institution: | Concordia University |
| Degree Name: | M. Comp. Sc. |
| Program: | Computer Science |
| Date: | April 2026 |
| Thesis Supervisor(s): | Mahdi, Hosseini and Pal, Chris |
| ID Code: | 997131 |
| Deposited By: | Yousef Hassan |
| Deposited On: | 29 Jun 2026 14:56 |
| Last Modified: | 29 Jun 2026 14:56 |