Patient-Level Representation Learning for Computational Pathology

Hassan, Yousef (2026) Patient-Level Representation Learning for Computational Pathology. Masters thesis, Concordia University.

Text (application/pdf)
Hassan_MCompSc_F2026.pdf - Accepted Version
Available under License Spectrum Terms of Access.

8MB

Abstract

Computational pathology requires whole-slide image (WSI) foundation models that transfer across diverse clinical tasks, yet current approaches remain largely slide-centric, often depend on proprietary data and expensive supervision from paired textual reports that are not publicly available, and do not explicitly model relationships among multiple slides from the same patient. This thesis presents MOOZY, a patient-first pathology foundation model in which the patient case, not the individual slide, serves as the fundamental unit of representation. Unlike existing methods that encode slides independently and merge their embeddings post-hoc, MOOZY explicitly models dependencies across all slides from the same patient via a dedicated case transformer during pretraining.
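The contrast between post-hoc merging and joint modelling of a patient's slides can be illustrated with a small sketch. It is not MOOZY's case transformer: the abstract does not specify that architecture, so the example assumes a single attention head with identity projections, and the names `mean_pool`, `self_attention`, and `case_embedding` are hypothetical.

```python
import math


def mean_pool(slides):
    """Post-hoc merge: average slide embeddings that were encoded independently."""
    dim = len(slides[0])
    return [sum(s[d] for s in slides) / len(slides) for d in range(dim)]


def self_attention(slides):
    """Single-head scaled dot-product attention with identity projections.

    Each output row is a weighted mix of ALL slides in the case, so a slide's
    representation depends on the other slides from the same patient.
    """
    dim = len(slides[0])
    contextualized = []
    for query in slides:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in slides
        ]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        contextualized.append(
            [sum(w / norm * v[d] for w, v in zip(weights, slides)) for d in range(dim)]
        )
    return contextualized


def case_embedding(slides):
    """Illustrative case-level embedding: attend across slides, then pool."""
    return mean_pool(self_attention(slides))


# Two toy slide embeddings from one hypothetical patient case.
case = [[1.0, 0.0], [0.0, 1.0]]
print(self_attention(case)[0])  # first slide, now mixed with the second
print(case_embedding(case))
```

In the post-hoc setting each slide embedding is fixed before merging. In the sketch, the first slide's representation changes once the second slide is present, which is the kind of cross-slide dependency the abstract describes.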

MOOZY follows a two-stage design that combines multi-stage open self-supervision with scaled low-cost task supervision. In Stage 1, a vision-only slide encoder is pretrained on 77,134 public slide feature grids using masked self-distillation to learn robust, context-aware representations. In Stage 2, these representations are aligned with clinical semantics through a case transformer and multi-task supervision over 333 tasks from 56 public datasets, spanning 205 classification tasks and 128 survival tasks that cover overall survival, disease-specific survival, disease-free interval, and progression-free interval across 23 anatomical sites.
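A Stage 2 objective that mixes classification and survival tasks might be sketched as follows. The abstract does not state the loss functions, so softmax cross-entropy and the Cox negative partial log-likelihood are assumptions here, as are the dictionary layout of `batch` and the function names. Tasks are averaged with equal weight, which is also an assumption.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one case on one classification task."""
    top = max(logits)
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_z - logits[label]


def cox_nll(risks, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    No correction for tied event times is applied in this sketch.
    """
    total, n_events = 0.0, 0
    for i, (t_i, observed) in enumerate(zip(times, events)):
        if not observed:
            continue  # censored cases contribute only through risk sets
        at_risk = [r for r, t in zip(risks, times) if t >= t_i]
        top = max(at_risk)
        log_denominator = top + math.log(sum(math.exp(r - top) for r in at_risk))
        total += log_denominator - risks[i]
        n_events += 1
    return total / n_events if n_events else 0.0


def multitask_loss(batch):
    """Average per-task losses over the tasks that have labels in this batch."""
    losses = []
    for task in batch:
        if task["kind"] == "classification" and task["labels"]:
            per_case = [
                cross_entropy(logits, label)
                for logits, label in zip(task["logits"], task["labels"])
            ]
            losses.append(sum(per_case) / len(per_case))
        elif task["kind"] == "survival" and any(task["events"]):
            losses.append(cox_nll(task["risks"], task["times"], task["events"]))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: one classification task and one survival task (made-up values).
batch = [
    {"kind": "classification", "logits": [[0.0, 0.0]], "labels": [0]},
    {"kind": "survival", "risks": [0.0, 0.0], "times": [1.0, 2.0], "events": [1, 1]},
]
print(multitask_loss(batch))
```

Skipping tasks without labels matters when tasks come from many datasets, since a given case typically carries labels for only a few of them.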

Across eight held-out tasks evaluated with five-fold frozen-feature probing, MOOZY achieves best or tied-best performance on the majority of metrics, improving macro averages over TITAN by +7.37%, +5.50%, and +7.83% and over PRISM by +8.83%, +10.70%, and +9.78% for weighted F1, weighted ROC-AUC, and balanced accuracy, respectively. MOOZY is also parameter-efficient, with 85.77M total parameters, 14.23 times smaller than GigaPath, while maintaining strong performance. These results demonstrate that open and reproducible patient-level pretraining is sufficient to produce transferable, generalizable, and parameter-efficient embeddings, providing a practical path toward scalable patient-first histopathology foundation models.
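Two of the reported metrics, weighted F1 and balanced accuracy, can be computed from predictions as in the sketch below, using their standard definitions. The function names and toy labels are illustrative, and averaging fold scores with a plain mean is an assumption about how the five-fold results are summarized.

```python
def class_counts(y_true, y_pred):
    """Per-class (true positives, false positives, false negatives)."""
    counts = {}
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        counts[c] = (tp, fp, fn)
    return counts


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes count as much as common ones."""
    counts = class_counts(y_true, y_pred)
    recalls = [tp / (tp + fn) for tp, _, fn in counts.values()]
    return sum(recalls) / len(recalls)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    counts = class_counts(y_true, y_pred)
    total = 0.0
    for tp, fp, fn in counts.values():
        support = tp + fn
        total += support * (2 * tp / (2 * tp + fp + fn))
    return total / len(y_true)


def mean_over_folds(fold_scores):
    """Summarize one metric across cross-validation folds."""
    return sum(fold_scores) / len(fold_scores)


# Toy labels for a single fold of a hypothetical binary task.
y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
print(balanced_accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```

As a side check on the size comparison, 85.77M × 14.23 ≈ 1,220.5M, which is the GigaPath parameter count implied by the stated ratio.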

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type:	Thesis (Masters)
Authors:	Hassan, Yousef
Institution:	Concordia University
Degree Name:	M. Comp. Sc.
Program:	Computer Science
Date:	April 2026
Thesis Supervisor(s):	Hosseini, Mahdi and Pal, Chris
ID Code:	997131
Deposited By:	Yousef Hassan
Deposited On:	29 Jun 2026 14:56
Last Modified:	29 Jun 2026 14:56
