Karimpour, Zahra (2025) Early Layer Optimization. Masters thesis, Concordia University.
Text (application/pdf): Karimpour_MCompSc_S2025.pdf (2MB) - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
In deep learning, early layers play a fundamental role in building general and transferable representations. In this thesis, we demonstrate how improving early layer features can consistently enhance generalization across diverse training settings.
First, we propose a novel iterative training method called Simulated Annealing in Early Layers (SEAL), which applies intermittent gradient ascent followed by descent to the early layers during training. This enables the early layers to escape local minima and refine their representations over time. Doing so reduces overfitting, leading to state-of-the-art in-distribution and transfer generalization in the iterative training regime.
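The update schedule described above (brief gradient ascent on the early layers, then a return to descent) can be illustrated with a minimal sketch. This is not the thesis's implementation: the two-parameter toy model, the names `train_with_early_ascent`, `ascent_every`, and `ascent_steps`, the learning rate, and the choice to keep the later layer on plain descent throughout are all assumptions made for illustration. The toy problem is too simple to show escape from local minima; it only shows how the sign of the early-layer update might be flipped intermittently.

```python
# Toy data for the target function y = 2x (hypothetical, for illustration only).
DATA = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]


def mse_and_grads(w_early, w_late, data):
    """Mean squared error and its gradients for the model y = w_late * (w_early * x)."""
    n = len(data)
    loss = g_early = g_late = 0.0
    for x, y in data:
        err = w_late * w_early * x - y
        loss += err * err / n
        g_early += 2.0 * err * w_late * x / n
        g_late += 2.0 * err * w_early * x / n
    return loss, g_early, g_late


def train_with_early_ascent(data, steps=400, lr=0.05, ascent_every=100, ascent_steps=5):
    """Gradient descent with intermittent gradient ascent on the 'early' weight only.

    Assumed schedule: after the first period, each period of `ascent_every`
    steps opens with `ascent_steps` ascent steps on w_early; every other
    step, and every update to w_late, is ordinary descent.
    """
    w_early, w_late = 0.5, 0.5
    history = []  # (was_ascent_step, loss_before_update) for each step
    for step in range(steps):
        loss, g_early, g_late = mse_and_grads(w_early, w_late, data)
        ascending = step >= ascent_every and step % ascent_every < ascent_steps
        if ascending:
            w_early += lr * g_early  # ascent: move along the gradient
        else:
            w_early -= lr * g_early  # descent: move against the gradient
        w_late -= lr * g_late        # later layer always descends (assumption)
        history.append((ascending, loss))
    return w_early, w_late, history
```

Calling `train_with_early_ascent(DATA)` returns the two final weights and a per-step history that records which steps were ascent steps. In a real network the same idea would presumably apply to the parameter group of the first few layers rather than to a single scalar.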
Second, we observe poor transfer generalization in greedy learning, which we attribute to the lack of generic information, especially in the early layers of the network. To address this, we use CS-KD regularization to encourage information gain in the early layers. Our results show that this adjustment mitigates the drop in transfer performance typically observed in greedy training while maintaining in-distribution accuracy.
Finally, we extend our investigation to federated learning, where early layer divergence due to gradient accumulation across clients can lead to poor representation learning, even under IID data distributions. We demonstrate that greedy training, by avoiding end-to-end backpropagation, mitigates divergence in the early layers and improves overall performance, particularly in challenging scenarios with deeper models or many clients.
Overall, this thesis highlights the importance of early layer learning in building models that generalize well, and introduces practical strategies for improving it across iterative, greedy, and federated learning paradigms.
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
|---|---|
| Item Type: | Thesis (Masters) |
| Authors: | Karimpour, Zahra |
| Institution: | Concordia University |
| Degree Name: | M. Comp. Sc. |
| Program: | Computer Science |
| Date: | April 2025 |
| Thesis Supervisor(s): | Mudur, Sudhir |
| ID Code: | 995516 |
| Deposited By: | Zahra Karimpour |
| Deposited On: | 04 Nov 2025 15:39 |
| Last Modified: | 04 Nov 2025 15:39 |