Martins Gomes, Damien (2025) Towards Practical Second-Order Optimizers in Deep Learning: Insights from Fisher Information Analysis. Masters thesis, Concordia University.
Text (application/pdf), 18MB: MartinsGomes_Msc_S2025.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
First-order optimization methods remain the standard for training deep neural networks (DNNs). Optimizers like Adam incorporate limited curvature information by preconditioning the stochastic gradient with a diagonal matrix. Despite the widespread adoption of first-order methods, second-order optimization algorithms often exhibit superior convergence compared to methods like Adam and SGD. However, their practicality in training DNNs is still limited by a significantly higher per-iteration computational cost. In this thesis, we present AdaFisher, a novel adaptive second-order optimizer that leverages a diagonal block-Kronecker approximation of the Fisher information matrix to adaptively precondition gradients. AdaFisher aims to bridge the gap between the improved convergence and generalization of second-order methods and the computational efficiency needed for training DNNs. Despite the traditionally slower speed of second-order optimizers, AdaFisher is effective for tasks such as image classification and language modeling, exhibiting remarkable stability and robustness during hyperparameter tuning. We demonstrate that AdaFisher outperforms state-of-the-art optimizers in both accuracy and convergence speed. The code is available from https://github.com/AtlasAnalyticsLab/AdaFisher.
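As a rough illustration of the preconditioning idea mentioned in the abstract, the sketch below scales the weight gradient of a single linear layer by the diagonal of a Kronecker-factored Fisher estimate. It is a generic, simplified example and not the AdaFisher algorithm from the thesis or its repository: the function name `diag_kron_precondition`, the `damping` parameter, and the plain-list data layout are all assumptions made here for illustration.

```python
def diag_kron_precondition(grad_w, acts, pre_act_grads, damping=1e-3):
    """Divide a linear layer's weight gradient by a diagonal Kronecker Fisher estimate.

    Hypothetical helper for illustration only (not the AdaFisher implementation).

    grad_w:        n_out x n_in nested list, gradient of the loss w.r.t. the weights
    acts:          batch x n_in nested list, inputs to the layer
    pre_act_grads: batch x n_out nested list, gradients w.r.t. the layer outputs
    damping:       small constant added to the denominator (assumed here)
    """
    batch = len(acts)
    n_in = len(acts[0])
    n_out = len(pre_act_grads[0])

    # Diagonal of the input factor E[a a^T]: mean squared activation per input unit.
    a_diag = [sum(row[j] ** 2 for row in acts) / batch for j in range(n_in)]
    # Diagonal of the output factor E[g g^T]: mean squared gradient per output unit.
    g_diag = [sum(row[i] ** 2 for row in pre_act_grads) / batch for i in range(n_out)]

    # The diagonal of the Kronecker product of two diagonal factors is their
    # outer product, so weight (i, j) is scaled by g_diag[i] * a_diag[j].
    return [
        [grad_w[i][j] / (g_diag[i] * a_diag[j] + damping) for j in range(n_in)]
        for i in range(n_out)
    ]


# Toy batch of two samples, two inputs, one output.
acts = [[1.0, 2.0], [3.0, 0.0]]
pre_act_grads = [[0.5], [-1.0]]
# Mean over the batch of the per-sample outer products g a^T.
grad_w = [[(0.5 * 1.0 + -1.0 * 3.0) / 2, (0.5 * 2.0 + -1.0 * 0.0) / 2]]

step = diag_kron_precondition(grad_w, acts, pre_act_grads)
print(step)
```

Keeping only the diagonals of the two factors means the preconditioner costs about as much as an element-wise division, which is the kind of trade-off between curvature information and per-iteration cost that the abstract describes.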
| Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
|---|---|
| Item Type: | Thesis (Masters) |
| Authors: | Martins Gomes, Damien |
| Institution: | Concordia University |
| Degree Name: | M.A. |
| Program: | Computer Science |
| Date: | 26 March 2025 |
| Thesis Supervisor(s): | Hosseini, Mahdi S. |
| Keywords: | Second Order Optimization, Fisher Information, Kronecker-factored Approximate Curvature, Deep Learning, Computer Vision, Natural Language Processing |
| ID Code: | 995445 |
| Deposited By: | Damien Martins Gomes |
| Deposited On: | 17 Jun 2025 17:34 |
| Last Modified: | 17 Jun 2025 17:34 |
| Related URLs: | |