Towards Practical Second-Order Optimizers in Deep Learning: Insights from Fisher Information Analysis

Martins Gomes, Damien (2025) Towards Practical Second-Order Optimizers in Deep Learning: Insights from Fisher Information Analysis. Masters thesis, Concordia University.

Text (application/pdf)
MartinsGomes_Msc_S2025.pdf - Accepted Version
Available under License Spectrum Terms of Access.
18MB

Abstract

First-order optimization methods remain the standard for training deep neural networks (DNNs). Optimizers like Adam incorporate limited curvature information by preconditioning the stochastic gradient with a diagonal matrix. Despite the widespread adoption of first-order methods, second-order optimization algorithms often exhibit superior convergence compared to methods like Adam and SGD. However, their practicality in training DNNs is still limited by a significantly higher per-iteration computational cost compared to first-order methods. In this thesis, we present AdaFisher, a novel adaptive second-order optimizer that leverages a diagonal block-Kronecker approximation of the Fisher information matrix to adaptively precondition gradients. AdaFisher aims to bridge the gap between the improved convergence and generalization of second-order methods and the computational efficiency needed for training DNNs. Despite the traditionally slower speed of second-order optimizers, AdaFisher is effective for tasks such as image classification and language modeling, exhibiting remarkable stability and robustness during hyperparameter tuning. We demonstrate that AdaFisher outperforms state-of-the-art optimizers in both accuracy and convergence speed. The code is available at https://github.com/AtlasAnalyticsLab/AdaFisher.
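
As a rough illustration of what preconditioning a gradient with a diagonal Kronecker-factored Fisher estimate can look like for a single fully connected layer, here is a minimal pure-Python sketch. It is not AdaFisher's actual update rule (the thesis and the repository are the reference for that). The function name `diag_kron_precondition`, the `damping` parameter, and the toy data are assumptions made for illustration only.

```python
def diag_kron_precondition(grad, acts, out_grads, damping=1e-3):
    """Scale a layer's weight gradient by a diagonal Kronecker-factored
    Fisher estimate (illustrative sketch, not the AdaFisher algorithm).

    grad:      weight gradient, list of rows, shape (n_out, n_in)
    acts:      batch of layer inputs, shape (batch, n_in)
    out_grads: batch of gradients w.r.t. layer outputs, shape (batch, n_out)
    damping:   hypothetical constant added to avoid division by zero
    """
    batch = len(acts)
    n_in, n_out = len(acts[0]), len(out_grads[0])

    # Diagonals of the two Kronecker factors: per-coordinate second moments
    # of the layer inputs and of the output gradients.
    a_diag = [sum(a[j] ** 2 for a in acts) / batch for j in range(n_in)]
    g_diag = [sum(g[i] ** 2 for g in out_grads) / batch for i in range(n_out)]

    # The diagonal entry of the Kronecker product for weight (i, j)
    # is g_diag[i] * a_diag[j]; divide the gradient by it elementwise.
    return [
        [grad[i][j] / (g_diag[i] * a_diag[j] + damping) for j in range(n_in)]
        for i in range(n_out)
    ]


# Toy usage: one output unit, two inputs, a batch of two samples.
acts = [[1.0, 2.0], [3.0, 0.0]]
out_grads = [[0.5], [1.5]]
grad = [[1.0, 1.0]]
preconditioned = diag_kron_precondition(grad, acts, out_grads)
```

Because only the diagonals of the two factors are stored, the memory and per-step cost in this sketch stay comparable to those of a diagonal first-order preconditioner such as Adam's.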

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Martins Gomes, Damien
Institution: Concordia University
Degree Name: M.A.
Program: Computer Science
Date: 26 March 2025
Thesis Supervisor(s): Hosseini, Mahdi S.
Keywords: Second Order Optimization, Fisher Information, Kronecker-factored Approximate Curvature, Deep Learning, Computer Vision, Natural Language Processing
ID Code: 995445
Deposited By: Damien Martins Gomes
Deposited On: 17 Jun 2025 17:34
Last Modified: 17 Jun 2025 17:34
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
