New Insights on Catastrophic Forgetting in Neural Networks


Asadi, Nader (2023) New Insights on Catastrophic Forgetting in Neural Networks. Masters thesis, Concordia University.

Text (application/pdf)
Nader_Masters_Thesis.pdf - Accepted Version
Available under License Spectrum Terms of Access.


Continual learning, the ability of agents to learn from a changing distribution of data while respecting memory and compute constraints, is a challenging and important problem in machine learning. The central challenge of continual learning (CL) is to balance effective adaptation to new information against catastrophic forgetting, i.e., the stability-plasticity dilemma. This thesis comprises four major chapters that explore different aspects of continual learning and propose novel solutions to address some of its challenges.

First, we investigate the impact of experience replay (ER) on the change in representations of observed data that arises when previously unobserved classes appear in the incoming data stream. We show that applying ER causes the representations of newly added classes to overlap significantly with the previous classes, leading to highly disruptive parameter updates. To mitigate this issue, we propose a new method that shields the learned representations from drastic adaptation to accommodate new classes. Specifically, we use an asymmetric update rule that pushes new classes to adapt to the older ones, which is more effective, especially at task boundaries where much of the forgetting typically occurs. Empirical results on standard continual learning benchmarks show significant gains over strong baselines.
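As an illustration only, the sketch below shows one way an asymmetric update could be expressed as a loss. The abstract does not specify the exact rule, so everything here is an assumption: incoming samples are scored only against the classes present in the incoming batch (so they cannot push down the logits of older classes), while replayed samples use ordinary cross-entropy over all classes. The function names `cross_entropy` and `asymmetric_replay_loss` are hypothetical.

```python
import numpy as np

def log_softmax(logits, mask=None):
    """Row-wise log-softmax; classes where mask is False are excluded."""
    if mask is not None:
        logits = np.where(mask, logits, -np.inf)
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy(logits, labels, mask=None):
    """Mean negative log-likelihood of the true labels."""
    logp = log_softmax(logits, mask)
    return -logp[np.arange(len(labels)), labels].mean()

def asymmetric_replay_loss(new_logits, new_labels, replay_logits, replay_labels):
    """Assumed asymmetric rule: masked loss for incoming data, full loss for replay."""
    num_classes = new_logits.shape[1]
    present = np.zeros(num_classes, dtype=bool)
    present[np.unique(new_labels)] = True  # classes seen in the incoming batch
    loss_new = cross_entropy(new_logits, new_labels, mask=present)
    loss_replay = cross_entropy(replay_logits, replay_labels)
    return loss_new + loss_replay
```

Under this assumption, gradients from incoming data never touch the output weights of older classes directly; only the replayed samples relate old and new classes to each other.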

Then, we focus on the concept of representation forgetting, which refers to the change in a model's representation without losing knowledge about prior tasks. We observe that models trained without any explicit control for forgetting often experience only a small amount of representation forgetting, sometimes comparable to that of methods that explicitly control for forgetting, especially in longer task sequences. We propose a simple yet competitive approach to learning representations continually with standard supervised contrastive learning while constructing prototypes of class samples when queried on old samples. We show that this approach can lead to new insights into the effects of model capacity and the loss function used in continual learning.
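A minimal sketch of prototype-based prediction, assuming that a prototype is the normalized mean embedding of a class and that queries are assigned to the most similar prototype by cosine similarity. The abstract does not state how prototypes are built or compared, so these choices and the names `build_prototypes` and `predict` are assumptions for illustration. The embeddings would come from an encoder trained with a supervised contrastive loss, which is not shown here.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def build_prototypes(embeddings, labels):
    """Return the sorted class ids and one unit-norm mean embedding per class."""
    z = l2_normalize(embeddings)
    classes = np.unique(labels)
    protos = np.stack([z[labels == c].mean(axis=0) for c in classes])
    return classes, l2_normalize(protos)

def predict(embeddings, classes, prototypes):
    """Assign each embedding to the class with the most similar prototype."""
    sims = l2_normalize(embeddings) @ prototypes.T  # cosine similarities
    return classes[sims.argmax(axis=1)]
```

Because the classifier is rebuilt from embeddings at query time, this kind of evaluation measures the quality of the representation itself rather than of a trained output layer.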

Finally, we address the challenge of balancing effective adaptation against catastrophic forgetting, i.e., the stability-plasticity dilemma, without relying on prior task data. We propose a holistic approach to jointly learn the representation and class prototypes while maintaining the relevance of old class prototypes and their embedded similarities. We use a supervised contrastive loss to learn representations in an embedding space and evolve class prototypes continually in the same latent space, enabling learning and prediction at any point. To continually adapt the prototypes without keeping any prior task data, we propose a novel distillation loss that constrains class prototypes to maintain relative similarities as compared to new task data. Empirical results show that this method yields state-of-the-art performance in the task-incremental setting and provides strong performance in the class-incremental setting without using any stored data points.
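The idea of preserving relative similarities between prototypes and new task data can be sketched as follows. The exact form of the distillation loss is not given in the abstract, so this is an assumed formulation: each old-class prototype induces a softmax distribution over the samples of the current batch, and the loss is the KL divergence between the distribution computed with the previous model and the one computed with the current model. The function name `relation_distillation`, the `temperature` parameter, and its default value are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def relation_distillation(z_old, z_new, protos_old, protos_new, temperature=0.1):
    """Mean KL divergence between old and current prototype-to-sample similarities.

    z_old, z_new: (N, d) embeddings of the same new-task batch under the
        previous (frozen) and current encoders.
    protos_old, protos_new: (K, d) old-class prototypes before and after the update.
    """
    sims_old = l2_normalize(protos_old) @ l2_normalize(z_old).T  # (K, N)
    sims_new = l2_normalize(protos_new) @ l2_normalize(z_new).T  # (K, N)
    p_old = softmax(sims_old / temperature, axis=1)
    p_new = softmax(sims_new / temperature, axis=1)
    eps = 1e-12  # guards against log(0)
    kl = (p_old * (np.log(p_old + eps) - np.log(p_new + eps))).sum(axis=1)
    return kl.mean()
```

The loss is zero when the prototypes and embeddings are unchanged and grows as the prototypes drift relative to the new data; no stored samples from earlier tasks are needed to compute it.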

Overall, in this thesis, we provide new insights and methods for effective adaptation in CL without catastrophic forgetting. The proposed methods achieve state-of-the-art performance on standard continual learning benchmarks and provide new insights into the role of model capacity, loss functions, and forgetting in CL.

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Asadi, Nader
Institution: Concordia University
Degree Name: M. Sc.
Program: Computer Science
Date: 30 March 2023
Thesis Supervisor(s): Mudur, Sudhir and Belilovsky, Eugene
ID Code: 992121
Deposited By: Nader Asadi
Deposited On: 21 Jun 2023 14:42
Last Modified: 21 Jun 2023 14:42
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
