A Deep Learning Model to Impute Missing Data in Time Series

Du, Wenjie (2021) A Deep Learning Model to Impute Missing Data in Time Series. Masters thesis, Concordia University.

Text (application/pdf)
Du_MASc_S2022.pdf - Accepted Version
Available under License Spectrum Terms of Access.

1MB

Abstract

Missing data in time series is a pervasive problem that obstructs advanced analysis. A popular solution is imputation, where the fundamental challenge is to determine what values should be filled in. In this thesis, we study imputing missing data in time series with deep learning. We first present a concrete case in the telecommunications domain, where we use machine learning models to handle missing data and forecast Imminent Loss of Signal (ILOS) in optical networks. We then propose a novel model, called SAITS (Self-Attention-based Imputation for Time Series), to impute missing values in multivariate time series. SAITS uses a joint-optimization training approach to learn missing values from a weighted combination of two diagonally-masked self-attention (DMSA) blocks. DMSA explicitly captures both the temporal dependencies and feature correlations between time steps, which improves imputation accuracy and training speed. The contributions of this thesis are: 1) in the motivating case, we develop a deep learning methodology based on BRITS to learn a good representation from data with massive missing values and forecast 3% ILOS with 65% precision; 2) we design a joint-optimization training approach to train self-attention models on the imputation task. Trained with this approach, Transformer achieves up to 25% smaller mean absolute error than BRITS; 3) we propose SAITS, a new imputation model based on self-attention, designed specifically for the time-series imputation task. Compared to the state-of-the-art (SOTA) model BRITS, SAITS obtains 12% to 38% smaller mean absolute error and trains 2.0 to 2.6 times faster. Experimental results demonstrate that SAITS sets a new SOTA on the time-series imputation task.
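
The abstract names diagonally-masked self-attention (DMSA) without describing the mask. The following is a minimal NumPy sketch of one plausible reading, assuming the diagonal mask stops each time step from attending to itself, so that its representation must be built from the other time steps. The function name, the single-head layout, and the random projection weights are illustrative assumptions, not the thesis implementation.

```python
import numpy as np

def diagonally_masked_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over x of shape (T, d) with a diagonal mask.

    Assumption: the mask blocks the diagonal of the (T, T) attention map,
    so no time step can attend to itself. Requires T > 1.
    Returns (output of shape (T, d_v), attention weights of shape (T, T)).
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # scaled dot-product scores
    np.fill_diagonal(scores, -np.inf)         # diagonal mask (in place)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)                  # exp(-inf) == 0 on the diagonal
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Hypothetical toy input: 5 time steps, 4 features, random projections.
rng = np.random.default_rng(0)
T, d = 5, 4
x = rng.normal(size=(T, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
out, attn = diagonally_masked_self_attention(x, w_q, w_k, w_v)
```

Each row of `attn` sums to one and has a zero on the diagonal, so every output row is a weighted mix of the other time steps only. The joint-optimization training and the weighted combination of two DMSA blocks mentioned in the abstract are not shown here.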

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering
Item Type:	Thesis (Masters)
Authors:	Du, Wenjie
Institution:	Concordia University
Degree Name:	M.A.Sc.
Program:	Electrical and Computer Engineering
Date:	16 November 2021
Thesis Supervisor(s):	Liu, Yan
ID Code:	989937
Deposited By:	Wenjie Du
Deposited On:	16 Jun 2022 14:35
Last Modified:	16 Jun 2022 14:35
