Narrative Signals from Bank Filings as Early-Warning Indicators: Unsupervised NLP with Regime-Switching Models

Roustaei, Javad (2025) Narrative Signals from Bank Filings as Early-Warning Indicators: Unsupervised NLP with Regime-Switching Models. Masters thesis, Concordia University.

Text (application/pdf)
Roustaei_MSc_F2025.pdf - Accepted Version
Available under License Spectrum Terms of Access.
3MB

Abstract

Financial distress in banks often surfaces first as shifts in narrative tone and content within
mandated disclosures. This thesis studies whether unsupervised Natural Language Processing (NLP) features extracted from U.S. bank 10-K/10-Q filings can anticipate transitions
into high-risk regimes. We build filing-level signals from (i) dictionary-based sentiment
with negation/intensity handling, (ii) topic-mixture drift measured by Jensen–Shannon
divergence, and (iii) section-focused embedding clusters (MD&A and Risk Factors) with
a cluster-change indicator. These features are integrated in a parsimonious two-state
Gaussian Hidden Markov Model (HMM) to produce a continuous distress probability per
filing.
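
As an illustration of feature (ii), the sketch below computes Jensen–Shannon divergence between the topic mixtures of consecutive filings. It is a minimal example and not the thesis's implementation: the function names, the base-2 logarithm, and the three-topic mixtures are all assumptions made here for illustration.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def topic_drift(mixtures):
    """JS divergence between each filing's topic mixture and the previous one."""
    return [js_divergence(prev, cur) for prev, cur in zip(mixtures, mixtures[1:])]

# Hypothetical topic mixtures for three consecutive filings of one bank.
filings = [
    [0.70, 0.20, 0.10],
    [0.68, 0.22, 0.10],
    [0.30, 0.20, 0.50],
]
drift = topic_drift(filings)  # one drift value per filing after the first
```

In this toy series the second drift value is much larger than the first, which is the kind of jump a drift feature is meant to surface.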
Evaluation uses market-based forward drawdown labels (e.g., ≤ −20% within 60–120
trading days) and emphasizes precision, recall, F1, AUC-PR, and lead time rather than raw
accuracy. Single-feature HMMs (sentiment only; cluster-change only) provide transparent
baselines. A multifeature HMM improves recall of distress episodes relative to those
baselines but can generate more false alarms. To increase actionability, we introduce a
hybrid regime–market filter that requires both an elevated HMM distress probability and
contemporaneous market stress (elevated trailing drawdown or volatility) with a short
persistence rule. This hybrid step substantially lifts precision and F1—typically by several
tens of percentage points—while retaining non-trivial lead time (often one filing) in case
studies such as Silicon Valley Bank (SVB) versus a non-failure peer, Huntington Bancshares
(HBAN).
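
The hybrid regime–market filter described above can be sketched as follows. This is an illustrative reading of the abstract rather than the thesis's code: the function name, the 0.5 probability threshold, the two-filing persistence, and the sample inputs are assumptions, and market stress is taken as a precomputed boolean per filing.

```python
def hybrid_alerts(distress_prob, market_stress, prob_threshold=0.5, persistence=2):
    """Flag a filing when both conditions have held for `persistence`
    consecutive filings ending at that filing.

    distress_prob  -- HMM distress probability per filing (assumed given)
    market_stress  -- True where trailing drawdown or volatility is elevated
    """
    alerts = []
    run = 0  # length of the current streak where both conditions hold
    for p, stressed in zip(distress_prob, market_stress):
        run = run + 1 if (p >= prob_threshold and stressed) else 0
        alerts.append(run >= persistence)
    return alerts

# Hypothetical series of five filings for one bank.
probs = [0.10, 0.70, 0.80, 0.90, 0.40]
stress = [False, False, True, True, True]
alerts = hybrid_alerts(probs, stress)
```

Here the second filing has an elevated distress probability but no market confirmation, so it is not flagged. Suppressing alarms of that kind is how a confirmation step can raise precision.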
Robustness checks vary distress windows, thresholds, persistence, and regime count,
and show qualitatively stable trade-offs between sensitivity and specificity. The contribution
is a transparent, reproducible pipeline that couples unsupervised narrative signals with
regime switching and a pragmatic market confirmation step, yielding an early-warning
signal suitable for supervisory screening and risk monitoring.

Divisions: Concordia University > Faculty of Arts and Science > Mathematics and Statistics
Item Type: Thesis (Masters)
Authors: Roustaei, Javad
Institution: Concordia University
Degree Name: M. Sc.
Program: Mathematics
Date: 29 August 2025
Thesis Supervisor(s): Brugiapaglia, Simone and Hyndman, Cody
ID Code: 996160
Deposited By: Javad Roustaei
Deposited On: 04 Nov 2025 17:11
Last Modified: 04 Nov 2025 17:11
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
