SepMAMBAM: A Compute-Efficient, High-Performance Mamba-Based Network for Speech Separation

Soltani, Hamid (2025) SepMAMBAM: A Compute-Efficient, High-Performance Mamba-Based Network for Speech Separation. Masters thesis, Concordia University.

Text (application/pdf)
Soltani_MASc_S2026.pdf - Accepted Version
Restricted to Repository staff only until 1 January 2028.
Available under License Spectrum Terms of Access.

3MB

Abstract

Speech separation aims to extract individual sources from a single-channel mixture, a task that is fundamental to applications such as automatic speech recognition, hearing aids, and teleconferencing systems. Despite considerable progress, existing deep learning approaches face notable challenges. Recurrent and convolutional architectures often struggle to capture long-range temporal dependencies, while attention-based models such as Transformers, though effective, typically introduce high computational costs that limit their scalability and practicality. To address these limitations, we propose SepMAMBAM, an early-split separator that integrates the Mamba state-space model with convolutional block attention modules (CBAM). In our design, the encoder is composed of UMC (Unidirectional Mamba + CBAM) blocks that extract and refine mixture representations, while the decoder employs BMC (Bidirectional Mamba + CBAM) blocks to capture long-range dependencies and enhance separated streams through forward–backward context. This design enables efficient modeling of temporal dynamics while preserving clear separation between speakers. Experimental evaluation on the WSJ0-2Mix (clean) and WHAM! (noisy) datasets demonstrates that SepMAMBAM outperforms state-of-the-art SSM-based baselines and achieves competitive separation performance compared to leading Transformer-based models, while requiring substantially fewer parameters and lower computational cost. These results highlight SepMAMBAM as a promising framework for advancing speech separation research, particularly for deployment in resource-constrained or real-time scenarios.

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering
Item Type:	Thesis (Masters)
Authors:	Soltani, Hamid
Institution:	Concordia University
Degree Name:	M.A. Sc.
Program:	Electrical and Computer Engineering
Date:	November 2025
Thesis Supervisor(s):	Zhu, Wei-Ping and Ahmad, M. Omair
ID Code:	996657
Deposited By:	Hamid Soltani
Deposited On:	29 Jun 2026 14:42
Last Modified:	29 Jun 2026 14:42
