Dynamic Slimmable Network for Speech Separation

Elminshawi M, Chetupalli SR, Habets E (2024)

Publication Type: Journal article

Publication year: 2024

Journal

IEEE Signal Processing Letters

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Volume: 31

Page Range: 2205-2209

DOI: 10.1109/LSP.2024.3445304

Abstract

Neural networks for speech separation generally exhibit high computational costs and large memory footprints. Moreover, typical separation networks have a fixed computational graph that processes all input frames at a uniform computational cost, even though intensive processing may not be necessary for frames containing silence or a single active speaker. Addressing this computational inefficiency is especially crucial when these networks are deployed on resource-constrained devices. In this letter, we propose a dynamic slimmable network for speech separation that mitigates the computational inefficiency of existing networks. We introduce slimmable layers with a gating mechanism that can adapt their computational complexity based on the input characteristics. As an example, we propose to use the slimmable layers in the intra-chunk blocks of a dual-path structure-based network to facilitate adaptation based on the local characteristics of the input signal. Experimental evaluation on simulated two-speaker mixtures from the WSJ0-2mix dataset demonstrates that the proposed method substantially reduces the computational cost while maintaining comparable performance to fully utilized static networks.
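The idea of a slimmable layer with a gate, as described in the abstract, can be illustrated with a minimal sketch. This is not the authors' implementation: the class `SlimmableLinear`, the function `energy_gate`, the width ratios, and the energy threshold are all hypothetical names and values chosen for illustration. The abstract does not say how the gate is implemented, so a simple frame-energy heuristic stands in for it here.

```python
import random


class SlimmableLinear:
    """Linear layer whose active output width can be reduced at run time.

    Hypothetical illustration only, not the layer used in the paper.
    """

    def __init__(self, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.in_dim = in_dim
        self.out_dim = out_dim
        # One weight row per output unit.
        self.weight = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(out_dim)
        ]
        self.bias = [0.0] * out_dim

    def forward(self, x, width_ratio):
        """Compute only the first `active` output units; the rest stay zero.

        Returns the output vector and the number of multiply-accumulates used.
        """
        active = max(1, int(round(self.out_dim * width_ratio)))
        out = [0.0] * self.out_dim  # fixed output size, inactive units are zero
        for j in range(active):
            out[j] = sum(w * v for w, v in zip(self.weight[j], x)) + self.bias[j]
        macs = active * self.in_dim
        return out, macs


def energy_gate(frame, threshold=1e-3, widths=(0.25, 1.0)):
    """Assumed stand-in gate: low-energy frames get the narrow width."""
    energy = sum(v * v for v in frame) / len(frame)
    return widths[0] if energy < threshold else widths[1]


if __name__ == "__main__":
    layer = SlimmableLinear(in_dim=8, out_dim=16)
    for name, frame in (("silent", [0.0] * 8), ("active", [0.5] * 8)):
        ratio = energy_gate(frame)
        _, macs = layer.forward(frame, ratio)
        print(name, ratio, macs)
```

In this sketch a silent frame uses a quarter of the output units and therefore a quarter of the multiply-accumulate operations, while an active frame uses the full layer. Zeroing the inactive units keeps the output size fixed, so later layers need no changes.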

Authors with CRIS profile

Mohamed Elminshawi, International Audio Laboratories Erlangen (AudioLabs)

Emanuël Habets, Lehrstuhl für Sprach- und Akustische Signalverarbeitung

How to cite

APA:

Elminshawi, M., Chetupalli, S.R., & Habets, E. (2024). Dynamic Slimmable Network for Speech Separation. IEEE Signal Processing Letters, 31, 2205-2209. https://doi.org/10.1109/LSP.2024.3445304

MLA:

Elminshawi, Mohamed, Srikanth Raj Chetupalli, and Emanuël Habets. "Dynamic Slimmable Network for Speech Separation." IEEE Signal Processing Letters 31 (2024): 2205-2209.
