31 December 2019 Frame-level speech enhancement based on Wasserstein GAN
Proceedings Volume 11384, Eleventh International Conference on Signal Processing Systems; 113840G (2019) https://doi.org/10.1117/12.2559619
Event: Eleventh International Conference on Signal Processing Systems, 2019, Chengdu, China
Abstract
Speech enhancement is a challenging and critical task in speech processing research. In this paper, we propose a novel speech enhancement model based on Wasserstein generative adversarial networks, called WSEM. The proposed model operates on frame-level speech segments and uses an adjacent-frames extension mechanism to enforce the mapping from noisy speech to the clean target, which distinguishes it from other related GAN-based models. We compare the performance of WSEM with related work on benchmark datasets under different signal-to-noise ratio (SNR) conditions. Experimental results show that WSEM performs comparably to state-of-the-art approaches in all tests, and that it performs especially well in low-SNR environments.
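The abstract does not spell out how the adjacent-frames extension works. One common reading is context-window splicing, in which each frame is concatenated with a few neighbours on either side before being passed to the model. The following is a minimal sketch under that assumption; the function name `extend_with_adjacent_frames`, the `context` parameter, and the edge-padding rule (repeating the first or last frame) are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def extend_with_adjacent_frames(
    frames: Sequence[Sequence[float]], context: int = 2
) -> List[List[float]]:
    """Concatenate each frame with `context` neighbours on both sides.

    Assumption (not from the paper): frames at the edges are padded by
    repeating the first or last frame, so every output has the same length.
    """
    if context < 0:
        raise ValueError("context must be non-negative")
    n = len(frames)
    extended: List[List[float]] = []
    for t in range(n):
        window: List[float] = []
        for offset in range(-context, context + 1):
            # Clamp the neighbour index into the valid range [0, n - 1].
            idx = min(max(t + offset, 0), n - 1)
            window.extend(frames[idx])
        extended.append(window)
    return extended


# Toy example: three frames with two features each, one neighbour per side.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
extended = extend_with_adjacent_frames(frames, context=1)
```

With `context=1`, each two-feature frame becomes a six-feature vector covering the previous, current, and next frame. In a setup like this, the generator would typically see the extended noisy input while being trained to predict only the clean centre frame, though whether WSEM does so is not stated in the abstract.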
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Peng Chuan, Tian Lan, Meng Li, Sen Li, and Qiao Liu "Frame-level speech enhancement based on Wasserstein GAN", Proc. SPIE 11384, Eleventh International Conference on Signal Processing Systems, 113840G (31 December 2019); https://doi.org/10.1117/12.2559619
KEYWORDS
Signal to noise ratio
Gallium nitride
Performance modeling
Data modeling
Neural networks
Image filtering
Signal processing
