N-HANS: A neural network-based toolkit for in-the-wild audio enhancement



Bibliographic details
Published in:	Multimedia Tools and Applications, 2021-07, Vol. 80 (18), pp. 28365-28389
Main authors:	Liu, Shuo; Keren, Gil; Parada-Cabaleiro, Emilia; Schuller, Björn
Format:	Article
Language:	English
Subjects:	Computer Communication Networks; Computer Science; Data Structures and Information Theory; Line interfaces; Multimedia Information Systems; Neural networks; Noise; Noise pollution; Noise reduction; Special Purpose and Application-Based Systems; Speech processing; Toolkits
Online access:	Full text

Description
Abstract:	The unprecedented growth of noise pollution over recent decades has created an ever-increasing need for efficient audio enhancement technologies. Yet the variety of difficulties involved in processing audio sources in the wild, such as handling unseen noises or suppressing specific interferences, keeps audio enhancement an open challenge. In this regard, we present N-HANS (the Neuro-Holistic Audio-eNhancement System), a Python toolkit for in-the-wild audio enhancement that includes functionalities for audio denoising, source separation, and, for the first time in such a toolkit, selective noise suppression. The N-HANS architecture is specifically designed to adapt automatically to different environmental backgrounds and speakers. This is achieved by using two identical neural networks composed of stacks of residual blocks, each conditioned on additional speech- and noise-based recordings through auxiliary sub-networks. Alongside a Python API, a command line interface is provided to researchers and developers; both are carefully documented. Experimental results indicate that N-HANS performs well with respect to existing methods while preserving audio quality at a high level, thus ensuring reliable use in real-life applications, e.g., in-the-wild speech processing, which encourages the development of speech-based intelligent technology.
ISSN:	1380-7501 1573-7721
DOI:	10.1007/s11042-021-11080-y
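
Illustrative note: the abstract describes stacks of residual blocks conditioned on speech- and noise-based reference recordings through auxiliary sub-networks. The sketch below is one possible reading of that idea in plain Python and is not the N-HANS implementation or its API. All names (`ConditionedResidualBlock`, `run_stack`, the embedding vectors) are hypothetical, and the additive-bias form of conditioning is an assumption, since the abstract does not say how the embeddings are injected.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, v) for v in vector]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class ConditionedResidualBlock:
    """Hypothetical residual block whose hidden layer is shifted by two embeddings.

    Assumption: conditioning is additive. The speech and noise embeddings
    (stand-ins for the outputs of auxiliary sub-networks) are projected to the
    feature dimension and added before the non-linearity.
    """

    def __init__(self, feature_dim, embed_dim, rng):
        self.w_in = random_matrix(feature_dim, feature_dim, rng)
        self.w_out = random_matrix(feature_dim, feature_dim, rng)
        self.p_speech = random_matrix(feature_dim, embed_dim, rng)
        self.p_noise = random_matrix(feature_dim, embed_dim, rng)

    def __call__(self, x, speech_embedding, noise_embedding):
        hidden = matvec(self.w_in, x)
        speech_bias = matvec(self.p_speech, speech_embedding)
        noise_bias = matvec(self.p_noise, noise_embedding)
        hidden = relu([h + s + n for h, s, n in zip(hidden, speech_bias, noise_bias)])
        update = matvec(self.w_out, hidden)
        # Residual (skip) connection: the block learns a correction to x.
        return [xi + ui for xi, ui in zip(x, update)]


def run_stack(blocks, x, speech_embedding, noise_embedding):
    """Pass one feature frame through a stack of conditioned blocks."""
    for block in blocks:
        x = block(x, speech_embedding, noise_embedding)
    return x


rng = random.Random(0)
blocks = [ConditionedResidualBlock(feature_dim=4, embed_dim=2, rng=rng) for _ in range(3)]
frame = [0.5, -0.2, 0.1, 0.9]        # toy spectral frame
out_a = run_stack(blocks, frame, [1.0, 0.0], [0.0, 1.0])
out_b = run_stack(blocks, frame, [1.0, 0.0], [1.0, 0.0])  # different noise reference
```

In this toy setup the same input frame is processed under two different noise embeddings, which will generally yield different outputs; that is the sense in which a conditioned network can adapt to a given background without retraining. A real system would learn the weights and compute the embeddings from the reference recordings.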