Assessing the effectiveness of transfer learning strategies in BLSTM networks for speech denoising

Denoising speech signals represents a challenging task due to the increasing number of applications and technologies currently implemented in communication and portable devices. In those applications, challenging environmental conditions such as background noise, reverberation, and other sound artifa...

Bibliographic details
Published in: Tecnología en marcha 2022-11
Main authors: Marvin Coto-Jiménez, Astryd González-Salazar, Michelle Gutiérrez-Muñoz
Format: Article
Language: eng
Online access: Full text
container_end_page
container_issue
container_start_page
container_title Tecnología en marcha
container_volume
creator Marvin Coto-Jiménez
Astryd González-Salazar
Michelle Gutiérrez-Muñoz
description Denoising speech signals represents a challenging task due to the increasing number of applications and technologies currently implemented in communication and portable devices. In those applications, challenging environmental conditions such as background noise, reverberation, and other sound artifacts can affect the quality of the signals. As a result, they also impact systems for speech recognition, speaker identification, and sound source localization, among many others. To denoise speech signals degraded by many kinds and possibly different levels of noise, several algorithms have been proposed over the past decades, with recent proposals based on deep learning presented as state-of-the-art, in particular those based on Long Short-Term Memory networks (LSTM and bidirectional LSTM, BLSTM). This work presents a comparative study of different transfer learning strategies for reducing the training time and increasing the effectiveness of this kind of network. Reducing training time is one of the most critical challenges because of the high computational cost of training LSTM and BLSTM networks. The strategies arise from the different options for initializing the networks, using clean or noisy information of several types. Results show the convenience of transferring information from a single denoising network to the rest, with a significant effect on both the training time, which is reduced, and the denoising capabilities of the BLSTM networks.
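The initialization idea in the description above can be illustrated with a deliberately tiny sketch. It is not the authors' method and not a BLSTM: a one-parameter linear gain stands in for the network, and the signals, the noise pattern, the learning rate, and the function names (`noisy`, `train`) are all assumptions made for illustration. The sketch only shows why starting from parameters trained on one noise condition can shorten training on a related one.

```python
# Toy stand-in for transfer learning by weight initialization.
# Assumption: a single gain w replaces the BLSTM; all data are made up.

def noisy(clean, noise, level):
    """Add a fixed noise pattern scaled by `level` to the clean signal."""
    return [c + level * n for c, n in zip(clean, noise)]

def train(x, target, w_init, lr=0.05, tol=1e-6, max_steps=10_000):
    """Fit y = w * x to `target` by gradient descent on the mean squared error.

    Returns the fitted gain and the number of update steps taken.
    """
    w = w_init
    n = len(x)
    for step in range(max_steps):
        grad = 2.0 * sum((w * xi - ti) * xi for xi, ti in zip(x, target)) / n
        if abs(grad) < tol:
            return w, step
        w -= lr * grad
    return w, max_steps

clean = [1.0, -2.0, 3.0, -1.0, 2.0]
noise = [0.5, 0.5, -0.5, 0.5, -0.5]

# Base model: trained from scratch on noise condition A.
w_a, steps_a = train(noisy(clean, noise, 1.0), clean, w_init=0.0)

# Condition B (higher noise level): cold start vs. start from w_a.
x_b = noisy(clean, noise, 1.2)
w_cold, steps_cold = train(x_b, clean, w_init=0.0)
w_warm, steps_warm = train(x_b, clean, w_init=w_a)

print(f"cold start: {steps_cold} steps, transferred start: {steps_warm} steps")
```

In this toy setting the transferred start needs fewer updates because `w_a` already lies close to the optimum for condition B. Whether and how much this carries over to BLSTM networks is what the study evaluates.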
doi_str_mv 10.18845/tm.v35i8.6448
format Article
date 2022-11-16
eissn 2215-3241
toplevel peer_reviewed; online_resources
fulltext fulltext
identifier ISSN: 0379-3982
ispartof Tecnología en marcha, 2022-11
issn 0379-3982
2215-3241
language eng
recordid cdi_crossref_primary_10_18845_tm_v35i8_6448
source DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
title Assessing the effectiveness of transfer learning strategies in BLSTM networks for speech denoising
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T20%3A55%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Assessing%20the%20effectiveness%20of%20transfer%20learning%20strategies%20in%20BLSTM%20networks%20for%20speech%20fenoising&rft.jtitle=Tecnolog%C3%ADa%20en%20marcha&rft.au=Marvin%20Coto-Jim%C3%A9nez&rft.date=2022-11-16&rft.issn=0379-3982&rft.eissn=2215-3241&rft_id=info:doi/10.18845/tm.v35i8.6448&rft_dat=%3Ccrossref%3E10_18845_tm_v35i8_6448%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true