DNN-Based Voice Activity Detection with Multi-Task Learning
Recently, notable improvements in the voice activity detection (VAD) problem have been achieved by adopting several machine learning techniques. Among them, the deep neural network (DNN), which learns the mapping between the noisy speech features and the corresponding voice activity status with its deep...
Published in: | IEICE Transactions on Information and Systems 2016/02/01, Vol.E99.D(2), pp.550-553 |
---|---|
Main authors: | KANG, Tae Gyoon; KIM, Nam Soo |
Format: | Article |
Language: | eng |
Subjects: | deep neural network; multi-task learning; voice activity detection |
Online access: | Full text |
container_end_page | 553 |
---|---|
container_issue | 2 |
container_start_page | 550 |
container_title | IEICE Transactions on Information and Systems |
container_volume | E99.D |
creator | KANG, Tae Gyoon; KIM, Nam Soo |
description | Recently, notable improvements in the voice activity detection (VAD) problem have been achieved by adopting several machine learning techniques. Among them, the deep neural network (DNN), which learns the mapping between the noisy speech features and the corresponding voice activity status with its deep hidden structure, has been one of the most popular techniques. In this letter, we propose a novel approach that enhances the robustness of the DNN in mismatched noise conditions with a multi-task learning (MTL) framework. In the proposed algorithm, a feature enhancement task for speech features is jointly trained with the conventional VAD task. The experimental results show that the DNN with the proposed framework outperforms the conventional DNN-based VAD algorithm. |
doi_str_mv | 10.1587/transinf.2015EDL8168 |
format | Article |
fullrecord | <record><control><sourceid>jstage_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1587_transinf_2015EDL8168</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>article_transinf_E99_D_2_E99_D_2015EDL8168_article_char_en</sourcerecordid><originalsourceid>FETCH-LOGICAL-c436t-d254c1384a97c23d319be5032faa44268e8635ce8d99c54bddd78f94f0ff37e03</originalsourceid><addsrcrecordid>eNpNkMtOAjEARRujiSP6By7mB4p9zrRxhQw-khFdoNum9AFFHExbNfy9GARc3bu45y4OAJcY9TEX9VWOukuh832CMB81rcCVOAIFrhmHmFb4GBRI4goKTskpOEtpgRAWBPMCXDfjMbzRydnydRWMKwcmh6-Q12Xjstv0VVd-hzwvHz-XOcCJTm9l63TsQjc7BydeL5O7-MseeLkdTYb3sH26exgOWmgYrTK0hDODqWBa1oZQS7GcOo4o8VozRirhREW5ccJKaTibWmtr4SXzyHtaO0R7gG1_TVylFJ1XHzG867hWGKlfAWonQP0TsMGet9giZT1ze0jHHMzSHaCRlKpRZJeHi_3UzHVUrqM_XNRtLQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>DNN-Based Voice Activity Detection with Multi-Task Learning</title><source>J-STAGE (Japan Science & Technology Information Aggregator, Electronic) Freely Available Titles - Japanese</source><source>EZB-FREE-00999 freely available EZB journals</source><creator>KANG, Tae Gyoon ; KIM, Nam Soo</creator><creatorcontrib>KANG, Tae Gyoon ; KIM, Nam Soo</creatorcontrib><description>Recently, notable improvements in voice activity detection (VAD) problem have been achieved by adopting several machine learning techniques. Among them, the deep neural network (DNN) which learns the mapping between the noisy speech features and the corresponding voice activity status with its deep hidden structure has been one of the most popular techniques. In this letter, we propose a novel approach which enhances the robustness of DNN in mismatched noise conditions with multi-task learning (MTL) framework. In the proposed algorithm, a feature enhancement task for speech features is jointly trained with the conventional VAD task. 
The experimental results show that the DNN with the proposed framework outperforms the conventional DNN-based VAD algorithm.</description><identifier>ISSN: 0916-8532</identifier><identifier>EISSN: 1745-1361</identifier><identifier>DOI: 10.1587/transinf.2015EDL8168</identifier><language>eng</language><publisher>The Institute of Electronics, Information and Communication Engineers</publisher><subject>deep neural network ; multi-task learning ; voice activity detection</subject><ispartof>IEICE Transactions on Information and Systems, 2016/02/01, Vol.E99.D(2), pp.550-553</ispartof><rights>2016 The Institute of Electronics, Information and Communication Engineers</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c436t-d254c1384a97c23d319be5032faa44268e8635ce8d99c54bddd78f94f0ff37e03</citedby><cites>FETCH-LOGICAL-c436t-d254c1384a97c23d319be5032faa44268e8635ce8d99c54bddd78f94f0ff37e03</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1883,27924,27925</link.rule.ids></links><search><creatorcontrib>KANG, Tae Gyoon</creatorcontrib><creatorcontrib>KIM, Nam Soo</creatorcontrib><title>DNN-Based Voice Activity Detection with Multi-Task Learning</title><title>IEICE Transactions on Information and Systems</title><addtitle>IEICE Trans. Inf. & Syst.</addtitle><description>Recently, notable improvements in voice activity detection (VAD) problem have been achieved by adopting several machine learning techniques. Among them, the deep neural network (DNN) which learns the mapping between the noisy speech features and the corresponding voice activity status with its deep hidden structure has been one of the most popular techniques. 
In this letter, we propose a novel approach which enhances the robustness of DNN in mismatched noise conditions with multi-task learning (MTL) framework. In the proposed algorithm, a feature enhancement task for speech features is jointly trained with the conventional VAD task. The experimental results show that the DNN with the proposed framework outperforms the conventional DNN-based VAD algorithm.</description><subject>deep neural network</subject><subject>multi-task learning</subject><subject>voice activity detection</subject><issn>0916-8532</issn><issn>1745-1361</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><recordid>eNpNkMtOAjEARRujiSP6By7mB4p9zrRxhQw-khFdoNum9AFFHExbNfy9GARc3bu45y4OAJcY9TEX9VWOukuh832CMB81rcCVOAIFrhmHmFb4GBRI4goKTskpOEtpgRAWBPMCXDfjMbzRydnydRWMKwcmh6-Q12Xjstv0VVd-hzwvHz-XOcCJTm9l63TsQjc7BydeL5O7-MseeLkdTYb3sH26exgOWmgYrTK0hDODqWBa1oZQS7GcOo4o8VozRirhREW5ccJKaTibWmtr4SXzyHtaO0R7gG1_TVylFJ1XHzG867hWGKlfAWonQP0TsMGet9giZT1ze0jHHMzSHaCRlKpRZJeHi_3UzHVUrqM_XNRtLQ</recordid><startdate>20160201</startdate><enddate>20160201</enddate><creator>KANG, Tae Gyoon</creator><creator>KIM, Nam Soo</creator><general>The Institute of Electronics, Information and Communication Engineers</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20160201</creationdate><title>DNN-Based Voice Activity Detection with Multi-Task Learning</title><author>KANG, Tae Gyoon ; KIM, Nam Soo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c436t-d254c1384a97c23d319be5032faa44268e8635ce8d99c54bddd78f94f0ff37e03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>deep neural network</topic><topic>multi-task learning</topic><topic>voice activity detection</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>KANG, Tae 
Gyoon</creatorcontrib><creatorcontrib>KIM, Nam Soo</creatorcontrib><collection>CrossRef</collection><jtitle>IEICE Transactions on Information and Systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>KANG, Tae Gyoon</au><au>KIM, Nam Soo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>DNN-Based Voice Activity Detection with Multi-Task Learning</atitle><jtitle>IEICE Transactions on Information and Systems</jtitle><addtitle>IEICE Trans. Inf. & Syst.</addtitle><date>2016-02-01</date><risdate>2016</risdate><volume>E99.D</volume><issue>2</issue><spage>550</spage><epage>553</epage><pages>550-553</pages><issn>0916-8532</issn><eissn>1745-1361</eissn><abstract>Recently, notable improvements in voice activity detection (VAD) problem have been achieved by adopting several machine learning techniques. Among them, the deep neural network (DNN) which learns the mapping between the noisy speech features and the corresponding voice activity status with its deep hidden structure has been one of the most popular techniques. In this letter, we propose a novel approach which enhances the robustness of DNN in mismatched noise conditions with multi-task learning (MTL) framework. In the proposed algorithm, a feature enhancement task for speech features is jointly trained with the conventional VAD task. The experimental results show that the DNN with the proposed framework outperforms the conventional DNN-based VAD algorithm.</abstract><pub>The Institute of Electronics, Information and Communication Engineers</pub><doi>10.1587/transinf.2015EDL8168</doi><tpages>4</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0916-8532 |
ispartof | IEICE Transactions on Information and Systems, 2016/02/01, Vol.E99.D(2), pp.550-553 |
issn | 0916-8532; 1745-1361 |
language | eng |
recordid | cdi_crossref_primary_10_1587_transinf_2015EDL8168 |
source | J-STAGE (Japan Science & Technology Information Aggregator, Electronic) Freely Available Titles - Japanese; EZB-FREE-00999 freely available EZB journals |
subjects | deep neural network; multi-task learning; voice activity detection |
title | DNN-Based Voice Activity Detection with Multi-Task Learning |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T11%3A26%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstage_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DNN-Based%20Voice%20Activity%20Detection%20with%20Multi-Task%20Learning&rft.jtitle=IEICE%20Transactions%20on%20Information%20and%20Systems&rft.au=KANG,%20Tae%20Gyoon&rft.date=2016-02-01&rft.volume=E99.D&rft.issue=2&rft.spage=550&rft.epage=553&rft.pages=550-553&rft.issn=0916-8532&rft.eissn=1745-1361&rft_id=info:doi/10.1587/transinf.2015EDL8168&rft_dat=%3Cjstage_cross%3Earticle_transinf_E99_D_2_E99_D_2015EDL8168_article_char_en%3C/jstage_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
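
The description above says that a feature enhancement task is trained jointly with the conventional VAD task. The record gives no architecture or loss details, so the following is only a minimal sketch of what such a joint objective could look like, not the authors' implementation. Everything specific here is an assumption made for illustration: the single shared tanh layer, the two output heads, the binary cross-entropy and mean-squared-error losses, the weighting parameter `enh_weight`, and all function and parameter names.

```python
import numpy as np


def init_params(n_in, n_hidden, seed=0):
    """Create weights for one shared hidden layer and two task heads (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    return {
        "W_h": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b_h": np.zeros(n_hidden),
        "W_vad": rng.normal(0.0, 0.1, (n_hidden, 1)),
        "b_vad": np.zeros(1),
        "W_enh": rng.normal(0.0, 0.1, (n_hidden, n_in)),
        "b_enh": np.zeros(n_in),
    }


def forward(params, noisy):
    """Map noisy features to (speech probability per frame, enhanced features)."""
    # Shared representation used by both tasks.
    hidden = np.tanh(noisy @ params["W_h"] + params["b_h"])
    # VAD head: one sigmoid output per frame.
    logits = (hidden @ params["W_vad"] + params["b_vad"])[:, 0]
    p_speech = 1.0 / (1.0 + np.exp(-logits))
    # Enhancement head: linear regression toward clean features.
    enhanced = hidden @ params["W_enh"] + params["b_enh"]
    return p_speech, enhanced


def mtl_loss(params, noisy, clean, labels, enh_weight=0.5):
    """Joint loss: VAD cross-entropy plus weighted enhancement MSE (assumed form)."""
    p_speech, enhanced = forward(params, noisy)
    p_speech = np.clip(p_speech, 1e-7, 1.0 - 1e-7)  # avoid log(0)
    bce = -np.mean(labels * np.log(p_speech) + (1.0 - labels) * np.log(1.0 - p_speech))
    mse = np.mean((enhanced - clean) ** 2)
    return bce + enh_weight * mse, bce, mse


# Toy usage with random data: 4 frames of 8-dimensional features.
rng = np.random.default_rng(1)
clean = rng.normal(size=(4, 8))
noisy = clean + 0.3 * rng.normal(size=(4, 8))
labels = np.array([1.0, 0.0, 1.0, 1.0])
params = init_params(n_in=8, n_hidden=16)
total, bce, mse = mtl_loss(params, noisy, clean, labels)
```

In a sketch like this, the enhancement head acts only as an auxiliary training signal on the shared layer; at detection time one would read `p_speech` alone and could discard the enhancement output.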