Efficient spoken term discovery using randomized algorithms

Spoken term discovery is the task of automatically identifying words and phrases in speech data by searching for long repeated acoustic patterns. Initial solutions relied on exhaustive dynamic time warping-based searches across the entire similarity matrix, a method whose scalability is ultimately l...

Full description

Bibliographic details
Main authors: Jansen, A., Van Durme, B.
Format: Conference proceeding
Language: eng
Subjects:
Online access: Order full text
container_end_page 406
container_issue
container_start_page 401
container_title
container_volume
creator Jansen, A.
Van Durme, B.
description Spoken term discovery is the task of automatically identifying words and phrases in speech data by searching for long repeated acoustic patterns. Initial solutions relied on exhaustive dynamic time warping-based searches across the entire similarity matrix, a method whose scalability is ultimately limited by the O(n^2) nature of the search space. Recent strategies have attempted to improve search efficiency by using either unsupervised or mismatched-language acoustic models to reduce the complexity of the feature representation. Taking a completely different approach, this paper investigates the use of randomized algorithms that operate directly on the raw acoustic features to produce sparse approximate similarity matrices in O(n) space and O(n log n) time. We demonstrate these techniques facilitate spoken term discovery performance capable of outperforming a model-based strategy in the zero resource setting.
doi_str_mv 10.1109/ASRU.2011.6163965
format Conference Proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6163965</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6163965</ieee_id><sourcerecordid>6163965</sourcerecordid><originalsourceid>FETCH-LOGICAL-c138t-35e8bdb5ca69768c6bcf33ba99cce69f2b083f3e61cf6b55ba7faf43502e77a53</originalsourceid><addsrcrecordid>eNo1j8FKAzEURSMiqHU-QNzMD8yYzGteElyVUqtQENSuS5J5qdHOTElGoX69gvVuDmdz4DJ2LXgtBDe3s5fndd1wIWoUCAblCSuM0mKKCjigkqfs8l-kPmdFzu_8d4haoblgd4sQoo_Uj2XeDx_UlyOlrmxj9sMXpUP5mWO_LZPt26GL39SWdrcdUhzfunzFzoLdZSqOnLD1_eJ1_lCtnpaP89mq8gL0WIEk7VonvUWjUHt0PgA4a4z3hCY0jmsIQCh8QCelsyrYMAXJG1LKSpiwm79uJKLNPsXOpsPm-Bd-AOx-SmY</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Efficient spoken term discovery using randomized algorithms</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Jansen, A. ; Van Durme, B.</creator><creatorcontrib>Jansen, A. ; Van Durme, B.</creatorcontrib><description>Spoken term discovery is the task of automatically identifying words and phrases in speech data by searching for long repeated acoustic patterns. Initial solutions relied on exhaustive dynamic time warping-based searches across the entire similarity matrix, a method whose scalability is ultimately limited by the O(n 2 ) nature of the search space. Recent strategies have attempted to improve search efficiency by using either unsupervised or mismatched-language acoustic models to reduce the complexity of the feature representation. Taking a completely different approach, this paper investigates the use of randomized algorithms that operate directly on the raw acoustic features to produce sparse approximate similarity matrices in O(n) space and O(n log n) time. 
We demonstrate these techniques facilitate spoken term discovery performance capable of outperforming a model-based strategy in the zero resource setting.</description><identifier>ISBN: 1467303658</identifier><identifier>ISBN: 9781467303651</identifier><identifier>EISBN: 9781467303675</identifier><identifier>EISBN: 1467303666</identifier><identifier>EISBN: 9781467303668</identifier><identifier>EISBN: 1467303674</identifier><identifier>DOI: 10.1109/ASRU.2011.6163965</identifier><language>eng</language><publisher>IEEE</publisher><subject>Acoustics ; Approximation algorithms ; Approximation methods ; Image segmentation ; Sparse matrices ; Speech ; Vectors</subject><ispartof>2011 IEEE Workshop on Automatic Speech Recognition &amp; Understanding, 2011, p.401-406</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c138t-35e8bdb5ca69768c6bcf33ba99cce69f2b083f3e61cf6b55ba7faf43502e77a53</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6163965$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,777,781,786,787,2052,27906,54901</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6163965$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Jansen, A.</creatorcontrib><creatorcontrib>Van Durme, B.</creatorcontrib><title>Efficient spoken term discovery using randomized algorithms</title><title>2011 IEEE Workshop on Automatic Speech Recognition &amp; Understanding</title><addtitle>ASRU</addtitle><description>Spoken term discovery is the task of automatically identifying words and phrases in speech data by searching for long repeated acoustic patterns. 
Initial solutions relied on exhaustive dynamic time warping-based searches across the entire similarity matrix, a method whose scalability is ultimately limited by the O(n 2 ) nature of the search space. Recent strategies have attempted to improve search efficiency by using either unsupervised or mismatched-language acoustic models to reduce the complexity of the feature representation. Taking a completely different approach, this paper investigates the use of randomized algorithms that operate directly on the raw acoustic features to produce sparse approximate similarity matrices in O(n) space and O(n log n) time. We demonstrate these techniques facilitate spoken term discovery performance capable of outperforming a model-based strategy in the zero resource setting.</description><subject>Acoustics</subject><subject>Approximation algorithms</subject><subject>Approximation methods</subject><subject>Image segmentation</subject><subject>Sparse matrices</subject><subject>Speech</subject><subject>Vectors</subject><isbn>1467303658</isbn><isbn>9781467303651</isbn><isbn>9781467303675</isbn><isbn>1467303666</isbn><isbn>9781467303668</isbn><isbn>1467303674</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2011</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNo1j8FKAzEURSMiqHU-QNzMD8yYzGteElyVUqtQENSuS5J5qdHOTElGoX69gvVuDmdz4DJ2LXgtBDe3s5fndd1wIWoUCAblCSuM0mKKCjigkqfs8l-kPmdFzu_8d4haoblgd4sQoo_Uj2XeDx_UlyOlrmxj9sMXpUP5mWO_LZPt26GL39SWdrcdUhzfunzFzoLdZSqOnLD1_eJ1_lCtnpaP89mq8gL0WIEk7VonvUWjUHt0PgA4a4z3hCY0jmsIQCh8QCelsyrYMAXJG1LKSpiwm79uJKLNPsXOpsPm-Bd-AOx-SmY</recordid><startdate>201112</startdate><enddate>201112</enddate><creator>Jansen, A.</creator><creator>Van Durme, B.</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201112</creationdate><title>Efficient spoken 
term discovery using randomized algorithms</title><author>Jansen, A. ; Van Durme, B.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c138t-35e8bdb5ca69768c6bcf33ba99cce69f2b083f3e61cf6b55ba7faf43502e77a53</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Acoustics</topic><topic>Approximation algorithms</topic><topic>Approximation methods</topic><topic>Image segmentation</topic><topic>Sparse matrices</topic><topic>Speech</topic><topic>Vectors</topic><toplevel>online_resources</toplevel><creatorcontrib>Jansen, A.</creatorcontrib><creatorcontrib>Van Durme, B.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Electronic Library (IEL)</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Jansen, A.</au><au>Van Durme, B.</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Efficient spoken term discovery using randomized algorithms</atitle><btitle>2011 IEEE Workshop on Automatic Speech Recognition &amp; Understanding</btitle><stitle>ASRU</stitle><date>2011-12</date><risdate>2011</risdate><spage>401</spage><epage>406</epage><pages>401-406</pages><isbn>1467303658</isbn><isbn>9781467303651</isbn><eisbn>9781467303675</eisbn><eisbn>1467303666</eisbn><eisbn>9781467303668</eisbn><eisbn>1467303674</eisbn><abstract>Spoken term discovery is the task of automatically identifying words and phrases in speech data by searching for long repeated acoustic patterns. 
Initial solutions relied on exhaustive dynamic time warping-based searches across the entire similarity matrix, a method whose scalability is ultimately limited by the O(n 2 ) nature of the search space. Recent strategies have attempted to improve search efficiency by using either unsupervised or mismatched-language acoustic models to reduce the complexity of the feature representation. Taking a completely different approach, this paper investigates the use of randomized algorithms that operate directly on the raw acoustic features to produce sparse approximate similarity matrices in O(n) space and O(n log n) time. We demonstrate these techniques facilitate spoken term discovery performance capable of outperforming a model-based strategy in the zero resource setting.</abstract><pub>IEEE</pub><doi>10.1109/ASRU.2011.6163965</doi><tpages>6</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISBN: 1467303658
ispartof 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011, p.401-406
issn
language eng
recordid cdi_ieee_primary_6163965
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Acoustics
Approximation algorithms
Approximation methods
Image segmentation
Sparse matrices
Speech
Vectors
title Efficient spoken term discovery using randomized algorithms
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T18%3A03%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Efficient%20spoken%20term%20discovery%20using%20randomized%20algorithms&rft.btitle=2011%20IEEE%20Workshop%20on%20Automatic%20Speech%20Recognition%20&%20Understanding&rft.au=Jansen,%20A.&rft.date=2011-12&rft.spage=401&rft.epage=406&rft.pages=401-406&rft.isbn=1467303658&rft.isbn_list=9781467303651&rft_id=info:doi/10.1109/ASRU.2011.6163965&rft_dat=%3Cieee_6IE%3E6163965%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9781467303675&rft.eisbn_list=1467303666&rft.eisbn_list=9781467303668&rft.eisbn_list=1467303674&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6163965&rfr_iscdi=true
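
Illustrative sketch (not part of the catalog record): the description above says that randomized algorithms produce a sparse approximate similarity matrix in O(n) space and O(n log n) time, but this record does not say which algorithms are used. The Python sketch below assumes one well-known option, random-hyperplane locality-sensitive hashing followed by a sort, purely to show how such bounds can arise. All function names, the 64-bit signature length, and the neighbour window are hypothetical choices for illustration and should not be read as the authors' implementation.

```python
import math
import random


def random_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions (assumed scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, planes):
    """Pack the sign of each projection into one integer bit signature."""
    bits = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        bits = (bits << 1) | (1 if dot >= 0.0 else 0)
    return bits


def approx_cosine(sig_a, sig_b, n_bits):
    """Estimate cosine similarity from the Hamming distance of two signatures."""
    hamming = bin(sig_a ^ sig_b).count("1")
    return math.cos(math.pi * hamming / n_bits)


def sparse_neighbours(frames, n_bits=64, window=2, seed=0):
    """Return {(i, j): approx_similarity} for frames adjacent in signature order.

    Sorting costs O(n log n) comparisons, and each frame contributes at most
    `window` entries, so the sparse result holds O(n) pairs.
    """
    planes = random_hyperplanes(len(frames[0]), n_bits, seed)
    sigs = [signature(f, planes) for f in frames]
    order = sorted(range(len(frames)), key=lambda i: sigs[i])
    pairs = {}
    for pos, i in enumerate(order):
        for j in order[pos + 1 : pos + 1 + window]:
            pairs[(min(i, j), max(i, j))] = approx_cosine(sigs[i], sigs[j], n_bits)
    return pairs


if __name__ == "__main__":
    # Toy "acoustic frames": frames 0 and 2 are identical, the others differ.
    toy_frames = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    for pair, sim in sorted(sparse_neighbours(toy_frames).items()):
        print(pair, round(sim, 3))
```

A single sort only pairs frames whose signatures share leading bits. A practical system would likely repeat the sort under several random bit permutations to recover more true neighbours, which is again an assumption rather than something stated in this record.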