Audiovisual automatic speech segmentation

Audiovisual speech segmentation, which uses visual information together with audio data, is introduced. Combining audio and visual data lowers the average absolute boundary error between manual and automatic segmentation, which directly affects the quality of speech...
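
The evaluation metric named in the abstract, the average absolute boundary error, can be illustrated with a minimal sketch. The function names and the boundary times below are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's exact evaluation procedure may differ.

```python
from typing import Sequence


def average_absolute_boundary_error(
    manual: Sequence[float], automatic: Sequence[float]
) -> float:
    """Mean of |manual - automatic| over paired boundaries (same unit as the inputs)."""
    if len(manual) != len(automatic) or len(manual) == 0:
        raise ValueError("boundary lists must be non-empty and of equal length")
    return sum(abs(m - a) for m, a in zip(manual, automatic)) / len(manual)


def relative_reduction_percent(baseline_error: float, new_error: float) -> float:
    """Percentage by which new_error is lower than baseline_error."""
    if baseline_error == 0:
        raise ValueError("baseline error must be non-zero")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Made-up boundary times in milliseconds (assumption, not data from the paper).
manual_ms = [100, 250, 400]
audio_only_ms = [120, 220, 430]
audiovisual_ms = [110, 230, 420]

baseline = average_absolute_boundary_error(manual_ms, audio_only_ms)
fused = average_absolute_boundary_error(manual_ms, audiovisual_ms)
print(f"audio-only error:  {baseline:.2f} ms")
print(f"audiovisual error: {fused:.2f} ms")
print(f"relative reduction: {relative_reduction_percent(baseline, fused):.2f}%")
```

A percentage figure such as the one reported in the abstract would correspond to the relative reduction computed in the last step, assuming it is measured against an audio-only baseline.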

Bibliographic details
Main authors: Akdemir, E., Ciloglu, T.
Format: Conference proceeding
Language: eng
container_end_page 899
container_issue
container_start_page 896
container_title
container_volume
creator Akdemir, E.
Ciloglu, T.
description Audiovisual speech segmentation, which uses visual information together with audio data, is introduced. Combining audio and visual data lowers the average absolute boundary error between manual and automatic segmentation, which directly affects the quality of speech processing systems that use the segmented database. The audio and visual feature vectors are fused at the feature level and used in an HMM-based speech segmentation system. A Turkish audiovisual speech database was prepared and used in the experiments. The average absolute boundary error decreases by up to 20.82% when different audiovisual feature vectors are used.
doi_str_mv 10.1109/SIU.2011.5929796
format Conference Proceeding
fulltext fulltext_linktorsrc
identifier ISSN: 2165-0608
ispartof 2011 IEEE 19th Signal Processing and Communications Applications Conference (SIU), 2011, p.896-899
issn 2165-0608
2693-3616
language eng
recordid cdi_ieee_primary_5929796
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Conferences
Hidden Markov models
Mel frequency cepstral coefficient
Speech
Speech processing
Visualization
title Audiovisual automatic speech segmentation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T20%3A26%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Audiovusual%20automatic%20speech%20segmentation&rft.btitle=2011%20IEEE%2019th%20Signal%20Processing%20and%20Communications%20Applications%20Conference%20(SIU)&rft.au=Akdemir,%20E.&rft.date=2011-04&rft.spage=896&rft.epage=899&rft.pages=896-899&rft.issn=2165-0608&rft.eissn=2693-3616&rft.isbn=1457704625&rft.isbn_list=9781457704628&rft_id=info:doi/10.1109/SIU.2011.5929796&rft_dat=%3Cieee_6IE%3E5929796%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=1457704633&rft.eisbn_list=9781457704635&rft.eisbn_list=9781457704611&rft.eisbn_list=1457704617&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=5929796&rfr_iscdi=true