An Online Audio Indexing System

This paper presents an overview of an online audio indexing system, which creates a searchable index of speech content embedded in digitized audio files. This system is based on our recently proposed offline audio segmentation techniques. As the data arrives continuously, the system first finds boundar...

Detailed description

Bibliographic details
Main authors:	Ajmera, Jitendra; McCowan, Iain A; Bourlard, Hervé
Format:	Web Resource
Language:	eng

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Ajmera, Jitendra ; McCowan, Iain A ; Bourlard, Hervé
description	This paper presents an overview of an online audio indexing system, which creates a searchable index of speech content embedded in digitized audio files. The system is based on our recently proposed offline audio segmentation techniques. As the data arrives continuously, the system first finds the boundaries of acoustically homogeneous segments. Next, each of these segments is classified as speech, music, or mixture, where mixtures are defined as regions in which speech and other non-speech sounds are present simultaneously and noticeably. The speech segments are then clustered together to provide consistent speaker labels. The speech and mixture segments are converted to text via an ASR system. The resulting words are time-stamped and stored together with other metadata (speaker identity, speech confidence score) in an XML file, so that target segments can be identified and accessed rapidly. In this paper, we analyze the performance at each stage of this audio indexing system and compare it with the performance of the corresponding offline modules.
format	Web Resource
fullrecord	<record><control><sourceid>epfl_F1K</sourceid><recordid>TN_cdi_epfl_infoscience_oai_infoscience_tind_io_83069</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>oai_infoscience_tind_io_83069</sourcerecordid><originalsourceid>FETCH-epfl_infoscience_oai_infoscience_tind_io_830693</originalsourceid><addsrcrecordid>eNrjZJB3zFPwz8vJzEtVcCxNycxX8MxLSa3IzEtXCK4sLknN5WFgTUvMKU7lhdLcDKZuriHOHrqpBWk58Zl5afnFyZmpecmp8fmJmSj8ksy8lPjM_HgLYwMzS2Ny9QEA_Eg3JQ</addsrcrecordid><sourcetype>Institutional Repository</sourcetype><iscdi>true</iscdi><recordtype>web_resource</recordtype></control><display><type>web_resource</type><title>An Online Audio Indexing System</title><source>Infoscience: EPF Lausanne</source><creator>Ajmera, Jitendra ; McCowan, Iain A ; Bourlard, Hervé</creator><creatorcontrib>Ajmera, Jitendra ; McCowan, Iain A ; Bourlard, Hervé</creatorcontrib><description>This paper presents overview of an online audio indexing system, which creates a searchable index of speech content embedded in digitized audio files. This system is based on our recently proposed offline audio segmentation techniques. As the data arrives continuously, the system first finds boundaries of the acoustically homogenous segments. Next, each of these segments is classified as speech, music or {\it mixture} classes, where mixtures are defined as regions where speech and other non-speech sounds are present simultaneously and noticeably. The speech segments are then clustered together to provide consistent speaker labels. The speech and mixture segments are converted to text via an ASR system. The resulting words are time-stamped together with other metadata information (speaker identity, speech confidence score) in an XML file to rapidly identify and access target segments. 
In this paper, we analyze the performance at each stage of this audio indexing system and also compare it with the performance of the corresponding offline modules.</description><language>eng</language><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,780,27860</link.rule.ids><linktorsrc>$$Uhttp://infoscience.epfl.ch/record/83069$$EView_record_in_EPF_Lausanne$$FView_record_in_$$GEPF_Lausanne$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Ajmera, Jitendra</creatorcontrib><creatorcontrib>McCowan, Iain A</creatorcontrib><creatorcontrib>Bourlard, Hervé</creatorcontrib><title>An Online Audio Indexing System</title><description>This paper presents overview of an online audio indexing system, which creates a searchable index of speech content embedded in digitized audio files. This system is based on our recently proposed offline audio segmentation techniques. As the data arrives continuously, the system first finds boundaries of the acoustically homogenous segments. Next, each of these segments is classified as speech, music or {\it mixture} classes, where mixtures are defined as regions where speech and other non-speech sounds are present simultaneously and noticeably. The speech segments are then clustered together to provide consistent speaker labels. The speech and mixture segments are converted to text via an ASR system. The resulting words are time-stamped together with other metadata information (speaker identity, speech confidence score) in an XML file to rapidly identify and access target segments. 
In this paper, we analyze the performance at each stage of this audio indexing system and also compare it with the performance of the corresponding offline modules.</description><fulltext>true</fulltext><rsrctype>web_resource</rsrctype><recordtype>web_resource</recordtype><sourceid>F1K</sourceid><recordid>eNrjZJB3zFPwz8vJzEtVcCxNycxX8MxLSa3IzEtXCK4sLknN5WFgTUvMKU7lhdLcDKZuriHOHrqpBWk58Zl5afnFyZmpecmp8fmJmSj8ksy8lPjM_HgLYwMzS2Ny9QEA_Eg3JQ</recordid><creator>Ajmera, Jitendra</creator><creator>McCowan, Iain A</creator><creator>Bourlard, Hervé</creator><scope>F1K</scope></search><sort><title>An Online Audio Indexing System</title><author>Ajmera, Jitendra ; McCowan, Iain A ; Bourlard, Hervé</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epfl_infoscience_oai_infoscience_tind_io_830693</frbrgroupid><rsrctype>web_resources</rsrctype><prefilter>web_resources</prefilter><language>eng</language><toplevel>online_resources</toplevel><creatorcontrib>Ajmera, Jitendra</creatorcontrib><creatorcontrib>McCowan, Iain A</creatorcontrib><creatorcontrib>Bourlard, Hervé</creatorcontrib><collection>Infoscience: EPF Lausanne</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ajmera, Jitendra</au><au>McCowan, Iain A</au><au>Bourlard, Hervé</au><format>book</format><genre>unknown</genre><ristype>GEN</ristype><btitle>An Online Audio Indexing System</btitle><abstract>This paper presents overview of an online audio indexing system, which creates a searchable index of speech content embedded in digitized audio files. This system is based on our recently proposed offline audio segmentation techniques. As the data arrives continuously, the system first finds boundaries of the acoustically homogenous segments. 
Next, each of these segments is classified as speech, music or {\it mixture} classes, where mixtures are defined as regions where speech and other non-speech sounds are present simultaneously and noticeably. The speech segments are then clustered together to provide consistent speaker labels. The speech and mixture segments are converted to text via an ASR system. The resulting words are time-stamped together with other metadata information (speaker identity, speech confidence score) in an XML file to rapidly identify and access target segments. In this paper, we analyze the performance at each stage of this audio indexing system and also compare it with the performance of the corresponding offline modules.</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier
ispartof
issn
language	eng
recordid	cdi_epfl_infoscience_oai_infoscience_tind_io_83069
source	Infoscience: EPF Lausanne
title	An Online Audio Indexing System
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T17%3A26%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epfl_F1K&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=unknown&rft.btitle=An%20Online%20Audio%20Indexing%20System&rft.au=Ajmera,%20Jitendra&rft_id=info:doi/&rft_dat=%3Cepfl_F1K%3Eoai_infoscience_tind_io_83069%3C/epfl_F1K%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
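
The description above says that the recognized words are time-stamped and written to an XML file together with speaker identity and a speech confidence score. The record does not give the file's schema, so the following is only a minimal sketch of what such an index might look like and how it could be searched. The element and attribute names (`audio_index`, `segment`, `word`, `class`, `speaker`, `time`, `confidence`) and the sample data are assumptions made for illustration, not the authors' format.

```python
import xml.etree.ElementTree as ET

# Hypothetical output of the segmentation, classification, clustering and ASR
# stages. Field names and values are illustrative assumptions only.
segments = [
    {"start": 0.0, "end": 4.2, "cls": "speech", "speaker": "spk1",
     "words": [("hello", 0.5, 0.92), ("world", 1.1, 0.88)]},
    {"start": 4.2, "end": 9.0, "cls": "music", "speaker": None, "words": []},
    {"start": 9.0, "end": 12.5, "cls": "mixture", "speaker": "spk2",
     "words": [("world", 9.8, 0.61)]},
]


def build_index(segments):
    """Build an XML tree with one <segment> per segment and one <word> per word."""
    root = ET.Element("audio_index")
    for seg in segments:
        attrs = {
            "start": f"{seg['start']:.2f}",
            "end": f"{seg['end']:.2f}",
            "class": seg["cls"],
        }
        if seg["speaker"]:  # music segments carry no speaker label here
            attrs["speaker"] = seg["speaker"]
        seg_el = ET.SubElement(root, "segment", attrs)
        for text, time, conf in seg["words"]:
            word_el = ET.SubElement(
                seg_el, "word",
                {"time": f"{time:.2f}", "confidence": f"{conf:.2f}"},
            )
            word_el.text = text
    return root


def find_word(root, query):
    """Return (time, speaker, class) for every occurrence of the query word."""
    hits = []
    for seg_el in root.iter("segment"):
        for word_el in seg_el.iter("word"):
            if word_el.text == query:
                hits.append((float(word_el.get("time")),
                             seg_el.get("speaker"),
                             seg_el.get("class")))
    return hits


index = build_index(segments)
xml_text = ET.tostring(index, encoding="unicode")  # serialized index file content
hits = find_word(index, "world")
```

Keeping the time stamp on each word and the speaker and class on the enclosing segment means a query can jump straight to the matching position in the audio while still reporting who spoke and in what acoustic condition.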