Robust detection and classification of objects in audio using limited training data

A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) compris...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
1. Verfasser:	WARK TIMOTHY JOHN
Format:	Patent
Sprache:	eng
Schlagworte:	ACOUSTICS MUSICAL INSTRUMENTS PHYSICS SPEECH ANALYSIS OR SYNTHESIS SPEECH OR AUDIO CODING OR DECODING SPEECH OR VOICE PROCESSING SPEECH RECOGNITION
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	WARK TIMOTHY JOHN
description	A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) comprising a plurality of the audio samples (x(n)). The homogeneous audio segment is next divided (206) into a plurality of audio clips (711-714), with each audio clip being associated with a plurality of the frames (701-704). The method (200) then extracts (208) at least one frame feature for each clip (711-714). A clip feature vector (f) is next extracted from frame features of frames associated with the audio clip (711-714). Finally the segment is classified based on a continuous function defining the distribution of the clip feature vectors (f).
format	Patent
fullrecord	<record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_US2003231775A1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>US2003231775A1</sourcerecordid><originalsourceid>FETCH-epo_espacenet_US2003231775A13</originalsourceid><addsrcrecordid>eNqNikEKwjAQAHPxIOofFjwLbYP0LKJ4tnou22RTVmJS3M3_reIDPA3MzNJ01zwUUfCk5JRzAkweXEQRDuzwq3KAPDzmLsDzUDxnKMJphMhPVvKgL-T0ER4V12YRMAptflyZ7fl0O152NOWeZEJHibS_d01V2cbWbbs_1Pa_6w1NNzlz</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Robust detection and classification of objects in audio using limited training data</title><source>esp@cenet</source><creator>WARK TIMOTHY JOHN</creator><creatorcontrib>WARK TIMOTHY JOHN</creatorcontrib><description>A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) comprising a plurality of the audio samples (x(n)). The homogeneous audio segment is next divided (206) into a plurality of audio clips (711-714), with each audio clip being associated with a plurality of the frames (701-704). The method (200) then extracts (208) at least one frame feature for each clip (711-714). A clip feature vector (f) is next extracted from frame features of frames associated with the audio clip (711-714). Finally the segment is classified based on a continuous function defining the distribution of the clip feature vectors (f).</description><edition>7</edition><language>eng</language><subject>ACOUSTICS ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION</subject><creationdate>2003</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20031218&DB=EPODOC&CC=US&NR=2003231775A1$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,776,881,25543,76293</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20031218&DB=EPODOC&CC=US&NR=2003231775A1$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>WARK TIMOTHY JOHN</creatorcontrib><title>Robust detection and classification of objects in audio using limited training data</title><description>A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) comprising a plurality of the audio samples (x(n)). The homogeneous audio segment is next divided (206) into a plurality of audio clips (711-714), with each audio clip being associated with a plurality of the frames (701-704). The method (200) then extracts (208) at least one frame feature for each clip (711-714). A clip feature vector (f) is next extracted from frame features of frames associated with the audio clip (711-714). Finally the segment is classified based on a continuous function defining the distribution of the clip feature vectors (f).</description><subject>ACOUSTICS</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2003</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNqNikEKwjAQAHPxIOofFjwLbYP0LKJ4tnou22RTVmJS3M3_reIDPA3MzNJ01zwUUfCk5JRzAkweXEQRDuzwq3KAPDzmLsDzUDxnKMJphMhPVvKgL-T0ER4V12YRMAptflyZ7fl0O152NOWeZEJHibS_d01V2cbWbbs_1Pa_6w1NNzlz</recordid><startdate>20031218</startdate><enddate>20031218</enddate><creator>WARK TIMOTHY JOHN</creator><scope>EVB</scope></search><sort><creationdate>20031218</creationdate><title>Robust detection and classification of objects in audio using limited training data</title><author>WARK TIMOTHY JOHN</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_US2003231775A13</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng</language><creationdate>2003</creationdate><topic>ACOUSTICS</topic><topic>MUSICAL INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>WARK TIMOTHY JOHN</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>WARK TIMOTHY JOHN</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Robust detection and classification of objects in audio using limited training data</title><date>2003-12-18</date><risdate>2003</risdate><abstract>A method (200) and apparatus (100) for classifying a homogeneous audio segment are disclosed. The homogeneous audio comprises a sequence of audio samples (x(n)). The method (200) starts by forming a sequence of frames (701-704) along the sequence of audio samples (x(n)), each frame (701-704) comprising a plurality of the audio samples (x(n)). The homogeneous audio segment is next divided (206) into a plurality of audio clips (711-714), with each audio clip being associated with a plurality of the frames (701-704). The method (200) then extracts (208) at least one frame feature for each clip (711-714). A clip feature vector (f) is next extracted from frame features of frames associated with the audio clip (711-714). Finally the segment is classified based on a continuous function defining the distribution of the clip feature vectors (f).</abstract><edition>7</edition><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier
ispartof
issn
language	eng
recordid	cdi_epo_espacenet_US2003231775A1
source	esp@cenet
subjects	ACOUSTICS MUSICAL INSTRUMENTS PHYSICS SPEECH ANALYSIS OR SYNTHESIS SPEECH OR AUDIO CODING OR DECODING SPEECH OR VOICE PROCESSING SPEECH RECOGNITION
title	Robust detection and classification of objects in audio using limited training data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T21%3A39%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=WARK%20TIMOTHY%20JOHN&rft.date=2003-12-18&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EUS2003231775A1%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true