Automatic Transcription of Polyphonic Music Based on the Constant-Q Bispectral Analysis

In the area of music information retrieval (MIR), automatic music transcription is considered one of the most challenging tasks, for which many different techniques have been proposed. This paper presents a new method for polyphonic music transcription: a system that aims at estimating pitch, onset...

Bibliographic Details
Published in: IEEE transactions on audio, speech, and language processing, 2011-08, Vol.19 (6), p.1610-1630
Main authors: Argenti, F; Nesi, P; Pantaleo, G
Format: Article
Language: eng
container_end_page 1630
container_issue 6
container_start_page 1610
container_title IEEE transactions on audio, speech, and language processing
container_volume 19
creator Argenti, F
Nesi, P
Pantaleo, G
description In the area of music information retrieval (MIR), automatic music transcription is considered one of the most challenging tasks, for which many different techniques have been proposed. This paper presents a new method for polyphonic music transcription: a system that aims at estimating pitch, onset times, durations, and intensity of concurrent sounds in audio recordings, played by one or more instruments. Pitch estimation is carried out by means of a front-end that jointly uses a constant-Q and a bispectral analysis of the input audio signal; subsequently, the processed signal is correlated with a fixed 2-D harmonic pattern. Onset and duration detection procedures are based on the combination of the constant-Q bispectral analysis with information from the signal spectrogram. The detection process is agnostic: it does not need to take into account musicological and instrumental models or other a priori knowledge. The system has been validated against the standard Real-World Computing (RWC)-Classical Audio Database. The proposed method has demonstrated good performance in the multiple-F0 tracking task, especially for piano-only automatic transcription at MIREX 2009.
doi_str_mv 10.1109/TASL.2010.2093894
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_ieee_primary_5640655</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>5640655</ieee_id><sourcerecordid>2556605991</sourcerecordid><originalsourceid>FETCH-LOGICAL-c421t-c177b65706ed8a98a8de93eb1b9f4d4b68d8067586b4a3556327a2b735f7630e3</originalsourceid><addsrcrecordid>eNpdkE1rGzEQhpeSQpM0PyD0shQKuayj74-jbZK24NCWOuQotFotUVivthrtwf8-MjY-9KLR8D4zDE9V3WK0wBjp--3y72ZBUGkJ0lRp9qG6xJyrRmrCLs5_LD5VVwBvCDEqGL6sXpZzjjubg6u3yY7gUphyiGMd-_p3HPbTaxxL9jRDeVcWfFeXML_6eh1HyHbMzZ96FWDyLic71MvRDnsI8Ln62NsB_M2pXlfPjw_b9Y9m8-v7z_Vy0zhGcG4clrIVXCLhO2W1sqrzmvoWt7pnHWuF6hQSkivRMks5F5RIS1pJeS8FRZ5eV3fHvVOK_2YP2ewCOD8MdvRxBoOFxEQjhVRBv_6HvsU5lXvBaMyk0ELQAuEj5FIESL43Uwo7m_YGI3MwbQ6mzcG0OZkuM99Oiy04O_TFowtwHiSMYYo0KdyXIxe89-eYC4YE5_Qd2UOF3Q</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>914769663</pqid></control><display><type>article</type><title>Automatic Transcription of Polyphonic Music Based on the Constant-Q Bispectral Analysis</title><source>IEEE Electronic Library (IEL)</source><creator>Argenti, F ; Nesi, P ; Pantaleo, G</creator><creatorcontrib>Argenti, F ; Nesi, P ; Pantaleo, G</creatorcontrib><description>In the area of music information retrieval (MIR), automatic music transcription is considered one of the most challenging tasks, for which many different techniques have been proposed. This paper presents a new method for polyphonic music transcription: a system that aims at estimating pitch, onset times, durations, and intensity of concurrent sounds in audio recordings, played by one or more instruments. Pitch estimation is carried out by means of a front-end that jointly uses a constant-Q and a bispectral analysis of the input audio signal; subsequently, the processed signal is correlated with a fixed 2-D harmonic pattern. 
Onsets and durations detection procedures are based on the combination of the constant-Q bispectral analysis with information from the signal spectrogram. The detection process is agnostic and it does not need to take into account musicological and instrumental models or other a priori knowledge. The system has been validated against the standard Real-World Computing (RWC)-Classical Audio Database. The proposed method has demonstrated good performances in the multiple F 0 tracking task, especially for piano-only automatic transcription at MIREX 2009.</description><identifier>ISSN: 1558-7916</identifier><identifier>ISSN: 2329-9290</identifier><identifier>EISSN: 1558-7924</identifier><identifier>EISSN: 2329-9304</identifier><identifier>DOI: 10.1109/TASL.2010.2093894</identifier><identifier>CODEN: ITASD8</identifier><language>eng</language><publisher>Piscataway, NJ: IEEE</publisher><subject>Applied sciences ; Audio signals ; Audio signals processing ; Bispectral analysis ; bispectrum ; constant-Q analysis ; Detection, estimation, filtering, equalization, prediction ; Estimating ; Estimation ; Exact sciences and technology ; Filter bank ; Fourier transforms ; Harmonic analysis ; Hidden Markov models ; higher order spectra ; Information retrieval ; Information theory ; Information, signal and communications theory ; Mathematical models ; Miscellaneous ; Music ; music information retrieval (MIR) ; polyphonic music transcription ; Psychoacoustic models ; Recording ; Signal and communications theory ; Signal processing ; Signal representation. Spectral analysis ; Signal, noise ; Spectral analysis ; Studies ; Tasks ; Telecommunications and information theory</subject><ispartof>IEEE transactions on audio, speech, and language processing, 2011-08, Vol.19 (6), p.1610-1630</ispartof><rights>2015 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) Aug 2011</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c421t-c177b65706ed8a98a8de93eb1b9f4d4b68d8067586b4a3556327a2b735f7630e3</citedby><cites>FETCH-LOGICAL-c421t-c177b65706ed8a98a8de93eb1b9f4d4b68d8067586b4a3556327a2b735f7630e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/5640655$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27903,27904,54737</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/5640655$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=24413092$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Argenti, F</creatorcontrib><creatorcontrib>Nesi, P</creatorcontrib><creatorcontrib>Pantaleo, G</creatorcontrib><title>Automatic Transcription of Polyphonic Music Based on the Constant-Q Bispectral Analysis</title><title>IEEE transactions on audio, speech, and language processing</title><addtitle>TASL</addtitle><description>In the area of music information retrieval (MIR), automatic music transcription is considered one of the most challenging tasks, for which many different techniques have been proposed. This paper presents a new method for polyphonic music transcription: a system that aims at estimating pitch, onset times, durations, and intensity of concurrent sounds in audio recordings, played by one or more instruments. Pitch estimation is carried out by means of a front-end that jointly uses a constant-Q and a bispectral analysis of the input audio signal; subsequently, the processed signal is correlated with a fixed 2-D harmonic pattern. 
Onsets and durations detection procedures are based on the combination of the constant-Q bispectral analysis with information from the signal spectrogram. The detection process is agnostic and it does not need to take into account musicological and instrumental models or other a priori knowledge. The system has been validated against the standard Real-World Computing (RWC)-Classical Audio Database. The proposed method has demonstrated good performances in the multiple F 0 tracking task, especially for piano-only automatic transcription at MIREX 2009.</description><subject>Applied sciences</subject><subject>Audio signals</subject><subject>Audio signals processing</subject><subject>Bispectral analysis</subject><subject>bispectrum</subject><subject>constant-Q analysis</subject><subject>Detection, estimation, filtering, equalization, prediction</subject><subject>Estimating</subject><subject>Estimation</subject><subject>Exact sciences and technology</subject><subject>Filter bank</subject><subject>Fourier transforms</subject><subject>Harmonic analysis</subject><subject>Hidden Markov models</subject><subject>higher order spectra</subject><subject>Information retrieval</subject><subject>Information theory</subject><subject>Information, signal and communications theory</subject><subject>Mathematical models</subject><subject>Miscellaneous</subject><subject>Music</subject><subject>music information retrieval (MIR)</subject><subject>polyphonic music transcription</subject><subject>Psychoacoustic models</subject><subject>Recording</subject><subject>Signal and communications theory</subject><subject>Signal processing</subject><subject>Signal representation. 
Spectral analysis</subject><subject>Signal, noise</subject><subject>Spectral analysis</subject><subject>Studies</subject><subject>Tasks</subject><subject>Telecommunications and information theory</subject><issn>1558-7916</issn><issn>2329-9290</issn><issn>1558-7924</issn><issn>2329-9304</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpdkE1rGzEQhpeSQpM0PyD0shQKuayj74-jbZK24NCWOuQotFotUVivthrtwf8-MjY-9KLR8D4zDE9V3WK0wBjp--3y72ZBUGkJ0lRp9qG6xJyrRmrCLs5_LD5VVwBvCDEqGL6sXpZzjjubg6u3yY7gUphyiGMd-_p3HPbTaxxL9jRDeVcWfFeXML_6eh1HyHbMzZ96FWDyLic71MvRDnsI8Ln62NsB_M2pXlfPjw_b9Y9m8-v7z_Vy0zhGcG4clrIVXCLhO2W1sqrzmvoWt7pnHWuF6hQSkivRMks5F5RIS1pJeS8FRZ5eV3fHvVOK_2YP2ewCOD8MdvRxBoOFxEQjhVRBv_6HvsU5lXvBaMyk0ELQAuEj5FIESL43Uwo7m_YGI3MwbQ6mzcG0OZkuM99Oiy04O_TFowtwHiSMYYo0KdyXIxe89-eYC4YE5_Qd2UOF3Q</recordid><startdate>20110801</startdate><enddate>20110801</enddate><creator>Argenti, F</creator><creator>Nesi, P</creator><creator>Pantaleo, G</creator><general>IEEE</general><general>Institute of Electrical and Electronics Engineers</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20110801</creationdate><title>Automatic Transcription of Polyphonic Music Based on the Constant-Q Bispectral Analysis</title><author>Argenti, F ; Nesi, P ; Pantaleo, G</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c421t-c177b65706ed8a98a8de93eb1b9f4d4b68d8067586b4a3556327a2b735f7630e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>Applied sciences</topic><topic>Audio signals</topic><topic>Audio signals processing</topic><topic>Bispectral analysis</topic><topic>bispectrum</topic><topic>constant-Q analysis</topic><topic>Detection, estimation, filtering, equalization, prediction</topic><topic>Estimating</topic><topic>Estimation</topic><topic>Exact sciences and technology</topic><topic>Filter bank</topic><topic>Fourier transforms</topic><topic>Harmonic analysis</topic><topic>Hidden Markov models</topic><topic>higher order spectra</topic><topic>Information retrieval</topic><topic>Information theory</topic><topic>Information, signal and communications theory</topic><topic>Mathematical models</topic><topic>Miscellaneous</topic><topic>Music</topic><topic>music information retrieval (MIR)</topic><topic>polyphonic music transcription</topic><topic>Psychoacoustic models</topic><topic>Recording</topic><topic>Signal and communications theory</topic><topic>Signal processing</topic><topic>Signal representation. 
Spectral analysis</topic><topic>Signal, noise</topic><topic>Spectral analysis</topic><topic>Studies</topic><topic>Tasks</topic><topic>Telecommunications and information theory</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Argenti, F</creatorcontrib><creatorcontrib>Nesi, P</creatorcontrib><creatorcontrib>Pantaleo, G</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on audio, speech, and language processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Argenti, F</au><au>Nesi, P</au><au>Pantaleo, G</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Automatic Transcription of Polyphonic Music Based on the Constant-Q Bispectral Analysis</atitle><jtitle>IEEE transactions on audio, speech, and language processing</jtitle><stitle>TASL</stitle><date>2011-08-01</date><risdate>2011</risdate><volume>19</volume><issue>6</issue><spage>1610</spage><epage>1630</epage><pages>1610-1630</pages><issn>1558-7916</issn><issn>2329-9290</issn><eissn>1558-7924</eissn><eissn>2329-9304</eissn><coden>ITASD8</coden><abstract>In the area of music information retrieval (MIR), automatic music transcription is considered one of the most challenging 
tasks, for which many different techniques have been proposed. This paper presents a new method for polyphonic music transcription: a system that aims at estimating pitch, onset times, durations, and intensity of concurrent sounds in audio recordings, played by one or more instruments. Pitch estimation is carried out by means of a front-end that jointly uses a constant-Q and a bispectral analysis of the input audio signal; subsequently, the processed signal is correlated with a fixed 2-D harmonic pattern. Onsets and durations detection procedures are based on the combination of the constant-Q bispectral analysis with information from the signal spectrogram. The detection process is agnostic and it does not need to take into account musicological and instrumental models or other a priori knowledge. The system has been validated against the standard Real-World Computing (RWC)-Classical Audio Database. The proposed method has demonstrated good performances in the multiple F 0 tracking task, especially for piano-only automatic transcription at MIREX 2009.</abstract><cop>Piscataway, NJ</cop><pub>IEEE</pub><doi>10.1109/TASL.2010.2093894</doi><tpages>21</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1558-7916
ispartof IEEE transactions on audio, speech, and language processing, 2011-08, Vol.19 (6), p.1610-1630
issn 1558-7916
2329-9290
1558-7924
2329-9304
language eng
recordid cdi_ieee_primary_5640655
source IEEE Electronic Library (IEL)
subjects Applied sciences
Audio signals
Audio signals processing
Bispectral analysis
bispectrum
constant-Q analysis
Detection, estimation, filtering, equalization, prediction
Estimating
Estimation
Exact sciences and technology
Filter bank
Fourier transforms
Harmonic analysis
Hidden Markov models
higher order spectra
Information retrieval
Information theory
Information, signal and communications theory
Mathematical models
Miscellaneous
Music
music information retrieval (MIR)
polyphonic music transcription
Psychoacoustic models
Recording
Signal and communications theory
Signal processing
Signal representation. Spectral analysis
Signal, noise
Spectral analysis
Studies
Tasks
Telecommunications and information theory
title Automatic Transcription of Polyphonic Music Based on the Constant-Q Bispectral Analysis
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T04%3A10%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Automatic%20Transcription%20of%20Polyphonic%20Music%20Based%20on%20the%20Constant-Q%20Bispectral%20Analysis&rft.jtitle=IEEE%20transactions%20on%20audio,%20speech,%20and%20language%20processing&rft.au=Argenti,%20F&rft.date=2011-08-01&rft.volume=19&rft.issue=6&rft.spage=1610&rft.epage=1630&rft.pages=1610-1630&rft.issn=1558-7916&rft.eissn=1558-7924&rft.coden=ITASD8&rft_id=info:doi/10.1109/TASL.2010.2093894&rft_dat=%3Cproquest_RIE%3E2556605991%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=914769663&rft_id=info:pmid/&rft_ieee_id=5640655&rfr_iscdi=true