A Novel Speech Feature Fusion Algorithm for Text-Independent Speaker Recognition
A novel speech feature fusion algorithm with independent vector analysis (IVA) and parallel convolutional neural network (PCNN) is proposed for text-independent speaker recognition. Firstly, some different feature types, such as the time domain (TD) features and the frequency domain (FD) features, c...
Main authors: | Ma, Biao; Xu, Chengben; Zhang, Ye |
---|---|
Format: | Article |
Language: | eng |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Ma, Biao Xu, Chengben Zhang, Ye |
description | A novel speech feature fusion algorithm based on independent vector analysis
(IVA) and a parallel convolutional neural network (PCNN) is proposed for
text-independent speaker recognition. First, different feature types,
such as time-domain (TD) and frequency-domain (FD) features,
are extracted from a speaker's speech, and the TD and FD features are
modeled as linear mixtures of independent feature components (IFCs) with
an unknown mixing system. To estimate the IFCs, the TD and FD features of
the speaker's speech are concatenated to build a TD feature matrix and an FD
feature matrix, respectively. A feature tensor of the speaker's speech is then
obtained by stacking the TD and FD feature matrices in parallel. To enhance the
dependence across different feature types and remove redundancy within the same
feature type, IVA is used to estimate the
IFC matrices of the TD and FD features from the feature tensor. The IFC matrices
are fed to the PCNN to extract deep features from the TD
and FD branches, respectively. The deep features are then integrated to obtain
the fusion feature of the speaker's speech. Finally, the fusion feature
is used as the input of a deep convolutional neural
network (DCNN) classifier for speaker recognition. Experimental results
demonstrate the effectiveness of the proposed speaker recognition
system. |
doi_str_mv | 10.48550/arxiv.2212.00329 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2212_00329</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2212_00329</sourcerecordid><originalsourceid>FETCH-LOGICAL-a679-c14728c33a4630bc3394e703933615a12332640e49e7996ea1a41d21bf37c1233</originalsourceid><addsrcrecordid>eNotj0FOwzAURL1hgQoHYFVfIMH2d-x6GVUEKlWAIPvIdX9aq2kcuW5Vbk9S2MyMNJqRHiFPnOVyURTs2carv-RCcJEzBsLck8-SvocLdvR7QHR7WqFN54i0Op986GnZ7UL0aX-kbYi0xmvKVv0WBxylT9PIHjDSL3Rh1_s0Th7IXWu7Ez7--4zU1Uu9fMvWH6-rZbnOrNImc1xqsXAAVipgmzEYiZqBAVC8sFwACCUZSoPaGIWWW8m3gm9a0G5qZ2T-d3tDaobojzb-NBNac0ODX4D4R6g</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>A Novel Speech Feature Fusion Algorithm for Text-Independent Speaker Recognition</title><source>arXiv.org</source><creator>Ma, Biao ; Xu, Chengben ; Zhang, Ye</creator><creatorcontrib>Ma, Biao ; Xu, Chengben ; Zhang, Ye</creatorcontrib><description>A novel speech feature fusion algorithm with independent vector analysis
(IVA) and parallel convolutional neural network (PCNN) is proposed for
text-independent speaker recognition. Firstly, some different feature types,
such as the time domain (TD) features and the frequency domain (FD) features,
can be extracted from a speaker's speech, and the TD and the FD features can be
considered as the linear mixtures of independent feature components (IFCs) with
an unknown mixing system. To estimate the IFCs, the TD and the FD features of
the speaker's speech are concatenated to build the TD and the FD feature
matrix, respectively. Then, a feature tensor of the speaker's speech is
obtained by paralleling the TD and the FD feature matrix. To enhance the
dependence on different feature types and remove the redundancies of the same
feature type, the independent vector analysis (IVA) can be used to estimate the
IFC matrices of TD and FD features with the feature tensor. The IFC matrices
are utilized as the input of the PCNN to extract the deep features of the TD
and FD features, respectively. The deep features can be integrated to obtain
the fusion feature of the speaker's speech. Finally, the fusion feature of the
speaker's speech is employed as the input of a deep convolutional neural
network (DCNN) classifier for speaker recognition. The experimental results
show the effectiveness and performances of the proposed speaker recognition
system.</description><identifier>DOI: 10.48550/arxiv.2212.00329</identifier><language>eng</language><creationdate>2022-12</creationdate><rights>http://creativecommons.org/licenses/by-nc-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2212.00329$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2212.00329$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Ma, Biao</creatorcontrib><creatorcontrib>Xu, Chengben</creatorcontrib><creatorcontrib>Zhang, Ye</creatorcontrib><title>A Novel Speech Feature Fusion Algorithm for Text-Independent Speaker Recognition</title><description>A novel speech feature fusion algorithm with independent vector analysis
(IVA) and parallel convolutional neural network (PCNN) is proposed for
text-independent speaker recognition. Firstly, some different feature types,
such as the time domain (TD) features and the frequency domain (FD) features,
can be extracted from a speaker's speech, and the TD and the FD features can be
considered as the linear mixtures of independent feature components (IFCs) with
an unknown mixing system. To estimate the IFCs, the TD and the FD features of
the speaker's speech are concatenated to build the TD and the FD feature
matrix, respectively. Then, a feature tensor of the speaker's speech is
obtained by paralleling the TD and the FD feature matrix. To enhance the
dependence on different feature types and remove the redundancies of the same
feature type, the independent vector analysis (IVA) can be used to estimate the
IFC matrices of TD and FD features with the feature tensor. The IFC matrices
are utilized as the input of the PCNN to extract the deep features of the TD
and FD features, respectively. The deep features can be integrated to obtain
the fusion feature of the speaker's speech. Finally, the fusion feature of the
speaker's speech is employed as the input of a deep convolutional neural
network (DCNN) classifier for speaker recognition. The experimental results
show the effectiveness and performances of the proposed speaker recognition
system.</description><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj0FOwzAURL1hgQoHYFVfIMH2d-x6GVUEKlWAIPvIdX9aq2kcuW5Vbk9S2MyMNJqRHiFPnOVyURTs2carv-RCcJEzBsLck8-SvocLdvR7QHR7WqFN54i0Op986GnZ7UL0aX-kbYi0xmvKVv0WBxylT9PIHjDSL3Rh1_s0Th7IXWu7Ez7--4zU1Uu9fMvWH6-rZbnOrNImc1xqsXAAVipgmzEYiZqBAVC8sFwACCUZSoPaGIWWW8m3gm9a0G5qZ2T-d3tDaobojzb-NBNac0ODX4D4R6g</recordid><startdate>20221201</startdate><enddate>20221201</enddate><creator>Ma, Biao</creator><creator>Xu, Chengben</creator><creator>Zhang, Ye</creator><scope>GOX</scope></search><sort><creationdate>20221201</creationdate><title>A Novel Speech Feature Fusion Algorithm for Text-Independent Speaker Recognition</title><author>Ma, Biao ; Xu, Chengben ; Zhang, Ye</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a679-c14728c33a4630bc3394e703933615a12332640e49e7996ea1a41d21bf37c1233</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><toplevel>online_resources</toplevel><creatorcontrib>Ma, Biao</creatorcontrib><creatorcontrib>Xu, Chengben</creatorcontrib><creatorcontrib>Zhang, Ye</creatorcontrib><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Ma, Biao</au><au>Xu, Chengben</au><au>Zhang, Ye</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A Novel Speech Feature Fusion Algorithm for Text-Independent Speaker Recognition</atitle><date>2022-12-01</date><risdate>2022</risdate><abstract>A novel speech feature fusion algorithm with independent vector analysis
(IVA) and parallel convolutional neural network (PCNN) is proposed for
text-independent speaker recognition. Firstly, some different feature types,
such as the time domain (TD) features and the frequency domain (FD) features,
can be extracted from a speaker's speech, and the TD and the FD features can be
considered as the linear mixtures of independent feature components (IFCs) with
an unknown mixing system. To estimate the IFCs, the TD and the FD features of
the speaker's speech are concatenated to build the TD and the FD feature
matrix, respectively. Then, a feature tensor of the speaker's speech is
obtained by paralleling the TD and the FD feature matrix. To enhance the
dependence on different feature types and remove the redundancies of the same
feature type, the independent vector analysis (IVA) can be used to estimate the
IFC matrices of TD and FD features with the feature tensor. The IFC matrices
are utilized as the input of the PCNN to extract the deep features of the TD
and FD features, respectively. The deep features can be integrated to obtain
the fusion feature of the speaker's speech. Finally, the fusion feature of the
speaker's speech is employed as the input of a deep convolutional neural
network (DCNN) classifier for speaker recognition. The experimental results
show the effectiveness and performances of the proposed speaker recognition
system.</abstract><doi>10.48550/arxiv.2212.00329</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2212.00329 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2212_00329 |
source | arXiv.org |
title | A Novel Speech Feature Fusion Algorithm for Text-Independent Speaker Recognition |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T12%3A29%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Novel%20Speech%20Feature%20Fusion%20Algorithm%20for%20Text-Independent%20Speaker%20Recognition&rft.au=Ma,%20Biao&rft.date=2022-12-01&rft_id=info:doi/10.48550/arxiv.2212.00329&rft_dat=%3Carxiv_GOX%3E2212_00329%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
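
The feature-tensor step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `build_feature_tensor`, the feature dimension (13), the frame count (50), and the random placeholder features are all hypothetical, and the sketch assumes the TD and FD feature matrices have the same shape so that they can be stacked. The IVA, PCNN, and DCNN stages are not shown.

```python
import numpy as np


def build_feature_tensor(td_frames, fd_frames):
    """Stack TD and FD feature matrices into one tensor.

    Each argument is a list of per-frame feature vectors (1-D arrays).
    The vectors are concatenated column-wise into a feature matrix of
    shape (n_features, n_frames); the two matrices are then placed in
    parallel along a new leading axis, giving (2, n_features, n_frames).
    """
    td = np.stack(td_frames, axis=1)  # TD feature matrix
    fd = np.stack(fd_frames, axis=1)  # FD feature matrix
    if td.shape != fd.shape:
        # Assumption of this sketch: both feature types share one shape.
        raise ValueError("TD and FD feature matrices must have the same shape")
    return np.stack([td, fd], axis=0)


# Placeholder data standing in for real TD/FD features (hypothetical sizes).
rng = np.random.default_rng(0)
n_frames, n_features = 50, 13
td_frames = [rng.standard_normal(n_features) for _ in range(n_frames)]
fd_frames = [rng.standard_normal(n_features) for _ in range(n_frames)]

tensor = build_feature_tensor(td_frames, fd_frames)
print(tensor.shape)  # (2, 13, 50)
```

In this layout, `tensor[0]` is the TD feature matrix and `tensor[1]` is the FD feature matrix, which is the form a joint decomposition such as IVA would take as input.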