IF-MCA: Importance Factor-Based Multiple Correspondence Analysis for Multimedia Data Analytics

Multimedia concept detection is a challenging topic due to the well-known class imbalance issue, where the data instances are distributed unevenly across different classes. This problem becomes even more prominent when the minority class that contains an extremely small proportion of the data repres...


Bibliographic details
Published in: IEEE transactions on multimedia, 2018-04, Vol. 20 (4), p. 1024-1032
Main authors: Yang, Yimin; Pouyanfar, Samira; Tian, Haiman; Chen, Min; Chen, Shu-Ching; Shyu, Mei-Ling
Format: Article
Language: eng
container_end_page 1032
container_issue 4
container_start_page 1024
container_title IEEE transactions on multimedia
container_volume 20
creator Yang, Yimin
Pouyanfar, Samira
Tian, Haiman
Chen, Min
Chen, Shu-Ching
Shyu, Mei-Ling
description Multimedia concept detection is a challenging topic due to the well-known class imbalance issue, where the data instances are distributed unevenly across different classes. This problem becomes even more prominent when the minority class that contains an extremely small proportion of the data represents the concept of interest as has occurred in many real-world applications such as frauds in banking transactions and goal events in soccer videos. Traditional data mining approaches often have difficulty handling largely skewed data distributions. To address this issue, in this paper, an importance-factor (IF)-based multiple correspondence analysis (MCA) framework is proposed to deal with the imbalanced datasets. Specifically, a hierarchical information gain analysis method, which is inspired by the decision tree algorithm, is presented for critical feature selection and IF assignment. Then, the derived IF is incorporated with the MCA algorithm for effective concept detection and retrieval. The comparison results in video concept detection using the disaster dataset and the soccer dataset demonstrate the effectiveness of the proposed framework.
doi_str_mv 10.1109/TMM.2017.2760623
format Article
fullrecord <record><control><sourceid>crossref_RIE</sourceid><recordid>TN_cdi_ieee_primary_8060571</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8060571</ieee_id><sourcerecordid>10_1109_TMM_2017_2760623</sourcerecordid><originalsourceid>FETCH-LOGICAL-c263t-3b7cab1704ed0051a275ec757376cf4a3ac3325991f82faeeee56bb45b338d6b3</originalsourceid><addsrcrecordid>eNo9kLFOwzAQhi0EEqWwI7HkBVzu7NhO2EpooVIjlrISXRxHCkqayA5D355UqbjlTrrv_4ePsUeEFSKkz4c8XwlAsxJGgxbyii0wjZEDGHM93UoATwXCLbsL4QcAYwVmwb53W55n65do1w29H-loXbQlO_aev1JwVZT_tmMztC7Keu9dGPpj5c7Q-kjtKTQhqns_Q52rGoreaKT5OTY23LObmtrgHi57yb62m0P2wfef77tsvedWaDlyWRpLJRqIXQWgkIRRzhplpNG2jkmSlVKoNMU6ETW5aZQuy1iVUiaVLuWSwdxrfR-Cd3Ux-KYjfyoQirOfYvJTnP0UFz9T5GmONFPbP56ABmVQ_gGukmHB</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>IF-MCA: Importance Factor-Based Multiple Correspondence Analysis for Multimedia Data Analytics</title><source>IEEE Electronic Library (IEL)</source><creator>Yang, Yimin ; Pouyanfar, Samira ; Tian, Haiman ; Chen, Min ; Chen, Shu-Ching ; Shyu, Mei-Ling</creator><creatorcontrib>Yang, Yimin ; Pouyanfar, Samira ; Tian, Haiman ; Chen, Min ; Chen, Shu-Ching ; Shyu, Mei-Ling</creatorcontrib><description>Multimedia concept detection is a challenging topic due to the well-known class imbalance issue, where the data instances are distributed unevenly across different classes. This problem becomes even more prominent when the minority class that contains an extremely small proportion of the data represents the concept of interest as has occurred in many real-world applications such as frauds in banking transactions and goal events in soccer videos. Traditional data mining approaches often have difficulty handling largely skewed data distributions. 
To address this issue, in this paper, an importance-factor (IF)-based multiple correspondence analysis (MCA) framework is proposed to deal with the imbalanced datasets. Specifically, a hierarchical information gain analysis method, which is inspired by the decision tree algorithm, is presented for critical feature selection and IF assignment. Then, the derived IF is incorporated with the MCA algorithm for effective concept detection and retrieval. The comparison results in video concept detection using the disaster dataset and the soccer dataset demonstrate the effectiveness of the proposed framework.</description><identifier>ISSN: 1520-9210</identifier><identifier>EISSN: 1941-0077</identifier><identifier>DOI: 10.1109/TMM.2017.2760623</identifier><identifier>CODEN: ITMUF8</identifier><language>eng</language><publisher>IEEE</publisher><subject>Algorithm design and analysis ; Data mining ; Decision trees ; Feature extraction ; feature selection ; Importance factor ; information gain ; Multimedia communication ; multiple correspondence analysis (MCA) ; Testing ; Training</subject><ispartof>IEEE transactions on multimedia, 2018-04, Vol.20 (4), p.1024-1032</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c263t-3b7cab1704ed0051a275ec757376cf4a3ac3325991f82faeeee56bb45b338d6b3</citedby><cites>FETCH-LOGICAL-c263t-3b7cab1704ed0051a275ec757376cf4a3ac3325991f82faeeee56bb45b338d6b3</cites><orcidid>0000-0003-0902-0844 ; 0000-0002-8363-8514 ; 
0000-0001-9209-390X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8060571$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,777,781,793,27905,27906,54739</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/8060571$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Yang, Yimin</creatorcontrib><creatorcontrib>Pouyanfar, Samira</creatorcontrib><creatorcontrib>Tian, Haiman</creatorcontrib><creatorcontrib>Chen, Min</creatorcontrib><creatorcontrib>Chen, Shu-Ching</creatorcontrib><creatorcontrib>Shyu, Mei-Ling</creatorcontrib><title>IF-MCA: Importance Factor-Based Multiple Correspondence Analysis for Multimedia Data Analytics</title><title>IEEE transactions on multimedia</title><addtitle>TMM</addtitle><description>Multimedia concept detection is a challenging topic due to the well-known class imbalance issue, where the data instances are distributed unevenly across different classes. This problem becomes even more prominent when the minority class that contains an extremely small proportion of the data represents the concept of interest as has occurred in many real-world applications such as frauds in banking transactions and goal events in soccer videos. Traditional data mining approaches often have difficulty handling largely skewed data distributions. To address this issue, in this paper, an importance-factor (IF)-based multiple correspondence analysis (MCA) framework is proposed to deal with the imbalanced datasets. Specifically, a hierarchical information gain analysis method, which is inspired by the decision tree algorithm, is presented for critical feature selection and IF assignment. Then, the derived IF is incorporated with the MCA algorithm for effective concept detection and retrieval. 
The comparison results in video concept detection using the disaster dataset and the soccer dataset demonstrate the effectiveness of the proposed framework.</description><subject>Algorithm design and analysis</subject><subject>Data mining</subject><subject>Decision trees</subject><subject>Feature extraction</subject><subject>feature selection</subject><subject>Importance factor</subject><subject>information gain</subject><subject>Multimedia communication</subject><subject>multiple correspondence analysis (MCA)</subject><subject>Testing</subject><subject>Training</subject><issn>1520-9210</issn><issn>1941-0077</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNo9kLFOwzAQhi0EEqWwI7HkBVzu7NhO2EpooVIjlrISXRxHCkqayA5D355UqbjlTrrv_4ePsUeEFSKkz4c8XwlAsxJGgxbyii0wjZEDGHM93UoATwXCLbsL4QcAYwVmwb53W55n65do1w29H-loXbQlO_aev1JwVZT_tmMztC7Keu9dGPpj5c7Q-kjtKTQhqns_Q52rGoreaKT5OTY23LObmtrgHi57yb62m0P2wfef77tsvedWaDlyWRpLJRqIXQWgkIRRzhplpNG2jkmSlVKoNMU6ETW5aZQuy1iVUiaVLuWSwdxrfR-Cd3Ux-KYjfyoQirOfYvJTnP0UFz9T5GmONFPbP56ABmVQ_gGukmHB</recordid><startdate>201804</startdate><enddate>201804</enddate><creator>Yang, Yimin</creator><creator>Pouyanfar, Samira</creator><creator>Tian, Haiman</creator><creator>Chen, Min</creator><creator>Chen, Shu-Ching</creator><creator>Shyu, Mei-Ling</creator><general>IEEE</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0003-0902-0844</orcidid><orcidid>https://orcid.org/0000-0002-8363-8514</orcidid><orcidid>https://orcid.org/0000-0001-9209-390X</orcidid></search><sort><creationdate>201804</creationdate><title>IF-MCA: Importance Factor-Based Multiple Correspondence Analysis for Multimedia Data Analytics</title><author>Yang, Yimin ; Pouyanfar, Samira ; Tian, Haiman ; Chen, Min ; Chen, Shu-Ching ; Shyu, 
Mei-Ling</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c263t-3b7cab1704ed0051a275ec757376cf4a3ac3325991f82faeeee56bb45b338d6b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Algorithm design and analysis</topic><topic>Data mining</topic><topic>Decision trees</topic><topic>Feature extraction</topic><topic>feature selection</topic><topic>Importance factor</topic><topic>information gain</topic><topic>Multimedia communication</topic><topic>multiple correspondence analysis (MCA)</topic><topic>Testing</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yang, Yimin</creatorcontrib><creatorcontrib>Pouyanfar, Samira</creatorcontrib><creatorcontrib>Tian, Haiman</creatorcontrib><creatorcontrib>Chen, Min</creatorcontrib><creatorcontrib>Chen, Shu-Ching</creatorcontrib><creatorcontrib>Shyu, Mei-Ling</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><jtitle>IEEE transactions on multimedia</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yang, Yimin</au><au>Pouyanfar, Samira</au><au>Tian, Haiman</au><au>Chen, Min</au><au>Chen, Shu-Ching</au><au>Shyu, Mei-Ling</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>IF-MCA: Importance Factor-Based Multiple Correspondence Analysis for Multimedia Data Analytics</atitle><jtitle>IEEE transactions on 
multimedia</jtitle><stitle>TMM</stitle><date>2018-04</date><risdate>2018</risdate><volume>20</volume><issue>4</issue><spage>1024</spage><epage>1032</epage><pages>1024-1032</pages><issn>1520-9210</issn><eissn>1941-0077</eissn><coden>ITMUF8</coden><abstract>Multimedia concept detection is a challenging topic due to the well-known class imbalance issue, where the data instances are distributed unevenly across different classes. This problem becomes even more prominent when the minority class that contains an extremely small proportion of the data represents the concept of interest as has occurred in many real-world applications such as frauds in banking transactions and goal events in soccer videos. Traditional data mining approaches often have difficulty handling largely skewed data distributions. To address this issue, in this paper, an importance-factor (IF)-based multiple correspondence analysis (MCA) framework is proposed to deal with the imbalanced datasets. Specifically, a hierarchical information gain analysis method, which is inspired by the decision tree algorithm, is presented for critical feature selection and IF assignment. Then, the derived IF is incorporated with the MCA algorithm for effective concept detection and retrieval. The comparison results in video concept detection using the disaster dataset and the soccer dataset demonstrate the effectiveness of the proposed framework.</abstract><pub>IEEE</pub><doi>10.1109/TMM.2017.2760623</doi><tpages>9</tpages><orcidid>https://orcid.org/0000-0003-0902-0844</orcidid><orcidid>https://orcid.org/0000-0002-8363-8514</orcidid><orcidid>https://orcid.org/0000-0001-9209-390X</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1520-9210
ispartof IEEE transactions on multimedia, 2018-04, Vol.20 (4), p.1024-1032
issn 1520-9210
1941-0077
language eng
recordid cdi_ieee_primary_8060571
source IEEE Electronic Library (IEL)
subjects Algorithm design and analysis
Data mining
Decision trees
Feature extraction
feature selection
Importance factor
information gain
Multimedia communication
multiple correspondence analysis (MCA)
Testing
Training
title IF-MCA: Importance Factor-Based Multiple Correspondence Analysis for Multimedia Data Analytics
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T12%3A43%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=IF-MCA:%20Importance%20Factor-Based%20Multiple%20Correspondence%20Analysis%20for%20Multimedia%20Data%20Analytics&rft.jtitle=IEEE%20transactions%20on%20multimedia&rft.au=Yang,%20Yimin&rft.date=2018-04&rft.volume=20&rft.issue=4&rft.spage=1024&rft.epage=1032&rft.pages=1024-1032&rft.issn=1520-9210&rft.eissn=1941-0077&rft.coden=ITMUF8&rft_id=info:doi/10.1109/TMM.2017.2760623&rft_dat=%3Ccrossref_RIE%3E10_1109_TMM_2017_2760623%3C/crossref_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=8060571&rfr_iscdi=true