High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso

The goal of supervised feature selection is to find a subset of input features that are responsible for predicting output values. The least absolute shrinkage and selection operator (Lasso) allows computationally efficient feature selection based on linear dependency between input features and output values. In this letter, we consider a feature-wise kernelized Lasso for capturing nonlinear input-output dependency. We first show that with particular choices of kernel functions, nonredundant features with strong statistical dependence on output values can be found in terms of kernel-based independence measures such as the Hilbert-Schmidt independence criterion. We then show that the globally optimal solution can be efficiently computed; this makes the approach scalable to high-dimensional problems. The effectiveness of the proposed method is demonstrated through feature selection experiments for classification and regression with thousands of features.
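For orientation, the following is a minimal sketch of the feature-wise kernelized Lasso idea the abstract describes (often called the HSIC Lasso): each input feature and the output are mapped to centered, Frobenius-normalized Gram matrices, and a non-negativity-constrained Lasso is solved over their vectorized forms, so that each linear term corresponds to a Hilbert-Schmidt independence value between a feature and the output. The function names (`gaussian_gram`, `hsic_lasso`), the std-based kernel bandwidths, the `lam` value, and the use of scikit-learn's positive-constrained `Lasso` as the solver are illustrative assumptions, not the authors' reference implementation.

```python
# Hedged sketch of the feature-wise kernelized (HSIC) Lasso described above.
import numpy as np
from sklearn.linear_model import Lasso


def gaussian_gram(v, sigma):
    """Gram matrix of a Gaussian kernel for a single 1-D variable."""
    diff = v[:, None] - v[None, :]
    return np.exp(-diff**2 / (2.0 * sigma**2))


def hsic_lasso(X, y, lam=1e-5):
    """Return per-feature weights; nonzero entries mark selected features."""
    n, d = X.shape
    gamma = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    # Centered, normalized output Gram matrix (for classification outputs
    # a delta kernel would be the usual choice instead of a Gaussian).
    L = gamma @ gaussian_gram(y, y.std() + 1e-12) @ gamma
    L /= np.linalg.norm(L)
    # One centered, normalized Gram matrix per input feature, vectorized
    # into the columns of a design matrix of shape (n*n, d).
    cols = []
    for k in range(d):
        K = gamma @ gaussian_gram(X[:, k], X[:, k].std() + 1e-12) @ gamma
        cols.append((K / np.linalg.norm(K)).ravel())
    Phi = np.column_stack(cols)
    # Non-negative Lasso over the vectorized Gram matrices: the correlation
    # of each column with the target equals a normalized HSIC between that
    # feature and the output, so dependent, nonredundant features get weight.
    model = Lasso(alpha=lam, positive=True, fit_intercept=False)
    model.fit(Phi, L.ravel())
    return model.coef_


# Toy usage: feature 0 drives y nonlinearly, feature 1 is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = np.sin(X[:, 0])
print(hsic_lasso(X, y))  # expect a clearly larger weight on feature 0
```

The non-negativity constraint plus the Frobenius normalization is what makes the globally optimal solution computable by an ordinary Lasso solver, which is the scalability point the abstract makes; `lam` is a regularization hyperparameter that would be tuned in practice.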

Bibliographic Details
Published in: Neural computation 2014-01, Vol.26 (1), p.185-207
Main authors: Yamada, Makoto; Jitkrittum, Wittawat; Sigal, Leonid; Xing, Eric P.; Sugiyama, Masashi
Format: Article
Language: English
Subjects: Algorithms; Animals; Artificial Intelligence; Computer programming; Experiments; Letters; Nonlinear Dynamics; Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated - methods; Rats; Regression analysis
Online access: Full text
DOI: 10.1162/NECO_a_00537
ISSN: 0899-7667
EISSN: 1530-888X
PMID: 24102126
Source: MEDLINE; MIT Press Journals