Orthogonal Random Features

We present an intriguing discovery related to Random Fourier Features: in Gaussian kernel approximation, replacing the random Gaussian matrix by a properly scaled random orthogonal matrix significantly decreases kernel approximation error. We call this technique Orthogonal Random Features (ORF), and provide theoretical and empirical justification for this behavior. Motivated by this discovery, we further propose Structured Orthogonal Random Features (SORF), which uses a class of structured discrete orthogonal matrices to speed up the computation. The method reduces the time cost from $\mathcal{O}(d^2)$ to $\mathcal{O}(d \log d)$, where $d$ is the data dimensionality, with almost no compromise in kernel approximation quality compared to ORF. Experiments on several datasets verify the effectiveness of ORF and SORF over the existing methods. We also provide discussions on using the same type of discrete orthogonal structure for a broader range of applications.
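The ORF idea in the abstract — swap the i.i.d. Gaussian projection of Random Fourier Features for a properly scaled random orthogonal matrix — can be sketched roughly as below. This is a minimal NumPy illustration, not the authors' reference implementation; the function name `orf_features` and the bandwidth parameter `sigma` are my own, and the row rescaling by chi-distributed norms follows the common construction of matching the row-norm distribution of an i.i.d. Gaussian matrix.

```python
import numpy as np

def orf_features(X, n_features, sigma=1.0, rng=None):
    """Orthogonal Random Features approximating the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).  X has shape (n, d)."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Stack square orthogonal blocks until we have n_features projection rows.
    blocks, rows = [], 0
    while rows < n_features:
        G = rng.standard_normal((d, d))
        Q, _ = np.linalg.qr(G)           # random orthogonal matrix
        # Rescale each row to a chi-distributed norm, so row norms match
        # those of an i.i.d. Gaussian matrix.
        S = np.sqrt(rng.chisquare(d, size=d))
        blocks.append(S[:, None] * Q)
        rows += d
    W = np.vstack(blocks)[:n_features] / sigma
    Z = X @ W.T
    # Feature map: concatenated cos/sin, scaled so z(x) . z(y) ~ k(x, y).
    return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(n_features)
```

Each $d \times d$ block still costs $\mathcal{O}(d^2)$ to generate and apply; this is the cost that the structured SORF variant reduces to $\mathcal{O}(d \log d)$.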

Detailed Description

Saved in:
Bibliographic Details
Main Authors: Yu, Felix X, Suresh, Ananda Theertha, Choromanski, Krzysztof, Holtmann-Rice, Daniel, Kumar, Sanjiv
Format: Article
Language: eng
Subjects:
Online Access: Request full text
creator Yu, Felix X ; Suresh, Ananda Theertha ; Choromanski, Krzysztof ; Holtmann-Rice, Daniel ; Kumar, Sanjiv
description We present an intriguing discovery related to Random Fourier Features: in Gaussian kernel approximation, replacing the random Gaussian matrix by a properly scaled random orthogonal matrix significantly decreases kernel approximation error. We call this technique Orthogonal Random Features (ORF), and provide theoretical and empirical justification for this behavior. Motivated by this discovery, we further propose Structured Orthogonal Random Features (SORF), which uses a class of structured discrete orthogonal matrices to speed up the computation. The method reduces the time cost from $\mathcal{O}(d^2)$ to $\mathcal{O}(d \log d)$, where $d$ is the data dimensionality, with almost no compromise in kernel approximation quality compared to ORF. Experiments on several datasets verify the effectiveness of ORF and SORF over the existing methods. We also provide discussions on using the same type of discrete orthogonal structure for a broader range of applications.
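The SORF speedup mentioned in the description replaces the dense orthogonal matrix with a product of structured discrete matrices, in the paper of the form $W \propto \mathbf{H}D_1\mathbf{H}D_2\mathbf{H}D_3$, where $\mathbf{H}$ is a normalized Walsh-Hadamard matrix and each $D_i$ is a random $\pm 1$ diagonal matrix, so that $Wx$ can be applied via the fast Walsh-Hadamard transform in $\mathcal{O}(d \log d)$. A rough sketch under those assumptions (function names are my own; $d$ must be a power of two, so pad inputs with zeros otherwise):

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform along the last axis.
    The last axis length must be a power of two; runs in O(d log d)."""
    x = np.array(x, dtype=float, copy=True)
    n = x.shape[-1]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[..., i:i + h].copy()
            b = x[..., i + h:i + 2 * h].copy()
            x[..., i:i + h] = a + b          # butterfly: (a, b) -> (a+b, a-b)
            x[..., i + h:i + 2 * h] = a - b
        h *= 2
    return x

def sorf_transform(X, sigma=1.0, rng=None):
    """Apply W x with W = (sqrt(d)/sigma) * H D1 H D2 H D3, without ever
    forming the dense d x d matrix W."""
    rng = np.random.default_rng(rng)
    d = X.shape[-1]
    Z = np.asarray(X, dtype=float)
    for _ in range(3):
        signs = rng.choice([-1.0, 1.0], size=d)  # random diagonal D_i
        Z = fwht(Z * signs) / np.sqrt(d)         # normalized Hadamard H
    return Z * (np.sqrt(d) / sigma)

def sorf_features(X, sigma=1.0, rng=None):
    """Random Fourier feature map built on the SORF projection."""
    d = X.shape[-1]
    Z = sorf_transform(X, sigma, rng)
    return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(d)
```

Since $\mathbf{H}D_1\mathbf{H}D_2\mathbf{H}D_3$ is orthogonal, every row of $W$ has norm exactly $\sqrt{d}/\sigma$, and only the diagonal sign matrices need to be stored.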
doi_str_mv 10.48550/arxiv.1610.09072
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1610.09072
language eng
recordid cdi_arxiv_primary_1610_09072
source arXiv.org
subjects Computer Science - Learning
Statistics - Machine Learning
title Orthogonal Random Features
url https://arxiv.org/abs/1610.09072