Orthogonal Random Features

We present an intriguing discovery related to Random Fourier Features: in Gaussian kernel approximation, replacing the random Gaussian matrix by a properly scaled random orthogonal matrix significantly decreases kernel approximation error. We call this technique Orthogonal Random Features (ORF), and provide theoretical and empirical justification for this behavior. Motivated by this discovery, we further propose Structured Orthogonal Random Features (SORF), which uses a class of structured discrete orthogonal matrices to speed up the computation. The method reduces the time cost from $\mathcal{O}(d^2)$ to $\mathcal{O}(d \log d)$, where $d$ is the data dimensionality, with almost no compromise in kernel approximation quality compared to ORF. Experiments on several datasets verify the effectiveness of ORF and SORF over the existing methods. We also provide discussions on using the same type of discrete orthogonal structure for a broader range of applications.
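The ORF idea in the abstract — swap the i.i.d. Gaussian projection of Random Fourier Features for a properly scaled random orthogonal matrix — can be sketched roughly as below. This is a minimal NumPy illustration, not the authors' reference implementation; the function name `orf_features` and the bandwidth parameter `sigma` are my own, and the row rescaling by chi-distributed norms follows the common construction of matching the row-norm distribution of an i.i.d. Gaussian matrix.

```python
import numpy as np

def orf_features(X, n_features, sigma=1.0, rng=None):
    """Orthogonal Random Features approximating the Gaussian kernel
    k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).  X has shape (n, d)."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    # Stack square orthogonal blocks until we have n_features projection rows.
    blocks, rows = [], 0
    while rows < n_features:
        G = rng.standard_normal((d, d))
        Q, _ = np.linalg.qr(G)           # random orthogonal matrix
        # Rescale each row to a chi-distributed norm, so row norms match
        # those of an i.i.d. Gaussian matrix.
        S = np.sqrt(rng.chisquare(d, size=d))
        blocks.append(S[:, None] * Q)
        rows += d
    W = np.vstack(blocks)[:n_features] / sigma
    Z = X @ W.T
    # Feature map: concatenated cos/sin, scaled so z(x) . z(y) ~ k(x, y).
    return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(n_features)
```

Each $d \times d$ block still costs $\mathcal{O}(d^2)$ to generate and apply; this is the cost that the structured SORF variant reduces to $\mathcal{O}(d \log d)$.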

Detailed Description

Saved in:
Bibliographic Details
Main Authors: Yu, Felix X, Suresh, Ananda Theertha, Choromanski, Krzysztof, Holtmann-Rice, Daniel, Kumar, Sanjiv
Format: Article
Language: eng
Subjects:
Online Access: Request full text
creator Yu, Felix X ; Suresh, Ananda Theertha ; Choromanski, Krzysztof ; Holtmann-Rice, Daniel ; Kumar, Sanjiv
description We present an intriguing discovery related to Random Fourier Features: in Gaussian kernel approximation, replacing the random Gaussian matrix by a properly scaled random orthogonal matrix significantly decreases kernel approximation error. We call this technique Orthogonal Random Features (ORF), and provide theoretical and empirical justification for this behavior. Motivated by this discovery, we further propose Structured Orthogonal Random Features (SORF), which uses a class of structured discrete orthogonal matrices to speed up the computation. The method reduces the time cost from $\mathcal{O}(d^2)$ to $\mathcal{O}(d \log d)$, where $d$ is the data dimensionality, with almost no compromise in kernel approximation quality compared to ORF. Experiments on several datasets verify the effectiveness of ORF and SORF over the existing methods. We also provide discussions on using the same type of discrete orthogonal structure for a broader range of applications.
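The SORF speedup mentioned in the description replaces the dense orthogonal matrix with a product of structured discrete matrices, in the paper of the form $W \propto \mathbf{H}D_1\mathbf{H}D_2\mathbf{H}D_3$, where $\mathbf{H}$ is a normalized Walsh-Hadamard matrix and each $D_i$ is a random $\pm 1$ diagonal matrix, so that $Wx$ can be applied via the fast Walsh-Hadamard transform in $\mathcal{O}(d \log d)$. A rough sketch under those assumptions (function names are my own; $d$ must be a power of two, so pad inputs with zeros otherwise):

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform along the last axis.
    The last axis length must be a power of two; runs in O(d log d)."""
    x = np.array(x, dtype=float, copy=True)
    n = x.shape[-1]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[..., i:i + h].copy()
            b = x[..., i + h:i + 2 * h].copy()
            x[..., i:i + h] = a + b          # butterfly: (a, b) -> (a+b, a-b)
            x[..., i + h:i + 2 * h] = a - b
        h *= 2
    return x

def sorf_transform(X, sigma=1.0, rng=None):
    """Apply W x with W = (sqrt(d)/sigma) * H D1 H D2 H D3, without ever
    forming the dense d x d matrix W."""
    rng = np.random.default_rng(rng)
    d = X.shape[-1]
    Z = np.asarray(X, dtype=float)
    for _ in range(3):
        signs = rng.choice([-1.0, 1.0], size=d)  # random diagonal D_i
        Z = fwht(Z * signs) / np.sqrt(d)         # normalized Hadamard H
    return Z * (np.sqrt(d) / sigma)

def sorf_features(X, sigma=1.0, rng=None):
    """Random Fourier feature map built on the SORF projection."""
    d = X.shape[-1]
    Z = sorf_transform(X, sigma, rng)
    return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(d)
```

Since $\mathbf{H}D_1\mathbf{H}D_2\mathbf{H}D_3$ is orthogonal, every row of $W$ has norm exactly $\sqrt{d}/\sigma$, and only the diagonal sign matrices need to be stored.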
doi_str_mv 10.48550/arxiv.1610.09072
format Article
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.1610.09072
language eng
recordid cdi_arxiv_primary_1610_09072
source arXiv.org
subjects Computer Science - Learning
Statistics - Machine Learning
title Orthogonal Random Features
url https://arxiv.org/abs/1610.09072