Clustered linear regression

Clustered linear regression (CLR) is a new machine learning algorithm that improves the accuracy of classical linear regression by partitioning the training space into subspaces. CLR makes some assumptions about the domain and the data set. First, the target value is assumed to be a function of the feature values. Second, there are linear approximations of this function in each subspace. Finally, there are enough training instances to determine the subspaces and their linear approximations successfully. Tests indicate that if these assumptions hold, CLR outperforms all other well-known machine learning algorithms. Partitioning may continue until a linear approximation fits all the instances in the training set, which generally occurs when the number of instances in a subspace is less than or equal to the number of features plus one. Until that point, each new subspace will have a better-fitting linear approximation; however, this causes overfitting and gives less accurate results on the test instances. The stopping point can therefore be taken as the point where the relative error shows no significant decrease, or an increase. CLR uses a small portion of the training instances to determine the number of subspaces. The need for a large number of training instances makes this algorithm suitable for data mining applications.
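
The abstract describes CLR only at a high level: cluster the training space, fit a linear approximation per subspace, and use a held-out portion of the training data to decide how many subspaces to keep. The snippet below is a minimal, hypothetical sketch of that general idea built from off-the-shelf k-means and ordinary least squares in scikit-learn; it is not the authors' CLR algorithm, and the function names and the stopping tolerance tol are invented for this illustration.

```python
# Hypothetical sketch of the clustered-regression idea from the abstract,
# NOT the paper's CLR algorithm: k-means defines the subspaces, one linear
# model is fit per subspace, and a held-out slice of the training data picks
# the number of subspaces via relative error.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def fit_clustered_regression(X, y, n_subspaces):
    """Cluster the feature space, then fit one linear approximation per cluster."""
    km = KMeans(n_clusters=n_subspaces, n_init=10, random_state=0).fit(X)
    models = {}
    for c in range(n_subspaces):
        mask = km.labels_ == c
        # Only fit when the cluster has more instances than features + 1;
        # otherwise the linear approximation fits the points exactly (overfits).
        if mask.sum() > X.shape[1] + 1:
            models[c] = LinearRegression().fit(X[mask], y[mask])
    return km, models


def predict(km, models, fallback, X):
    """Predict with the model of each instance's cluster, else a global fallback."""
    labels = km.predict(X)
    preds = np.empty(len(X))
    for i, c in enumerate(labels):
        model = models.get(c, fallback)
        preds[i] = model.predict(X[i:i + 1])[0]
    return preds


def relative_error(y_true, y_pred):
    """Relative absolute error with respect to predicting the mean target value."""
    return np.abs(y_true - y_pred).sum() / np.abs(y_true - y_true.mean()).sum()


def choose_n_subspaces(X, y, max_subspaces=10, tol=0.01):
    """Pick the number of subspaces on a small held-out portion of the training
    data, stopping when the relative error no longer decreases significantly."""
    X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
    global_model = LinearRegression().fit(X_fit, y_fit)      # one subspace = plain LR
    best_err = relative_error(y_val, global_model.predict(X_val))
    best_k = 1
    for k in range(2, max_subspaces + 1):
        km, models = fit_clustered_regression(X_fit, y_fit, k)
        err = relative_error(y_val, predict(km, models, global_model, X_val))
        if err >= best_err - tol:   # no significant decrease (or an increase): stop
            break
        best_err, best_k = err, k
    return best_k
```

The guard on cluster size mirrors the abstract's remark that a subspace with no more than "features plus one" instances is fit exactly by a linear approximation, the point beyond which further partitioning only overfits.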

Bibliographic Details
Published in: Knowledge-Based Systems, 2002-03, Vol. 15 (3), pp. 169-175
Main authors: Ari, Bertan; Güvenir, H. Altay
Format: Article
Language: English
Subjects: Algorithms; Clustering; Linear regression; Eager approach; Machine learning; Machine learning algorithm
ISSN: 0950-7051
EISSN: 1872-7409
DOI: 10.1016/S0950-7051(01)00154-X
Publisher: Elsevier B.V.
Rights: 2002 Elsevier Science B.V.
Source: ScienceDirect Freedom Collection (Elsevier)
Online access: Full text