Hadoop neural network for parallel and distributed feature selection

In this paper, we introduce a theoretical basis for a Hadoop-based neural network for parallel and distributed feature selection in Big Data sets. It is underpinned by an associative memory (binary) neural network which is highly amenable to parallel and distributed processing and fits with the Hadoop paradigm. There are many feature selectors described in the literature which all have various strengths and weaknesses. We present the implementation details of five feature selection algorithms constructed using our artificial neural network framework embedded in Hadoop YARN. Hadoop allows parallel and distributed processing. Each feature selector can be divided into subtasks and the subtasks can then be processed in parallel. Multiple feature selectors can also be processed simultaneously (in parallel) allowing multiple feature selectors to be compared. We identify commonalities among the five feature selectors. All can be processed in the framework using a single representation and the overall processing can also be greatly reduced by only processing the common aspects of the feature selectors once and propagating these aspects across all five feature selectors as necessary. This allows the best feature selector and the actual features to select to be identified for large and high dimensional data sets through exploiting the efficiency and flexibility of embedding the binary associative-memory neural network in Hadoop.
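The record does not include any implementation detail of the "associative memory (binary) neural network" beyond its name, but binary associative memories of this kind are commonly realized as correlation matrix memories with Willshaw-style thresholding. The following is a minimal illustrative sketch under that assumption; the class name, pattern sizes, and patterns are our own, not from the paper.

```python
import numpy as np

class BinaryCMM:
    """Sketch of a binary correlation matrix memory (CMM).

    Assumed model, for illustration only: binary weights, Hebbian
    OR-storage, and threshold-at-k (Willshaw) recall.
    """

    def __init__(self, n_in, n_out):
        self.W = np.zeros((n_out, n_in), dtype=np.uint8)

    def train(self, x, y):
        # Store the pair by OR-ing the outer product of the binary
        # output and input vectors into the binary weight matrix.
        self.W |= np.outer(y, x).astype(np.uint8)

    def recall(self, x, k):
        # Sum activations, then threshold at k, the number of set
        # bits in the input pattern.
        s = self.W @ x
        return (s >= k).astype(np.uint8)

cmm = BinaryCMM(n_in=8, n_out=4)
x = np.array([1, 0, 1, 0, 0, 1, 0, 0], dtype=np.uint8)  # 3 bits set
y = np.array([0, 1, 0, 1], dtype=np.uint8)
cmm.train(x, y)
print(cmm.recall(x, k=3))  # → [0 1 0 1]
```

Because storage and recall are bitwise operations on rows of a binary matrix, the matrix can be partitioned row-wise across workers, which is what makes this family of networks amenable to parallel and distributed processing.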

Bibliographic details
Published in: Neural networks, 2016-06, Vol. 78, p. 24-35
Authors: Hodge, Victoria J.; O’Keefe, Simon; Austin, Jim
Format: Article
Language: English
Online access: Full text
description In this paper, we introduce a theoretical basis for a Hadoop-based neural network for parallel and distributed feature selection in Big Data sets. It is underpinned by an associative memory (binary) neural network which is highly amenable to parallel and distributed processing and fits with the Hadoop paradigm. There are many feature selectors described in the literature which all have various strengths and weaknesses. We present the implementation details of five feature selection algorithms constructed using our artificial neural network framework embedded in Hadoop YARN. Hadoop allows parallel and distributed processing. Each feature selector can be divided into subtasks and the subtasks can then be processed in parallel. Multiple feature selectors can also be processed simultaneously (in parallel) allowing multiple feature selectors to be compared. We identify commonalities among the five feature selectors. All can be processed in the framework using a single representation and the overall processing can also be greatly reduced by only processing the common aspects of the feature selectors once and propagating these aspects across all five feature selectors as necessary. This allows the best feature selector and the actual features to select to be identified for large and high dimensional data sets through exploiting the efficiency and flexibility of embedding the binary associative-memory neural network in Hadoop.
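The decomposition the abstract describes — each feature's score is an independent subtask, and per-feature statistics ("common aspects") are computed once and shared across selectors — can be sketched as a toy example. This is not the paper's implementation: a thread pool stands in for Hadoop YARN, and the two scoring functions are invented stand-ins for the five feature selectors discussed in the paper.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

# Illustrative stand-ins for feature selectors: each maps a
# feature column to a relevance score.
def variance_score(col):
    return float(np.var(col))

def range_score(col):
    return float(col.max() - col.min())

def score_feature(args):
    col, scorers = args
    # One subtask per feature: the column is read once and every
    # selector's score is derived from it -- the "process the
    # common aspects once" idea from the abstract.
    return [f(col) for f in scorers]

def select_top_k(X, scorers, k):
    # Each column is an independent subtask, so the subtasks can
    # run in parallel (here a local pool, in the paper Hadoop).
    cols = [(X[:, j], scorers) for j in range(X.shape[1])]
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(score_feature, cols))
    # Rank features by the first selector's score, for example.
    order = sorted(range(len(scores)), key=lambda j: -scores[j][0])
    return order[:k]

X = np.array([[0.0, 0.0, 10.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 10.0],
              [0.0, 1.0, 0.0]])
print(select_top_k(X, [variance_score, range_score], k=2))  # → [2, 1]
```

Because every selector's scores come back from the same pass over the data, running multiple selectors simultaneously and comparing their rankings costs little more than running one.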
doi 10.1016/j.neunet.2015.08.011
format Article
publisher United States: Elsevier Ltd
rights Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
pmid 26403824
orcidid https://orcid.org/0000-0002-2469-0224
issn 0893-6080
eissn 1879-2782
language eng
source MEDLINE; Elsevier ScienceDirect Journals Complete
subjects Algorithms
Associative memory
Binary neural network
Commonality
Databases, Factual - statistics & numerical data
Distributed
Distributed processing
Feature selection
Flexibility
Hadoop
MapReduce
Neural networks
Neural Networks (Computer)
Parallel
Representations
Selectors
Statistics as Topic - methods
title Hadoop neural network for parallel and distributed feature selection