Scalable Algorithms for Missing Value Imputation

Statistical imputation techniques have been proposed mainly to predict the missing values in incomplete data sets, an essential step in any data analysis framework. K-means-based imputation, a representative statistical imputation method, has produced satisfactory results in t...

Full description

Saved in:
Bibliographic details
Published in: International journal of computer applications 2014-01, Vol.87 (11), p.35-42
Main authors: Mohamed, Marghny H, Hashem, Abdel-rahiem A, Abdelsamea, Mohammed M
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 42
container_issue 11
container_start_page 35
container_title International journal of computer applications
container_volume 87
creator Mohamed, Marghny H
Hashem, Abdel-rahiem A
Abdelsamea, Mohammed M
description Statistical imputation techniques have been proposed mainly to predict the missing values in incomplete data sets, an essential step in any data analysis framework. K-means-based imputation, a representative statistical imputation method, has produced satisfactory results in terms of effectiveness and efficiency on popular, freely available data sets (e.g., Bupa, Breast Cancer, Pima). The main idea of K-means-based methods is to impute a missing value from the prototype of its representative class and the similarity of the data. However, such methods share the limitations of K-means as a data mining technique. Motivated by these drawbacks, we introduce simple and efficient imputation methods based on K-means to deal with missing data from various classes of data sets. Our proposed methods give higher accuracy than the standard K-means.
doi_str_mv 10.5120/15255-4019
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1520939512</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1520939512</sourcerecordid><originalsourceid>FETCH-LOGICAL-c1739-b1b45b6c16aeaa26963d6bd1a6421c4f624630904c9fe98723734309d94e04ce3</originalsourceid><addsrcrecordid>eNpdUMtOwzAQtBBIVKUXviASF4QUsONX9lhVUCoVceBxtRzHKamcONjJgb_HpRwQe5nVaLQ7MwhdEnzLSYHvCC84zxkmcIJmGCTPy7KUp3_2c7SIcY_TUCgEsBnCL0Y7XTmbLd3Oh3b86GLW-JA9tTG2_S57126y2aYbplGPre8v0FmjXbSLX5yjt4f719Vjvn1eb1bLbW6IpJBXpGK8EoYIbbVOvwStRVUTLVhBDGtEwQTFgJmBxkIpCyopS0QNzCbS0jm6Pt4dgv-cbBxV10ZjndO99VNUKSwGCil4kl79k-79FPrkThEGICmH5GmObo4qE3yMwTZqCG2nw5ciWB36Uz_9qUN_9BvjAl66</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1499735917</pqid></control><display><type>article</type><title>Scalable Algorithms for Missing Value Imputation</title><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Mohamed, Marghny H ; Hashem, Abdel-rahiem A ; Abdelsamea, Mohammed M</creator><creatorcontrib>Mohamed, Marghny H ; Hashem, Abdel-rahiem A ; Abdelsamea, Mohammed M</creatorcontrib><description>Statistical Imputation Techniques have been proposed mainly with the aim of predicting the missing values in the incomplete sets as an essential step in any data analysis framework. K-means-based Imputation, as a representative statistical imputation method, has been producing satisfied results in terms of effectiveness and efficiency in handling popular and freely available data set (e. g. , Bupa, Breast Cancer, Pima, etc. ). The main idea of K-means based methods is to impute the missing value relying on the prototypes of the representative class and the similarity of the data. However, such kinds of methods share the same limitations of the K-means as data mining technique. 
In this paper and motivated by such drawbacks, we introduce simple and efficient imputation methods based on K-means to deal with the missing data from various classes of data sets. Our proposed methods give higher accuracy than the one given by the standard K-means.</description><identifier>ISSN: 0975-8887</identifier><identifier>EISSN: 0975-8887</identifier><identifier>DOI: 10.5120/15255-4019</identifier><language>eng</language><publisher>New York: Foundation of Computer Science</publisher><subject>Algorithms ; Cancer ; Data processing ; Handling ; Missing data ; Similarity</subject><ispartof>International journal of computer applications, 2014-01, Vol.87 (11), p.35-42</ispartof><rights>Copyright Foundation of Computer Science 2014</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c1739-b1b45b6c16aeaa26963d6bd1a6421c4f624630904c9fe98723734309d94e04ce3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids></links><search><creatorcontrib>Mohamed, Marghny H</creatorcontrib><creatorcontrib>Hashem, Abdel-rahiem A</creatorcontrib><creatorcontrib>Abdelsamea, Mohammed M</creatorcontrib><title>Scalable Algorithms for Missing Value Imputation</title><title>International journal of computer applications</title><description>Statistical Imputation Techniques have been proposed mainly with the aim of predicting the missing values in the incomplete sets as an essential step in any data analysis framework. K-means-based Imputation, as a representative statistical imputation method, has been producing satisfied results in terms of effectiveness and efficiency in handling popular and freely available data set (e. g. , Bupa, Breast Cancer, Pima, etc. ). 
The main idea of K-means based methods is to impute the missing value relying on the prototypes of the representative class and the similarity of the data. However, such kinds of methods share the same limitations of the K-means as data mining technique. In this paper and motivated by such drawbacks, we introduce simple and efficient imputation methods based on K-means to deal with the missing data from various classes of data sets. Our proposed methods give higher accuracy than the one given by the standard K-means.</description><subject>Algorithms</subject><subject>Cancer</subject><subject>Data processing</subject><subject>Handling</subject><subject>Missing data</subject><subject>Similarity</subject><issn>0975-8887</issn><issn>0975-8887</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><recordid>eNpdUMtOwzAQtBBIVKUXviASF4QUsONX9lhVUCoVceBxtRzHKamcONjJgb_HpRwQe5nVaLQ7MwhdEnzLSYHvCC84zxkmcIJmGCTPy7KUp3_2c7SIcY_TUCgEsBnCL0Y7XTmbLd3Oh3b86GLW-JA9tTG2_S57126y2aYbplGPre8v0FmjXbSLX5yjt4f719Vjvn1eb1bLbW6IpJBXpGK8EoYIbbVOvwStRVUTLVhBDGtEwQTFgJmBxkIpCyopS0QNzCbS0jm6Pt4dgv-cbBxV10ZjndO99VNUKSwGCil4kl79k-79FPrkThEGICmH5GmObo4qE3yMwTZqCG2nw5ciWB36Uz_9qUN_9BvjAl66</recordid><startdate>20140101</startdate><enddate>20140101</enddate><creator>Mohamed, Marghny H</creator><creator>Hashem, Abdel-rahiem A</creator><creator>Abdelsamea, Mohammed M</creator><general>Foundation of Computer Science</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20140101</creationdate><title>Scalable Algorithms for Missing Value Imputation</title><author>Mohamed, Marghny H ; Hashem, Abdel-rahiem A ; Abdelsamea, Mohammed 
M</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c1739-b1b45b6c16aeaa26963d6bd1a6421c4f624630904c9fe98723734309d94e04ce3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Algorithms</topic><topic>Cancer</topic><topic>Data processing</topic><topic>Handling</topic><topic>Missing data</topic><topic>Similarity</topic><toplevel>online_resources</toplevel><creatorcontrib>Mohamed, Marghny H</creatorcontrib><creatorcontrib>Hashem, Abdel-rahiem A</creatorcontrib><creatorcontrib>Abdelsamea, Mohammed M</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>International journal of computer applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mohamed, Marghny H</au><au>Hashem, Abdel-rahiem A</au><au>Abdelsamea, Mohammed M</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Scalable Algorithms for Missing Value Imputation</atitle><jtitle>International journal of computer applications</jtitle><date>2014-01-01</date><risdate>2014</risdate><volume>87</volume><issue>11</issue><spage>35</spage><epage>42</epage><pages>35-42</pages><issn>0975-8887</issn><eissn>0975-8887</eissn><abstract>Statistical Imputation Techniques have been proposed mainly with the aim of predicting the missing values in the incomplete sets as an essential step in any data analysis framework. 
K-means-based Imputation, as a representative statistical imputation method, has been producing satisfied results in terms of effectiveness and efficiency in handling popular and freely available data set (e. g. , Bupa, Breast Cancer, Pima, etc. ). The main idea of K-means based methods is to impute the missing value relying on the prototypes of the representative class and the similarity of the data. However, such kinds of methods share the same limitations of the K-means as data mining technique. In this paper and motivated by such drawbacks, we introduce simple and efficient imputation methods based on K-means to deal with the missing data from various classes of data sets. Our proposed methods give higher accuracy than the one given by the standard K-means.</abstract><cop>New York</cop><pub>Foundation of Computer Science</pub><doi>10.5120/15255-4019</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0975-8887
ispartof International journal of computer applications, 2014-01, Vol.87 (11), p.35-42
issn 0975-8887
0975-8887
language eng
recordid cdi_proquest_miscellaneous_1520939512
source Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Algorithms
Cancer
Data processing
Handling
Missing data
Similarity
title Scalable Algorithms for Missing Value Imputation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-13T10%3A44%3A52IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Scalable%20Algorithms%20for%20Missing%20Value%20Imputation&rft.jtitle=International%20journal%20of%20computer%20applications&rft.au=Mohamed,%20Marghny%20H&rft.date=2014-01-01&rft.volume=87&rft.issue=11&rft.spage=35&rft.epage=42&rft.pages=35-42&rft.issn=0975-8887&rft.eissn=0975-8887&rft_id=info:doi/10.5120/15255-4019&rft_dat=%3Cproquest_cross%3E1520939512%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1499735917&rft_id=info:pmid/&rfr_iscdi=true
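
Illustrative sketch (not part of the catalog record): the description field says that K-means-based methods impute a missing value from the prototype of the representative class. The Python below is a minimal sketch of that baseline idea only, not the improved methods the article proposes, whose details the record does not give. The function name `kmeans_impute`, the use of `None` for missing entries, the column-mean starting fill, and the farthest-point initialisation are all assumptions made for this example. It also assumes every column has at least one observed value.

```python
import math


def kmeans_impute(rows, k=2, iters=25):
    """Fill None entries with the matching coordinate of the row's cluster prototype.

    Hypothetical helper for illustration; assumes numeric data and that every
    column contains at least one observed value.
    """
    n_cols = len(rows[0])
    missing = [[v is None for v in r] for r in rows]

    # Step 1: provisional fill with the column mean of the observed values.
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(sum(observed) / len(observed))
    filled = [[col_means[j] if v is None else v for j, v in enumerate(r)]
              for r in rows]

    # Step 2: deterministic farthest-point initialisation (an assumption here;
    # random initialisation is the more common textbook choice).
    centroids = [list(filled[0])]
    while len(centroids) < k:
        far = max(filled, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))

    for _ in range(iters):
        # Step 3: assign every row to its nearest prototype (centroid).
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in filled]
        # Step 4: recompute each prototype as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(filled, labels) if lab == c]
            if members:  # keep the old prototype if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Step 5: overwrite only the originally missing entries with the
        # prototype's value; observed entries are never changed.
        for p, lab, mask in zip(filled, labels, missing):
            for j in range(n_cols):
                if mask[j]:
                    p[j] = centroids[lab][j]
    return filled


data = [[1.0, 1.0], [1.2, 0.8], [0.9, None],
        [8.0, 8.0], [8.2, 7.9], [None, 8.1]]
for row in kmeans_impute(data, k=2):
    print(row)
```

On this toy input the two gaps settle near the means of the observed values in their clusters (about 0.9 and 8.1). Refilling the gaps in every iteration is a design choice that stops the provisional column means from skewing the prototypes. The result still depends on the choice of `k` and on the initialisation, which is one of the K-means limitations the description mentions.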