A Survey on Data Imputation Techniques: Water Distribution System as a Use Case
The presence of missing data is problematic in most quantitative research studies. Water distribution systems (WDSs) are not immune to this problem. In fact, missing data is an inherent feature of a WDS. There are various techniques and methods to address missing data ranging from simply deleting th...
Published in: | IEEE access 2018, Vol.6, p.63279-63291 |
---|---|
Main authors: | Osman, Muhammad S.; Abu-Mahfouz, Adnan M.; Page, Philip R. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 63291 |
---|---|
container_issue | |
container_start_page | 63279 |
container_title | IEEE access |
container_volume | 6 |
creator | Osman, Muhammad S.; Abu-Mahfouz, Adnan M.; Page, Philip R. |
description | The presence of missing data is problematic in most quantitative research studies. Water distribution systems (WDSs) are not immune to this problem. In fact, missing data is an inherent feature of a WDS. There are various techniques and methods to address missing data ranging from simply deleting the data to using complex algorithms to impute missing data. This paper reviews the different imputation options available from traditional methods (such as deletion and single imputation) to more modern and advanced methods (such as multiple imputation, model-based procedures, and machine learning techniques). The concept, application, and qualitative advantages and disadvantages of these methods are discussed. In addition, a novel approach for selecting an applicable technique is presented. The approach is a "top-down bottom-up" two-prong approach for the selection of a data analysis and missing data technique. The bottom-up approach facilitates the top-down selection of a suitable technique by analyzing the data and narrowing down the selection options. As a use case, this paper also reviews techniques that are used to impute missing data in WDSs. |
doi_str_mv | 10.1109/ACCESS.2018.2877269 |
format | Article |
coden | IAECCG |
eissn | 2169-3536 |
lds50 | peer_reviewed |
linktohtml | https://ieeexplore.ieee.org/document/8502041 |
oa | free_for_read |
orcid | https://orcid.org/0000-0003-2335-1885; https://orcid.org/0000-0002-6413-3924 |
publisher | Piscataway: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2018 |
tpages | 13 |
fulltext | fulltext |
identifier | ISSN: 2169-3536 |
ispartof | IEEE access, 2018, Vol.6, p.63279-63291 |
issn | 2169-3536 2169-3536 |
language | eng |
recordid | cdi_ieee_primary_8502041 |
source | DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; IEEE Xplore Open Access Journals |
subjects | Algorithms; Data analysis; Data imputation; deletion; Gaussian processes; Machine learning; machine-learning methods; Microwave integrated circuits; Missing data; model based procedures; multiple imputation; Noise measurement; Psychology; single imputation; Tensile stress; Water distribution; water distribution systems; Water engineering |
title | A Survey on Data Imputation Techniques: Water Distribution System as a Use Case |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T14%3A35%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Survey%20on%20Data%20Imputation%20Techniques:%20Water%20Distribution%20System%20as%20a%20Use%20Case&rft.jtitle=IEEE%20access&rft.au=Osman,%20Muhammad%20S.&rft.date=2018&rft.volume=6&rft.spage=63279&rft.epage=63291&rft.pages=63279-63291&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2018.2877269&rft_dat=%3Cproquest_ieee_%3E2455899650%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2455899650&rft_id=info:pmid/&rft_ieee_id=8502041&rft_doaj_id=oai_doaj_org_article_aa57e25562b24324bae82834ae2d2e97&rfr_iscdi=true |