A Survey on Data Imputation Techniques: Water Distribution System as a Use Case

The presence of missing data is problematic in most quantitative research studies. Water distribution systems (WDSs) are not immune to this problem. In fact, missing data is an inherent feature of a WDS. There are various techniques and methods to address missing data ranging from simply deleting th...

Full description

Bibliographic details
Published in: IEEE access, 2018, Vol. 6, pp. 63279-63291
Main authors: Osman, Muhammad S.; Abu-Mahfouz, Adnan M.; Page, Philip R.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 63291
container_issue
container_start_page 63279
container_title IEEE access
container_volume 6
creator Osman, Muhammad S.
Abu-Mahfouz, Adnan M.
Page, Philip R.
description The presence of missing data is problematic in most quantitative research studies. Water distribution systems (WDSs) are not immune to this problem. In fact, missing data is an inherent feature of a WDS. There are various techniques and methods to address missing data ranging from simply deleting the data to using complex algorithms to impute missing data. This paper reviews the different imputation options available from traditional methods (such as deletion and single imputation) to more modern and advanced methods (such as multiple imputation, model-based procedures, and machine learning techniques). The concept, application, and qualitative advantages and disadvantages of these methods are discussed. In addition, a novel approach for selecting an applicable technique is presented. The approach is a "top-down bottom-up" two-prong approach for the selection of a data analysis and missing data technique. The bottom-up approach facilitates the top-down selection of a suitable technique by analyzing the data and narrowing down the selection options. As a use case, this paper also reviews techniques that are used to impute missing data in WDSs.
doi_str_mv 10.1109/ACCESS.2018.2877269
format Article
publisher Piscataway: IEEE
coden IAECCG
eissn 2169-3536
orcid https://orcid.org/0000-0003-2335-1885
https://orcid.org/0000-0002-6413-3924
rights Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2018
oa free_for_read
tpages 13
linktohtml https://ieeexplore.ieee.org/document/8502041
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2018, Vol.6, p.63279-63291
issn 2169-3536
2169-3536
language eng
recordid cdi_ieee_primary_8502041
source DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; IEEE Xplore Open Access Journals
subjects Algorithms
Data analysis
Data imputation
deletion
Gaussian processes
Machine learning
machine-learning methods
Microwave integrated circuits
Missing data
model based procedures
multiple imputation
Noise measurement
Psychology
single imputation
Tensile stress
Water distribution
water distribution systems
Water engineering
title A Survey on Data Imputation Techniques: Water Distribution System as a Use Case
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T14%3A35%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Survey%20on%20Data%20Imputation%20Techniques:%20Water%20Distribution%20System%20as%20a%20Use%20Case&rft.jtitle=IEEE%20access&rft.au=Osman,%20Muhammad%20S.&rft.date=2018&rft.volume=6&rft.spage=63279&rft.epage=63291&rft.pages=63279-63291&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2018.2877269&rft_dat=%3Cproquest_ieee_%3E2455899650%3C/proquest_ieee_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2455899650&rft_id=info:pmid/&rft_ieee_id=8502041&rft_doaj_id=oai_doaj_org_article_aa57e25562b24324bae82834ae2d2e97&rfr_iscdi=true