Missing data imputation with adversarially-trained graph convolutional networks


Detailed Description

Bibliographic Details
Published in: Neural networks 2020-09, Vol.129, p.249-260
Main Authors: Spinelli, Indro; Scardapane, Simone; Uncini, Aurelio
Format: Article
Language: English
Online Access: Full text
description Missing data imputation (MDI) is the task of replacing missing values in a dataset with alternative, predicted ones. Because of the widespread presence of missing data, it is a fundamental problem in many scientific disciplines. Popular methods for MDI use global statistics computed from the entire dataset (e.g., the feature-wise medians), or build predictive models operating independently on every instance. In this paper we propose a more general framework for MDI, leveraging recent work in the field of graph neural networks (GNNs). We formulate the MDI task in terms of a graph denoising autoencoder, where each edge of the graph encodes the similarity between two patterns. A GNN encoder learns to build intermediate representations for each example by interleaving classical projection layers and locally combining information between neighbors, while another decoding GNN learns to reconstruct the full imputed dataset from this intermediate embedding. In order to speed-up training and improve the performance, we use a combination of multiple losses, including an adversarial loss implemented with the Wasserstein metric and a gradient penalty. We also explore a few extensions to the basic architecture involving the use of residual connections between layers, and of global statistics computed from the dataset to improve the accuracy. On a large experimental evaluation with varying levels of artificial noise, we show that our method is on par or better than several alternative imputation methods. On three datasets with pre-existing missing values, we show that our method is robust to the choice of a downstream classifier, obtaining similar or slightly higher results compared to other choices.
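As an illustration of two ingredients the abstract mentions — the feature-wise median baseline ("global statistics computed from the entire dataset") and a graph whose edges encode similarity between patterns — here is a minimal NumPy sketch. The k-nearest-neighbour construction and the Euclidean distance are assumptions chosen for illustration; this is not the paper's GNN architecture or its actual graph-building procedure.

```python
import numpy as np

def median_impute(X):
    """Baseline MDI: replace each NaN with its feature's median,
    computed over the observed entries of that column."""
    X = X.astype(float).copy()
    medians = np.nanmedian(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = medians[cols]
    return X

def knn_similarity_graph(X, k=2):
    """Build a symmetric k-nearest-neighbour adjacency matrix from
    pairwise Euclidean distances; an edge links similar patterns."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # no self-loops
    nn = np.argsort(d, axis=1)[:, :k]    # indices of k closest patterns
    A = np.zeros_like(d)
    rows = np.repeat(np.arange(X.shape[0]), k)
    A[rows, nn.ravel()] = 1.0
    return np.maximum(A, A.T)            # symmetrize

# Toy dataset with two missing entries
X = np.array([[1.0, np.nan],
              [2.0, 5.0],
              [np.nan, 7.0],
              [4.0, 9.0]])
X_imp = median_impute(X)                 # column medians: 2.0 and 7.0
A = knn_similarity_graph(X_imp, k=2)
```

In the paper's framework, a graph like `A` would then drive the message passing of the encoder/decoder GNNs, rather than the imputation itself.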
doi_str_mv 10.1016/j.neunet.2020.06.005
ISSN: 0893-6080
EISSN: 1879-2782
recordid cdi_proquest_miscellaneous_2415288900
source Access via ScienceDirect (Elsevier)
subjects Convolutional network
Graph data
Graph neural network
Imputation
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-23T05%3A53%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Missing%20data%20imputation%20with%20adversarially-trained%20graph%20convolutional%20networks&rft.jtitle=Neural%20networks&rft.au=Spinelli,%20Indro&rft.date=2020-09&rft.volume=129&rft.spage=249&rft.epage=260&rft.pages=249-260&rft.issn=0893-6080&rft.eissn=1879-2782&rft_id=info:doi/10.1016/j.neunet.2020.06.005&rft_dat=%3Cproquest_cross%3E2415288900%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2415288900&rft_id=info:pmid/&rft_els_id=S0893608020302185&rfr_iscdi=true