Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data

During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framew...

Bibliographic details
Published in:	IEEE transactions on emerging topics in computational intelligence 2024-12, Vol.8 (6), p.3767-3778
Main authors:	Hssayeni, Murtadha D. ; Ghoraani, Behnaz
Format:	Article
Language:	eng
Subjects:	Artificial neural networks ; Convolutional neural networks ; Data integrity ; Data models ; Datasets ; Deep regression modeling ; Extrapolation ; Feature extraction ; Generative adversarial networks ; Generators ; imbalanced and incomplete data ; Parkinson's disease ; Regression analysis ; Regression models ; Testing ; Time series ; Time series analysis ; time-series data ; Training

container_end_page	3778
container_issue	6
container_start_page	3767
container_title	IEEE transactions on emerging topics in computational intelligence
container_volume	8
creator	Hssayeni, Murtadha D. ; Ghoraani, Behnaz
description	During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framework of Conditional Generative Adversarial Networks (cGANs) for improving deep regression models for time-series data with an imbalanced and incomplete distribution. First, we investigated the potential of using a vanilla cGAN as a data imputation to improve the generalizability of the developed models to unseen data in such datasets. Next, we proposed a modified cGAN architecture with improved extrapolation and generalizability of the regression models. Our investigations used an imbalanced synthetic non-stationary dataset, a real-world dataset in Parkinson's disease (PD) application domain, and one publicly-available dataset for Negative Affect (NA) estimation. We found that vanilla cGAN failed to generate realistic time-series data due to severe mode collapse, limiting its application as a data imputation for imbalanced and incomplete data. Importantly, the proposed cGAN framework significantly improved extrapolation and generalizability for the prediction of regression scores with an average improvement of 56%, 34%, and 18%, respectively, in mean absolute error for the synthetic, PD, and NA datasets when compared with traditional Convolutional Neural Networks. The codes are publicly available on Github.
doi_str_mv	10.1109/TETCI.2024.3372435
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_3131908814</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10475374</ieee_id><sourcerecordid>3131908814</sourcerecordid><originalsourceid>FETCH-LOGICAL-c247t-36fd4407369493f79599afed50438ef499381c13fe9459f80326b1d09e49908e3</originalsourceid><addsrcrecordid>eNpNkE9LAzEQxYMoWGq_gHgIeN6aZJLu5iit1oWKoBW8hXR3UrbsP5PtwW_f1PbQ0zyY92YeP0LuOZtyzvTT-mU9z6eCCTkFSIUEdUVGQqY8EZn6ub7Qt2QSwo4xJrTioOSILBeIPf3ErccQqq6l712JddVuqes8zZuNrW1bYEltW9K8Lbqmr3FAuq4aTL7QVxjowg72jtw4WwecnOeYfL_GWm_J6mOZz59XSRFLDAnMXCklS2GmpQaXaqW1dVgqJiFDJ7WGjBccHGqptMsYiNmGl0xjXLEMYUweT3d73_3uMQxm1-19G18a4MCjJ-MyusTJVfguBI_O9L5qrP8znJkjM_PPzByZmTOzGHo4hSpEvAjIVEEq4QDT7mXq</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3131908814</pqid></control><display><type>article</type><title>Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data</title><source>IEEE Electronic Library (IEL)</source><creator>Hssayeni, Murtadha D. ; Ghoraani, Behnaz</creator><creatorcontrib>Hssayeni, Murtadha D. ; Ghoraani, Behnaz</creatorcontrib><description>During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framework of Conditional Generative Adversarial Networks (cGANs) for improving deep regression models for time-series data with an imbalanced and incomplete distribution. First, we investigated the potential of using a vanilla cGAN as a data imputation to improve the generalizability of the developed models to unseen data in such datasets. Next, we proposed a modified cGAN architecture with improved extrapolation and generalizability of the regression models. 
Our investigations used an imbalanced synthetic non-stationary dataset, a real-world dataset in Parkinson's disease (PD) application domain, and one publicly-available dataset for Negative Affect (NA) estimation. We found that vanilla cGAN failed to generate realistic time-series data due to severe mode collapse, limiting its application as a data imputation for imbalanced and incomplete data. Importantly, the proposed cGAN framework significantly improved extrapolation and generalizability for the prediction of regression scores with an average improvement of 56%, 34%, and 18%, respectively, in mean absolute error for the synthetic, PD, and NA datasets when compared with traditional Convolutional Neural Networks. The codes are publicly available on Github.</description><identifier>ISSN: 2471-285X</identifier><identifier>EISSN: 2471-285X</identifier><identifier>DOI: 10.1109/TETCI.2024.3372435</identifier><identifier>CODEN: ITETCU</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Artificial neural networks ; Convolutional neural networks ; Data integrity ; Data models ; Datasets ; Deep regression modeling ; Extrapolation ; Feature extraction ; Generative adversarial networks ; Generators ; imbalanced and incomplete data ; Parkinson's disease ; Regression analysis ; Regression models ; Testing ; Time series ; Time series analysis ; time-series data ; Training</subject><ispartof>IEEE transactions on emerging topics in computational intelligence, 2024-12, Vol.8 (6), p.3767-3778</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2024</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c247t-36fd4407369493f79599afed50438ef499381c13fe9459f80326b1d09e49908e3</cites><orcidid>0000-0003-0075-7663 ; 0000-0002-8588-4639</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10475374$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10475374$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Hssayeni, Murtadha D.</creatorcontrib><creatorcontrib>Ghoraani, Behnaz</creatorcontrib><title>Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data</title><title>IEEE transactions on emerging topics in computational intelligence</title><addtitle>TETCI</addtitle><description>During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framework of Conditional Generative Adversarial Networks (cGANs) for improving deep regression models for time-series data with an imbalanced and incomplete distribution. First, we investigated the potential of using a vanilla cGAN as a data imputation to improve the generalizability of the developed models to unseen data in such datasets. Next, we proposed a modified cGAN architecture with improved extrapolation and generalizability of the regression models. 
Our investigations used an imbalanced synthetic non-stationary dataset, a real-world dataset in Parkinson's disease (PD) application domain, and one publicly-available dataset for Negative Affect (NA) estimation. We found that vanilla cGAN failed to generate realistic time-series data due to severe mode collapse, limiting its application as a data imputation for imbalanced and incomplete data. Importantly, the proposed cGAN framework significantly improved extrapolation and generalizability for the prediction of regression scores with an average improvement of 56%, 34%, and 18%, respectively, in mean absolute error for the synthetic, PD, and NA datasets when compared with traditional Convolutional Neural Networks. The codes are publicly available on Github.</description><subject>Artificial neural networks</subject><subject>Convolutional neural networks</subject><subject>Data integrity</subject><subject>Data models</subject><subject>Datasets</subject><subject>Deep regression modeling</subject><subject>Extrapolation</subject><subject>Feature extraction</subject><subject>Generative adversarial networks</subject><subject>Generators</subject><subject>imbalanced and incomplete data</subject><subject>Parkinson's disease</subject><subject>Regression analysis</subject><subject>Regression models</subject><subject>Testing</subject><subject>Time series</subject><subject>Time series analysis</subject><subject>time-series 
data</subject><subject>Training</subject><issn>2471-285X</issn><issn>2471-285X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkE9LAzEQxYMoWGq_gHgIeN6aZJLu5iit1oWKoBW8hXR3UrbsP5PtwW_f1PbQ0zyY92YeP0LuOZtyzvTT-mU9z6eCCTkFSIUEdUVGQqY8EZn6ub7Qt2QSwo4xJrTioOSILBeIPf3ErccQqq6l712JddVuqes8zZuNrW1bYEltW9K8Lbqmr3FAuq4aTL7QVxjowg72jtw4WwecnOeYfL_GWm_J6mOZz59XSRFLDAnMXCklS2GmpQaXaqW1dVgqJiFDJ7WGjBccHGqptMsYiNmGl0xjXLEMYUweT3d73_3uMQxm1-19G18a4MCjJ-MyusTJVfguBI_O9L5qrP8znJkjM_PPzByZmTOzGHo4hSpEvAjIVEEq4QDT7mXq</recordid><startdate>20241201</startdate><enddate>20241201</enddate><creator>Hssayeni, Murtadha D.</creator><creator>Ghoraani, Behnaz</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0003-0075-7663</orcidid><orcidid>https://orcid.org/0000-0002-8588-4639</orcidid></search><sort><creationdate>20241201</creationdate><title>Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data</title><author>Hssayeni, Murtadha D. 
; Ghoraani, Behnaz</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c247t-36fd4407369493f79599afed50438ef499381c13fe9459f80326b1d09e49908e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Artificial neural networks</topic><topic>Convolutional neural networks</topic><topic>Data integrity</topic><topic>Data models</topic><topic>Datasets</topic><topic>Deep regression modeling</topic><topic>Extrapolation</topic><topic>Feature extraction</topic><topic>Generative adversarial networks</topic><topic>Generators</topic><topic>imbalanced and incomplete data</topic><topic>Parkinson's disease</topic><topic>Regression analysis</topic><topic>Regression models</topic><topic>Testing</topic><topic>Time series</topic><topic>Time series analysis</topic><topic>time-series data</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hssayeni, Murtadha D.</creatorcontrib><creatorcontrib>Ghoraani, Behnaz</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on emerging topics in computational intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hssayeni, Murtadha D.</au><au>Ghoraani, Behnaz</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data</atitle><jtitle>IEEE transactions on emerging topics in 
computational intelligence</jtitle><stitle>TETCI</stitle><date>2024-12-01</date><risdate>2024</risdate><volume>8</volume><issue>6</issue><spage>3767</spage><epage>3778</epage><pages>3767-3778</pages><issn>2471-285X</issn><eissn>2471-285X</eissn><coden>ITETCU</coden><abstract>During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framework of Conditional Generative Adversarial Networks (cGANs) for improving deep regression models for time-series data with an imbalanced and incomplete distribution. First, we investigated the potential of using a vanilla cGAN as a data imputation to improve the generalizability of the developed models to unseen data in such datasets. Next, we proposed a modified cGAN architecture with improved extrapolation and generalizability of the regression models. Our investigations used an imbalanced synthetic non-stationary dataset, a real-world dataset in Parkinson's disease (PD) application domain, and one publicly-available dataset for Negative Affect (NA) estimation. We found that vanilla cGAN failed to generate realistic time-series data due to severe mode collapse, limiting its application as a data imputation for imbalanced and incomplete data. Importantly, the proposed cGAN framework significantly improved extrapolation and generalizability for the prediction of regression scores with an average improvement of 56%, 34%, and 18%, respectively, in mean absolute error for the synthetic, PD, and NA datasets when compared with traditional Convolutional Neural Networks. 
The codes are publicly available on Github.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/TETCI.2024.3372435</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0003-0075-7663</orcidid><orcidid>https://orcid.org/0000-0002-8588-4639</orcidid></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 2471-285X
ispartof	IEEE transactions on emerging topics in computational intelligence, 2024-12, Vol.8 (6), p.3767-3778
issn	2471-285X 2471-285X
language	eng
recordid	cdi_proquest_journals_3131908814
source	IEEE Electronic Library (IEL)
subjects	Artificial neural networks ; Convolutional neural networks ; Data integrity ; Data models ; Datasets ; Deep regression modeling ; Extrapolation ; Feature extraction ; Generative adversarial networks ; Generators ; imbalanced and incomplete data ; Parkinson's disease ; Regression analysis ; Regression models ; Testing ; Time series ; Time series analysis ; time-series data ; Training
title	Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T14%3A33%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Regression%20Modeling%20for%20Imbalanced%20and%20Incomplete%20Time-Series%20Data&rft.jtitle=IEEE%20transactions%20on%20emerging%20topics%20in%20computational%20intelligence&rft.au=Hssayeni,%20Murtadha%20D.&rft.date=2024-12-01&rft.volume=8&rft.issue=6&rft.spage=3767&rft.epage=3778&rft.pages=3767-3778&rft.issn=2471-285X&rft.eissn=2471-285X&rft.coden=ITETCU&rft_id=info:doi/10.1109/TETCI.2024.3372435&rft_dat=%3Cproquest_RIE%3E3131908814%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3131908814&rft_id=info:pmid/&rft_ieee_id=10475374&rfr_iscdi=true