Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data

During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framew...

Detailed description

Bibliographic details
Published in: IEEE transactions on emerging topics in computational intelligence, 2024-12, Vol.8 (6), p.3767-3778
Main authors: Hssayeni, Murtadha D., Ghoraani, Behnaz
Format: Article
Language: eng
container_end_page 3778
container_issue 6
container_start_page 3767
container_title IEEE transactions on emerging topics in computational intelligence
container_volume 8
creator Hssayeni, Murtadha D.
Ghoraani, Behnaz
description During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framework of Conditional Generative Adversarial Networks (cGANs) for improving deep regression models for time-series data with an imbalanced and incomplete distribution. First, we investigated the potential of using a vanilla cGAN as a data imputation to improve the generalizability of the developed models to unseen data in such datasets. Next, we proposed a modified cGAN architecture with improved extrapolation and generalizability of the regression models. Our investigations used an imbalanced synthetic non-stationary dataset, a real-world dataset in Parkinson's disease (PD) application domain, and one publicly-available dataset for Negative Affect (NA) estimation. We found that vanilla cGAN failed to generate realistic time-series data due to severe mode collapse, limiting its application as a data imputation for imbalanced and incomplete data. Importantly, the proposed cGAN framework significantly improved extrapolation and generalizability for the prediction of regression scores with an average improvement of 56%, 34%, and 18%, respectively, in mean absolute error for the synthetic, PD, and NA datasets when compared with traditional Convolutional Neural Networks. The codes are publicly available on Github.
doi_str_mv 10.1109/TETCI.2024.3372435
format Article
fullrecord <record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_proquest_journals_3131908814</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10475374</ieee_id><sourcerecordid>3131908814</sourcerecordid><originalsourceid>FETCH-LOGICAL-c247t-36fd4407369493f79599afed50438ef499381c13fe9459f80326b1d09e49908e3</originalsourceid><addsrcrecordid>eNpNkE9LAzEQxYMoWGq_gHgIeN6aZJLu5iit1oWKoBW8hXR3UrbsP5PtwW_f1PbQ0zyY92YeP0LuOZtyzvTT-mU9z6eCCTkFSIUEdUVGQqY8EZn6ub7Qt2QSwo4xJrTioOSILBeIPf3ErccQqq6l712JddVuqes8zZuNrW1bYEltW9K8Lbqmr3FAuq4aTL7QVxjowg72jtw4WwecnOeYfL_GWm_J6mOZz59XSRFLDAnMXCklS2GmpQaXaqW1dVgqJiFDJ7WGjBccHGqptMsYiNmGl0xjXLEMYUweT3d73_3uMQxm1-19G18a4MCjJ-MyusTJVfguBI_O9L5qrP8znJkjM_PPzByZmTOzGHo4hSpEvAjIVEEq4QDT7mXq</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3131908814</pqid></control><display><type>article</type><title>Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data</title><source>IEEE Electronic Library (IEL)</source><creator>Hssayeni, Murtadha D. ; Ghoraani, Behnaz</creator><creatorcontrib>Hssayeni, Murtadha D. ; Ghoraani, Behnaz</creatorcontrib><description>During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framework of Conditional Generative Adversarial Networks (cGANs) for improving deep regression models for time-series data with an imbalanced and incomplete distribution. First, we investigated the potential of using a vanilla cGAN as a data imputation to improve the generalizability of the developed models to unseen data in such datasets. Next, we proposed a modified cGAN architecture with improved extrapolation and generalizability of the regression models. 
Our investigations used an imbalanced synthetic non-stationary dataset, a real-world dataset in Parkinson's disease (PD) application domain, and one publicly-available dataset for Negative Affect (NA) estimation. We found that vanilla cGAN failed to generate realistic time-series data due to severe mode collapse, limiting its application as a data imputation for imbalanced and incomplete data. Importantly, the proposed cGAN framework significantly improved extrapolation and generalizability for the prediction of regression scores with an average improvement of 56%, 34%, and 18%, respectively, in mean absolute error for the synthetic, PD, and NA datasets when compared with traditional Convolutional Neural Networks. The codes are publicly available on Github.</description><identifier>ISSN: 2471-285X</identifier><identifier>EISSN: 2471-285X</identifier><identifier>DOI: 10.1109/TETCI.2024.3372435</identifier><identifier>CODEN: ITETCU</identifier><language>eng</language><publisher>Piscataway: IEEE</publisher><subject>Artificial neural networks ; Convolutional neural networks ; Data integrity ; Data models ; Datasets ; Deep regression modeling ; Extrapolation ; Feature extraction ; Generative adversarial networks ; Generators ; imbalanced and incomplete data ; Parkinson's disease ; Regression analysis ; Regression models ; Testing ; Time series ; Time series analysis ; time-series data ; Training</subject><ispartof>IEEE transactions on emerging topics in computational intelligence, 2024-12, Vol.8 (6), p.3767-3778</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2024</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c247t-36fd4407369493f79599afed50438ef499381c13fe9459f80326b1d09e49908e3</cites><orcidid>0000-0003-0075-7663 ; 0000-0002-8588-4639</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10475374$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10475374$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Hssayeni, Murtadha D.</creatorcontrib><creatorcontrib>Ghoraani, Behnaz</creatorcontrib><title>Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data</title><title>IEEE transactions on emerging topics in computational intelligence</title><addtitle>TETCI</addtitle><description>During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framework of Conditional Generative Adversarial Networks (cGANs) for improving deep regression models for time-series data with an imbalanced and incomplete distribution. First, we investigated the potential of using a vanilla cGAN as a data imputation to improve the generalizability of the developed models to unseen data in such datasets. Next, we proposed a modified cGAN architecture with improved extrapolation and generalizability of the regression models. 
Our investigations used an imbalanced synthetic non-stationary dataset, a real-world dataset in Parkinson's disease (PD) application domain, and one publicly-available dataset for Negative Affect (NA) estimation. We found that vanilla cGAN failed to generate realistic time-series data due to severe mode collapse, limiting its application as a data imputation for imbalanced and incomplete data. Importantly, the proposed cGAN framework significantly improved extrapolation and generalizability for the prediction of regression scores with an average improvement of 56%, 34%, and 18%, respectively, in mean absolute error for the synthetic, PD, and NA datasets when compared with traditional Convolutional Neural Networks. The codes are publicly available on Github.</description><subject>Artificial neural networks</subject><subject>Convolutional neural networks</subject><subject>Data integrity</subject><subject>Data models</subject><subject>Datasets</subject><subject>Deep regression modeling</subject><subject>Extrapolation</subject><subject>Feature extraction</subject><subject>Generative adversarial networks</subject><subject>Generators</subject><subject>imbalanced and incomplete data</subject><subject>Parkinson's disease</subject><subject>Regression analysis</subject><subject>Regression models</subject><subject>Testing</subject><subject>Time series</subject><subject>Time series analysis</subject><subject>time-series 
data</subject><subject>Training</subject><issn>2471-285X</issn><issn>2471-285X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNpNkE9LAzEQxYMoWGq_gHgIeN6aZJLu5iit1oWKoBW8hXR3UrbsP5PtwW_f1PbQ0zyY92YeP0LuOZtyzvTT-mU9z6eCCTkFSIUEdUVGQqY8EZn6ub7Qt2QSwo4xJrTioOSILBeIPf3ErccQqq6l712JddVuqes8zZuNrW1bYEltW9K8Lbqmr3FAuq4aTL7QVxjowg72jtw4WwecnOeYfL_GWm_J6mOZz59XSRFLDAnMXCklS2GmpQaXaqW1dVgqJiFDJ7WGjBccHGqptMsYiNmGl0xjXLEMYUweT3d73_3uMQxm1-19G18a4MCjJ-MyusTJVfguBI_O9L5qrP8znJkjM_PPzByZmTOzGHo4hSpEvAjIVEEq4QDT7mXq</recordid><startdate>20241201</startdate><enddate>20241201</enddate><creator>Hssayeni, Murtadha D.</creator><creator>Ghoraani, Behnaz</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SP</scope><scope>8FD</scope><scope>L7M</scope><orcidid>https://orcid.org/0000-0003-0075-7663</orcidid><orcidid>https://orcid.org/0000-0002-8588-4639</orcidid></search><sort><creationdate>20241201</creationdate><title>Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data</title><author>Hssayeni, Murtadha D. 
; Ghoraani, Behnaz</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c247t-36fd4407369493f79599afed50438ef499381c13fe9459f80326b1d09e49908e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Artificial neural networks</topic><topic>Convolutional neural networks</topic><topic>Data integrity</topic><topic>Data models</topic><topic>Datasets</topic><topic>Deep regression modeling</topic><topic>Extrapolation</topic><topic>Feature extraction</topic><topic>Generative adversarial networks</topic><topic>Generators</topic><topic>imbalanced and incomplete data</topic><topic>Parkinson's disease</topic><topic>Regression analysis</topic><topic>Regression models</topic><topic>Testing</topic><topic>Time series</topic><topic>Time series analysis</topic><topic>time-series data</topic><topic>Training</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hssayeni, Murtadha D.</creatorcontrib><creatorcontrib>Ghoraani, Behnaz</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>Advanced Technologies Database with Aerospace</collection><jtitle>IEEE transactions on emerging topics in computational intelligence</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Hssayeni, Murtadha D.</au><au>Ghoraani, Behnaz</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data</atitle><jtitle>IEEE transactions on emerging topics in 
computational intelligence</jtitle><stitle>TETCI</stitle><date>2024-12-01</date><risdate>2024</risdate><volume>8</volume><issue>6</issue><spage>3767</spage><epage>3778</epage><pages>3767-3778</pages><issn>2471-285X</issn><eissn>2471-285X</eissn><coden>ITETCU</coden><abstract>During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framework of Conditional Generative Adversarial Networks (cGANs) for improving deep regression models for time-series data with an imbalanced and incomplete distribution. First, we investigated the potential of using a vanilla cGAN as a data imputation to improve the generalizability of the developed models to unseen data in such datasets. Next, we proposed a modified cGAN architecture with improved extrapolation and generalizability of the regression models. Our investigations used an imbalanced synthetic non-stationary dataset, a real-world dataset in Parkinson's disease (PD) application domain, and one publicly-available dataset for Negative Affect (NA) estimation. We found that vanilla cGAN failed to generate realistic time-series data due to severe mode collapse, limiting its application as a data imputation for imbalanced and incomplete data. Importantly, the proposed cGAN framework significantly improved extrapolation and generalizability for the prediction of regression scores with an average improvement of 56%, 34%, and 18%, respectively, in mean absolute error for the synthetic, PD, and NA datasets when compared with traditional Convolutional Neural Networks. 
The codes are publicly available on Github.</abstract><cop>Piscataway</cop><pub>IEEE</pub><doi>10.1109/TETCI.2024.3372435</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0003-0075-7663</orcidid><orcidid>https://orcid.org/0000-0002-8588-4639</orcidid></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 2471-285X
ispartof IEEE transactions on emerging topics in computational intelligence, 2024-12, Vol.8 (6), p.3767-3778
issn 2471-285X
2471-285X
language eng
recordid cdi_proquest_journals_3131908814
source IEEE Electronic Library (IEL)
subjects Artificial neural networks
Convolutional neural networks
Data integrity
Data models
Datasets
Deep regression modeling
Extrapolation
Feature extraction
Generative adversarial networks
Generators
imbalanced and incomplete data
Parkinson's disease
Regression analysis
Regression models
Testing
Time series
Time series analysis
time-series data
Training
title Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T14%3A33%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Regression%20Modeling%20for%20Imbalanced%20and%20Incomplete%20Time-Series%20Data&rft.jtitle=IEEE%20transactions%20on%20emerging%20topics%20in%20computational%20intelligence&rft.au=Hssayeni,%20Murtadha%20D.&rft.date=2024-12-01&rft.volume=8&rft.issue=6&rft.spage=3767&rft.epage=3778&rft.pages=3767-3778&rft.issn=2471-285X&rft.eissn=2471-285X&rft.coden=ITETCU&rft_id=info:doi/10.1109/TETCI.2024.3372435&rft_dat=%3Cproquest_RIE%3E3131908814%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3131908814&rft_id=info:pmid/&rft_ieee_id=10475374&rfr_iscdi=true