Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data
During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framew...
Published in: | IEEE transactions on emerging topics in computational intelligence 2024-12, Vol.8 (6), p.3767-3778 |
---|---|
Main authors: | Hssayeni, Murtadha D. ; Ghoraani, Behnaz |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 3778 |
---|---|
container_issue | 6 |
container_start_page | 3767 |
container_title | IEEE transactions on emerging topics in computational intelligence |
container_volume | 8 |
creator | Hssayeni, Murtadha D. ; Ghoraani, Behnaz |
description | During the collection of time-series data, many reasons lead to imbalanced and incomplete datasets. Consequently, it becomes challenging to develop deep convolutional models without suffering from overfitting. Our objective in this paper was to investigate an emerging but rather underutilized framework of Conditional Generative Adversarial Networks (cGANs) for improving deep regression models for time-series data with an imbalanced and incomplete distribution. First, we investigated the potential of using a vanilla cGAN as a data imputation to improve the generalizability of the developed models to unseen data in such datasets. Next, we proposed a modified cGAN architecture with improved extrapolation and generalizability of the regression models. Our investigations used an imbalanced synthetic non-stationary dataset, a real-world dataset in Parkinson's disease (PD) application domain, and one publicly-available dataset for Negative Affect (NA) estimation. We found that vanilla cGAN failed to generate realistic time-series data due to severe mode collapse, limiting its application as a data imputation for imbalanced and incomplete data. Importantly, the proposed cGAN framework significantly improved extrapolation and generalizability for the prediction of regression scores with an average improvement of 56%, 34%, and 18%, respectively, in mean absolute error for the synthetic, PD, and NA datasets when compared with traditional Convolutional Neural Networks. The codes are publicly available on Github. |
doi_str_mv | 10.1109/TETCI.2024.3372435 |
format | Article |
ieee_id | 10475374 |
publisher | Piscataway: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
eissn | 2471-285X |
coden | ITETCU |
stitle | TETCI |
date | 2024-12-01 |
tpages | 12 |
genre | article |
ristype | JOUR |
orcid | 0000-0003-0075-7663 ; 0000-0002-8588-4639 |
toplevel | peer_reviewed ; online_resources |
collection | IEEE All-Society Periodicals Package (ASPP) 2005-present ; IEEE All-Society Periodicals Package (ASPP) 1998-Present ; IEEE Electronic Library (IEL) ; CrossRef ; Electronics & Communications Abstracts ; Technology Research Database ; Advanced Technologies Database with Aerospace |
delcategory | Remote Search Resource |
linktohtml | https://ieeexplore.ieee.org/document/10475374 |
pqid | 3131908814 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 2471-285X |
ispartof | IEEE transactions on emerging topics in computational intelligence, 2024-12, Vol.8 (6), p.3767-3778 |
issn | 2471-285X 2471-285X |
language | eng |
recordid | cdi_proquest_journals_3131908814 |
source | IEEE Electronic Library (IEL) |
subjects | Artificial neural networks ; Convolutional neural networks ; Data integrity ; Data models ; Datasets ; Deep regression modeling ; Extrapolation ; Feature extraction ; Generative adversarial networks ; Generators ; imbalanced and incomplete data ; Parkinson's disease ; Regression analysis ; Regression models ; Testing ; Time series ; Time series analysis ; time-series data ; Training |
title | Deep Regression Modeling for Imbalanced and Incomplete Time-Series Data |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T14%3A33%3A05IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Deep%20Regression%20Modeling%20for%20Imbalanced%20and%20Incomplete%20Time-Series%20Data&rft.jtitle=IEEE%20transactions%20on%20emerging%20topics%20in%20computational%20intelligence&rft.au=Hssayeni,%20Murtadha%20D.&rft.date=2024-12-01&rft.volume=8&rft.issue=6&rft.spage=3767&rft.epage=3778&rft.pages=3767-3778&rft.issn=2471-285X&rft.eissn=2471-285X&rft.coden=ITETCU&rft_id=info:doi/10.1109/TETCI.2024.3372435&rft_dat=%3Cproquest_RIE%3E3131908814%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3131908814&rft_id=info:pmid/&rft_ieee_id=10475374&rfr_iscdi=true |