Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks

The electrocardiogram (ECG) is an inexpensive and widely available tool for cardiovascular assessment. Despite its standardized format and small file size, the high complexity and inter-individual variability of ECG signals (typically a 60,000-size vector) make it challenging to use in deep learning...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Harvey, Christopher J, Shomaji, Sumaiya, Yao, Zijun, Noheria, Amit
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title
container_volume
creator Harvey, Christopher J
Shomaji, Sumaiya
Yao, Zijun
Noheria, Amit
description The electrocardiogram (ECG) is an inexpensive and widely available tool for cardiovascular assessment. Despite its standardized format and small file size, the high complexity and inter-individual variability of ECG signals (typically a 60,000-size vector) make it challenging to use in deep learning models, especially when only small datasets are available. This study addresses these challenges by exploring feature generation methods from representative beat ECGs, focusing on Principal Component Analysis (PCA) and Autoencoders to reduce data complexity. We introduce three novel Variational Autoencoder (VAE) variants: Stochastic Autoencoder (SAE), Annealed beta-VAE (Abeta-VAE), and cyclical beta-VAE (Cbeta-VAE), and compare their effectiveness in maintaining signal fidelity and enhancing downstream prediction tasks. The Abeta-VAE achieved superior signal reconstruction, reducing the mean absolute error (MAE) to 15.7 plus-minus 3.2 microvolts, which is at the level of signal noise. Moreover, the SAE encodings, when combined with ECG summary features, improved the prediction of reduced Left Ventricular Ejection Fraction (LVEF), achieving an area under the receiver operating characteristic curve (AUROC) of 0.901. This performance nearly matches the 0.910 AUROC of state-of-the-art CNN models but requires significantly less data and computational resources. Our findings demonstrate that these VAE encodings are not only effective in simplifying ECG data but also provide a practical solution for applying deep learning in contexts with limited-scale labeled training data.
doi_str_mv 10.48550/arxiv.2410.02937
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2410_02937</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2410_02937</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2410_029373</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMgEKGBhZGptzMkQ65-cWJBZlFufnKeSnKTiWluSn5iXnp6QWKbiC6My89GKFtHwgz9ldISi1oCi1ODWvJLEkE6g-M0_BJb88r7ikKDUxVyGgKDUlMxksEZJYnF3Mw8CalphTnMoLpbkZ5N1cQ5w9dMGOiC8oysxNLKqMBzkmHuwYY8IqAIG-QB8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks</title><source>arXiv.org</source><creator>Harvey, Christopher J ; Shomaji, Sumaiya ; Yao, Zijun ; Noheria, Amit</creator><creatorcontrib>Harvey, Christopher J ; Shomaji, Sumaiya ; Yao, Zijun ; Noheria, Amit</creatorcontrib><description>The electrocardiogram (ECG) is an inexpensive and widely available tool for cardiovascular assessment. Despite its standardized format and small file size, the high complexity and inter-individual variability of ECG signals (typically a 60,000-size vector) make it challenging to use in deep learning models, especially when only small datasets are available. This study addresses these challenges by exploring feature generation methods from representative beat ECGs, focusing on Principal Component Analysis (PCA) and Autoencoders to reduce data complexity. We introduce three novel Variational Autoencoder (VAE) variants: Stochastic Autoencoder (SAE), Annealed beta-VAE (Abeta-VAE), and cyclical beta-VAE (Cbeta-VAE), and compare their effectiveness in maintaining signal fidelity and enhancing downstream prediction tasks. The Abeta-VAE achieved superior signal reconstruction, reducing the mean absolute error (MAE) to 15.7 plus-minus 3.2 microvolts, which is at the level of signal noise. Moreover, the SAE encodings, when combined with ECG summary features, improved the prediction of reduced Left Ventricular Ejection Fraction (LVEF), achieving an area under the receiver operating characteristic curve (AUROC) of 0.901. This performance nearly matches the 0.910 AUROC of state-of-the-art CNN models but requires significantly less data and computational resources. Our findings demonstrate that these VAE encodings are not only effective in simplifying ECG data but also provide a practical solution for applying deep learning in contexts with limited-scale labeled training data.</description><identifier>DOI: 10.48550/arxiv.2410.02937</identifier><language>eng</language><subject>Computer Science - Learning</subject><creationdate>2024-10</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2410.02937$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2410.02937$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Harvey, Christopher J</creatorcontrib><creatorcontrib>Shomaji, Sumaiya</creatorcontrib><creatorcontrib>Yao, Zijun</creatorcontrib><creatorcontrib>Noheria, Amit</creatorcontrib><title>Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks</title><description>The electrocardiogram (ECG) is an inexpensive and widely available tool for cardiovascular assessment. Despite its standardized format and small file size, the high complexity and inter-individual variability of ECG signals (typically a 60,000-size vector) make it challenging to use in deep learning models, especially when only small datasets are available. This study addresses these challenges by exploring feature generation methods from representative beat ECGs, focusing on Principal Component Analysis (PCA) and Autoencoders to reduce data complexity. We introduce three novel Variational Autoencoder (VAE) variants: Stochastic Autoencoder (SAE), Annealed beta-VAE (Abeta-VAE), and cyclical beta-VAE (Cbeta-VAE), and compare their effectiveness in maintaining signal fidelity and enhancing downstream prediction tasks. The Abeta-VAE achieved superior signal reconstruction, reducing the mean absolute error (MAE) to 15.7 plus-minus 3.2 microvolts, which is at the level of signal noise. Moreover, the SAE encodings, when combined with ECG summary features, improved the prediction of reduced Left Ventricular Ejection Fraction (LVEF), achieving an area under the receiver operating characteristic curve (AUROC) of 0.901. This performance nearly matches the 0.910 AUROC of state-of-the-art CNN models but requires significantly less data and computational resources. Our findings demonstrate that these VAE encodings are not only effective in simplifying ECG data but also provide a practical solution for applying deep learning in contexts with limited-scale labeled training data.</description><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMgEKGBhZGptzMkQ65-cWJBZlFufnKeSnKTiWluSn5iXnp6QWKbiC6My89GKFtHwgz9ldISi1oCi1ODWvJLEkE6g-M0_BJb88r7ikKDUxVyGgKDUlMxksEZJYnF3Mw8CalphTnMoLpbkZ5N1cQ5w9dMGOiC8oysxNLKqMBzkmHuwYY8IqAIG-QB8</recordid><startdate>20241003</startdate><enddate>20241003</enddate><creator>Harvey, Christopher J</creator><creator>Shomaji, Sumaiya</creator><creator>Yao, Zijun</creator><creator>Noheria, Amit</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241003</creationdate><title>Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks</title><author>Harvey, Christopher J ; Shomaji, Sumaiya ; Yao, Zijun ; Noheria, Amit</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2410_029373</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Harvey, Christopher J</creatorcontrib><creatorcontrib>Shomaji, Sumaiya</creatorcontrib><creatorcontrib>Yao, Zijun</creatorcontrib><creatorcontrib>Noheria, Amit</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Harvey, Christopher J</au><au>Shomaji, Sumaiya</au><au>Yao, Zijun</au><au>Noheria, Amit</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks</atitle><date>2024-10-03</date><risdate>2024</risdate><abstract>The electrocardiogram (ECG) is an inexpensive and widely available tool for cardiovascular assessment. Despite its standardized format and small file size, the high complexity and inter-individual variability of ECG signals (typically a 60,000-size vector) make it challenging to use in deep learning models, especially when only small datasets are available. This study addresses these challenges by exploring feature generation methods from representative beat ECGs, focusing on Principal Component Analysis (PCA) and Autoencoders to reduce data complexity. We introduce three novel Variational Autoencoder (VAE) variants: Stochastic Autoencoder (SAE), Annealed beta-VAE (Abeta-VAE), and cyclical beta-VAE (Cbeta-VAE), and compare their effectiveness in maintaining signal fidelity and enhancing downstream prediction tasks. The Abeta-VAE achieved superior signal reconstruction, reducing the mean absolute error (MAE) to 15.7 plus-minus 3.2 microvolts, which is at the level of signal noise. Moreover, the SAE encodings, when combined with ECG summary features, improved the prediction of reduced Left Ventricular Ejection Fraction (LVEF), achieving an area under the receiver operating characteristic curve (AUROC) of 0.901. This performance nearly matches the 0.910 AUROC of state-of-the-art CNN models but requires significantly less data and computational resources. Our findings demonstrate that these VAE encodings are not only effective in simplifying ECG data but also provide a practical solution for applying deep learning in contexts with limited-scale labeled training data.</abstract><doi>10.48550/arxiv.2410.02937</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2410.02937
ispartof
issn
language eng
recordid cdi_arxiv_primary_2410_02937
source arXiv.org
subjects Computer Science - Learning
title Comparison of Autoencoder Encodings for ECG Representation in Downstream Prediction Tasks
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T04%3A32%3A26IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparison%20of%20Autoencoder%20Encodings%20for%20ECG%20Representation%20in%20Downstream%20Prediction%20Tasks&rft.au=Harvey,%20Christopher%20J&rft.date=2024-10-03&rft_id=info:doi/10.48550/arxiv.2410.02937&rft_dat=%3Carxiv_GOX%3E2410_02937%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true