Abstract 4191: Assessing performance of biomarker extraction from electronic health records: Data augmentation methods for a hierarchical self-attention network (HiSAN)
Published in: | Cancer research (Chicago, Ill.), 2023-04, Vol.83 (7_Supplement), p.4191-4191 |
---|---|
Main authors: | Shalini Priya, Alina Peluso, Mayanka Chandra Shekhar, Ioana Danciu, Jordan Miller, Heidi A. Hanson |
Format: | Article |
Language: | eng |
Online access: | Full text |
Summary: | Background: Extraction of HER2 status from electronic health records (EHR) may expedite clinical trial matching and support survivorship research. Deep learning (DL) algorithms have the potential to extract this data; however, inherent class imbalance leads to reduced model performance. We compare state-of-the-art strategies for handling class imbalance in models trained to extract HER2 status. This comparative analysis may be used as a guideline for HER2 extraction.
Methods: A total of 680,117 pathology reports collected from 2017 to 2021 by the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) program were used for this study. The reports were manually labelled by cancer registrars as HER2-, HER2+, or Unknown (class ratio 65%, 11%, and 24%, respectively). We compare six data augmentation (DA) methods: balanced frequency weighting, random over-sampling (ROS) with HER2+ up-sampled by 595%, random under-sampling (RUS) with HER2- down-sampled by 83%, SMOTE, ADASYN, and SMOTE-Tomek. As the baseline for comparison, we use the HiSAN model, a DL architecture currently used by SEER for automatic classification of reports.
Results: Applying DA strategies did not improve the performance of the HiSAN model (Table 1). Frequency-based class weighting (Acc=0.78), ROS (Acc=0.81), and RUS (Acc=0.80) perform worse than the baseline model (Acc=0.89), suggesting that simple data augmentation methods do not boost performance for this task. Advanced oversampling with SMOTE (Acc=0.88) and ADASYN (Acc=0.88) performs better than the simple approaches but does not improve on the predictive accuracy of the baseline HiSAN.
Conclusion: Common DA methods do not improve the performance of the HiSAN biomarker extraction method. While the overall accuracy of the baseline HiSAN model is quite high, other methods for improving accuracy should be explored.
Table 1.

| Method | Accuracy (Acc) | Sensitivity | Specificity | Precision |
|---|---|---|---|---|
| HiSAN (Baseline) | 0.8898 | 0.7583 | 0.8944 | 0.8507 |
| Frequency class weighting | 0.7826 | 0.8033 | 0.8932 | 0.7040 |
| Random Over Sampling (ROS) | 0.8057 | 0.7672 | 0.8883 | 0.7130 |
| Random Under Sampling (RUS) | 0.8042 | 0.7883 | 0.8930 | 0.6902 |
| Synthetic Minority Oversampling Technique (SMOTE) | 0.8796 | 0.7341 | 0.8827 | 0.8340 |
| Adaptive Synthetic Sampling (ADASYN) | 0.8764 | 0.7467 | 0.8863 | 0.8142 |
| SMOTE-Tomek | 0.8823 | 0.7620 | 0.8956 | 0.8200 |

Citation Format: Shalini Priya, Alina Peluso, Mayanka Chandra Shekhar, Ioana Danciu, Jordan Miller, Heidi A. Hanson. Assessing performance of biomarker extraction from electronic health records: Data augmentation methods for a hierarchical self-attention networ |
---|---|
ISSN: | 1538-7445 |
DOI: | 10.1158/1538-7445.AM2023-4191 |
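
The resampling figures in the Methods paragraph can be checked with a little arithmetic. The sketch below is not the authors' code: it takes the rounded class shares from the abstract (65%, 11%, 24%) and assumes that "up-sampling HER2+ by 595%" means scaling that class to 5.95 times its original size, and that "down-sampling HER2- by 83%" means discarding 83% of that class. The function names are hypothetical, and the weighting formula is the common "balanced" heuristic, 1 / (K × class share), which the abstract does not spell out.

```python
# Rounded class shares reported in the abstract.
CLASS_SHARE = {"HER2-": 0.65, "HER2+": 0.11, "Unknown": 0.24}


def balanced_weights(shares):
    """Assumed 'balanced' heuristic: weight = 1 / (K * class share)."""
    k = len(shares)
    return {label: 1.0 / (k * share) for label, share in shares.items()}


def random_oversample(shares, label, factor):
    """Scale one class to `factor` times its original size (assumed reading of ROS)."""
    out = dict(shares)
    out[label] = shares[label] * factor
    return out


def random_undersample(shares, label, drop_fraction):
    """Discard `drop_fraction` of one class (assumed reading of RUS)."""
    out = dict(shares)
    out[label] = shares[label] * (1.0 - drop_fraction)
    return out


weights = balanced_weights(CLASS_SHARE)
ros = random_oversample(CLASS_SHARE, "HER2+", 5.95)
rus = random_undersample(CLASS_SHARE, "HER2-", 0.83)

# Values are sizes relative to the original dataset, not renormalized shares.
for name, table in [("weights", weights), ("after ROS", ros), ("after RUS", rus)]:
    print(name, {label: round(value, 4) for label, value in table.items()})
```

Under these assumptions, the oversampled HER2+ class comes to roughly 0.65 of the original dataset size, matching HER2-, and the undersampled HER2- class comes to roughly 0.11, matching HER2+. That suggests both percentages were chosen to equalize the two HER2 classes, although the abstract does not say so explicitly.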