Learning Hidden Patterns from Patient Multivariate Time Series Data Using Convolutional Neural Networks: A Case Study of Healthcare Cost Prediction
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Objective: To develop an effective and scalable individual-level patient cost
prediction method by automatically learning hidden temporal patterns from
multivariate time series data in patient insurance claims using a convolutional
neural network (CNN) architecture.
Methods: We used three years of medical and pharmacy claims data from 2013 to
2016 from a healthcare insurer, where data from the first two years were used
to build the model to predict costs in the third year. The data consisted of
the multivariate time series of cost, visit and medical features that were
shaped as images of patients' health status (i.e., matrices with time windows
on one dimension and the medical, visit and cost features on the other
dimension). Patients' multivariate time series images were given to a CNN
method with a proposed architecture. After hyper-parameter tuning, the proposed
architecture consisted of three building blocks of convolution and pooling
layers with an LReLU activation function and a customized kernel size at each
layer for healthcare data. The temporal patterns learned by the proposed CNN
became inputs to a fully connected layer.
Conclusions: Feature learning through the proposed CNN configuration
significantly improved individual-level healthcare cost prediction. The
proposed CNN was able to outperform temporal pattern detection methods that
look for a pre-defined set of pattern shapes, since it is capable of extracting
a variable number of patterns with various shapes. Temporal patterns learned
from medical, visit and cost data made significant contributions to the
prediction performance. Hyper-parameter tuning showed that considering
three-month data patterns yields the highest prediction accuracy. Our results
showed that patients' images extracted from multivariate time series data are
different from regular images, and hence require unique designs of CNN
architectures. |
---|---|
DOI: | 10.48550/arxiv.2009.06783 |
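
The summary describes shaping each patient's claims into a matrix (features on one dimension, time windows on the other) and passing it through convolution and pooling blocks with an LReLU activation. The following is a minimal sketch of one such block in plain Python, not the authors' implementation. The function names, the feature names, the sample claims, the LReLU slope of 0.01, the pooling size, and the hand-picked kernel weights are all assumptions made for illustration; in a real CNN the kernel weights are learned, and the summary does not state the actual layer sizes. The kernel here spans three time windows, echoing the three-month pattern width mentioned in the summary.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]  # rows = features, columns = time windows


def build_patient_image(claims: Sequence[Tuple[int, str, float]],
                        features: Sequence[str],
                        n_windows: int) -> Matrix:
    """Aggregate (window_index, feature_name, value) claims into a matrix."""
    row = {name: i for i, name in enumerate(features)}
    image = [[0.0] * n_windows for _ in features]
    for window, name, value in claims:
        if name in row and 0 <= window < n_windows:
            image[row[name]][window] += value
    return image


def leaky_relu(x: float, slope: float = 0.01) -> float:
    """LReLU: pass positive values, scale the rest by a small slope (assumed 0.01)."""
    return x if x > 0 else slope * x


def conv_time(image: Matrix, kernel: Matrix, bias: float = 0.0) -> List[float]:
    """Slide a kernel covering all features along the time axis (no padding)."""
    width = len(kernel[0])
    n_windows = len(image[0])
    out = []
    for t in range(n_windows - width + 1):
        total = bias
        for f, kernel_row in enumerate(kernel):
            for k, weight in enumerate(kernel_row):
                total += weight * image[f][t + k]
        out.append(leaky_relu(total))
    return out


def max_pool(signal: Sequence[float], size: int = 2) -> List[float]:
    """Non-overlapping max pooling along time."""
    return [max(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]


# Hypothetical patient: six monthly windows, two features.
features = ["cost", "visits"]
claims = [(0, "cost", 100.0), (0, "visits", 1.0),
          (2, "cost", 50.0), (2, "cost", 25.0), (5, "visits", 2.0)]
image = build_patient_image(claims, features, n_windows=6)

# Hand-picked three-month kernel that responds to a change in cost.
kernel = [[-1.0, 0.0, 1.0],
          [0.0, 0.0, 0.0]]
activations = conv_time(image, kernel)
pooled = max_pool(activations, size=2)
```

In the architecture the summary outlines, several such blocks would be stacked and the resulting feature maps flattened into a fully connected layer; a deep learning framework would normally replace these hand-written loops.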