Extracting Social Determinants of Health from Pediatric Patient Notes Using Large Language Models: Novel Corpus and Methods
Main authors: | , , , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Social determinants of health (SDoH) play a critical role in shaping health
outcomes, particularly in pediatric populations where interventions can have
long-term implications. SDoH are frequently studied in the Electronic Health
Record (EHR), which provides a rich repository for diverse patient data. In
this work, we present a novel annotated corpus, the Pediatric Social History
Annotation Corpus (PedSHAC), and evaluate the automatic extraction of detailed
SDoH representations using fine-tuned and in-context learning methods with
Large Language Models (LLMs). PedSHAC comprises annotated social history
sections from 1,260 clinical notes obtained from pediatric patients within the
University of Washington (UW) hospital system. Employing an event-based
annotation scheme, PedSHAC captures ten distinct health determinants to
encompass living and economic stability, prior trauma, education access,
substance use history, and mental health with an overall annotator agreement of
81.9 F1. Our proposed fine-tuned LLM-based extractors achieve high performance
at 78.4 F1 for event arguments. In-context learning approaches with GPT-4
demonstrate promise for reliable SDoH extraction with limited annotated
examples, with extraction performance at 82.3 F1 for event triggers. |
---|---|
DOI: | 10.48550/arxiv.2404.00826 |