Large Language Model Distilling Medication Recommendation Model
The recommendation of medication is a vital aspect of intelligent healthcare systems, as it involves prescribing the most suitable drugs based on a patient's specific health needs. Unfortunately, many sophisticated models currently in use tend to overlook the nuanced semantics of medical data,...
Gespeichert in:
Hauptverfasser: | , , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The recommendation of medication is a vital aspect of intelligent healthcare
systems, as it involves prescribing the most suitable drugs based on a
patient's specific health needs. Unfortunately, many sophisticated models
currently in use tend to overlook the nuanced semantics of medical data, while
only relying heavily on identities. Furthermore, these models face significant
challenges in handling cases involving patients who are visiting the hospital
for the first time, as they lack prior prescription histories to draw upon. To
tackle these issues, we harness the powerful semantic comprehension and
input-agnostic characteristics of Large Language Models (LLMs). Our research
aims to transform existing medication recommendation methodologies using LLMs.
In this paper, we introduce a novel approach called Large Language Model
Distilling Medication Recommendation (LEADER). We begin by creating appropriate
prompt templates that enable LLMs to suggest medications effectively. However,
the straightforward integration of LLMs into recommender systems leads to an
out-of-corpus issue specific to drugs. We handle it by adapting the LLMs with a
novel output layer and a refined tuning loss function. Although LLM-based
models exhibit remarkable capabilities, they are plagued by high computational
costs during inference, which is impractical for the healthcare sector. To
mitigate this, we have developed a feature-level knowledge distillation
technique, which transfers the LLM's proficiency to a more compact model.
Extensive experiments conducted on two real-world datasets, MIMIC-III and
MIMIC-IV, demonstrate that our proposed model not only delivers effective
results but also is efficient. To ease the reproducibility of our experiments,
we release the implementation code online. |
---|---|
DOI: | 10.48550/arxiv.2402.02803 |