On The Persona-based Summarization of Domain-Specific Documents
Main Authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | ACL 2024 Findings (Association for Computational Linguistics). In an ever-expanding world of domain-specific knowledge, the increasing
complexity of consuming and storing information necessitates the generation of
summaries from large information repositories. However, each persona in a
domain has different information needs and therefore requires a different summary.
For example, in the healthcare domain, a persona-based approach (for personas
such as Doctor, Nurse, and Patient) is imperative to deliver targeted medical information
efficiently. Persona-based summarization of domain-specific information by
humans is a high-cognitive-load task and is generally not preferred. Summaries
written by two different humans vary widely, and manual summarization does not
scale in cost or subject matter expertise as domains and personas grow.
Further, summaries generated by generic Large Language Models (LLMs) may
not be sufficiently accurate in a given domain unless the models
have been trained on domain-specific data, and such models can also be very
expensive to use in day-to-day operations. Our contribution in this paper is
two-fold: 1) We present an approach to efficiently fine-tune a domain-specific
small foundation LLM using a healthcare corpus and also show that we can
effectively evaluate the summarization quality using AI-based critiquing. 2) We
further show that AI-based critiquing has good concordance with human-based
critiquing of the summaries. Hence, such AI-based pipelines for generating
domain-specific persona-based summaries can be easily scaled to other domains,
such as legal, enterprise documents, and education, in a very efficient and
cost-effective manner. |
DOI: | 10.48550/arxiv.2406.03986 |