Enhancing Presentation Slide Generation by LLMs with a Multi-Staged End-to-End Approach
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | Generating presentation slides from a long document with multimodal elements
such as text and images is an important task. Done manually, it is time-consuming
and requires domain expertise. Existing approaches for generating a rich
presentation from a document are often semi-automatic or only put a flat
summary into the slides, ignoring the importance of a good narrative. In this
paper, we address this research gap by proposing a multi-staged end-to-end
model that combines an LLM and a VLM. We show experimentally
that, compared to applying LLMs directly with state-of-the-art prompting, our
proposed multi-staged solution performs better on automated metrics and in
human evaluation. |
---|---|
DOI: | 10.48550/arxiv.2406.06556 |