Magic Markup: Maintaining Document-External Markup with an LLM
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Summary: | Text documents, including programs, typically have human-readable semantic
structure. Historically, programmatic access to these semantics has required
explicit in-document tagging. Especially in systems where the text has an
execution semantics, this means it is an opt-in feature that is hard to support
properly. Today, language models offer a new method: metadata can be bound to
entities in changing text using a model's human-like understanding of
semantics, with no requirements on the document structure. This method expands
the applications of document annotation, a fundamental operation in program
writing, debugging, maintenance, and presentation. We contribute a system that
employs an intelligent agent to re-tag modified programs, enabling rich
annotations to automatically follow code as it evolves. We also contribute a
formal problem definition, an empirical synthetic benchmark suite, and our
benchmark generator. Our system achieves an accuracy of 90% on our benchmarks
and can replace a document's tags in parallel at a rate of 5 seconds per tag.
While there remains significant room for improvement, we find performance
reliable enough to justify further exploration of applications. |
---|---|
DOI: | 10.48550/arxiv.2403.03481 |