A Cross-Lingual Statutory Article Retrieval Dataset for Taiwan Legal Studies
Main authors: | , , , |
---|---|
Format: | Article |
Language: | eng |
Abstract: | This paper introduces a cross-lingual statutory article retrieval (SAR)
dataset designed to enhance legal information retrieval in multilingual
settings. Our dataset features spoken-language-style legal inquiries in
English, paired with corresponding Chinese versions and relevant statutes,
covering all Taiwanese civil, criminal, and administrative laws. This dataset
aims to improve access to legal information for non-native speakers,
particularly for foreign nationals in Taiwan. We propose several LLM-based
methods as baselines for evaluating retrieval effectiveness, focusing on
mitigating translation errors and improving cross-lingual retrieval
performance. Our work provides a valuable resource for developing inclusive
legal information retrieval systems. |
---|---|
DOI: | 10.48550/arxiv.2410.11450 |