CAILIE 1.0: A dataset for Challenge of AI in Law - Information Extraction V1.0

Bibliographic details
Published in: AI Open 2022, Vol. 3, pp. 208-212
Main authors: Cao, Yu; Sun, Yuanyuan; Xu, Ce; Li, Chunnan; Du, Jinming; Lin, Hongfei
Format: Article
Language: English
Description
Abstract: Legal information extraction requires identifying and classifying legal elements from specific legal documents. Considering that information extraction is mainly regarded as the first step in natural language understanding, the quality of legal information extraction results certainly has an immense impact on the performance of various legal artificial intelligence (AI) downstream tasks. However, Chinese judicial information extraction datasets are very scarce due to the particularity of legal documents. In response to this situation, we constructed a dataset for Challenge of AI in Law - Information Extraction V1.0 (CAILIE 1.0). The following two features of CAILIE are worth highlighting: 1) the entity definition focuses on more fine-grained theft document information, providing more interpretability for downstream legal AI; and 2) we define entity labels with judicial attributes based on natural attribute labels to meet the needs of Chinese judicial practice. We implement some classic models on this dataset. The experimental results show that legal information extraction is still challenging and additional research is required for this task to be solved.
ISSN: 2666-6510
DOI: 10.1016/j.aiopen.2022.12.002