English/Turkish Wikipedia Named-Entity Recognition and Text Categorization Dataset

TWNERTC and EWNERTC are collections of automatically categorized and annotated sentences obtained from Turkish and English Wikipedia for named-entity recognition and text categorization. First, we construct large-scale gazetteers by using a graph crawler algorithm to extract relevant entity and do...

Bibliographic details
Main author: Sahin, H. Bahadir
Format: Dataset
Language: English