A hybrid approach to the small unannotated corpus-based language comparison and its application to the Old East Slavic charters - Supplementary material 3 (Modern standard Slavic lects)

Bibliographic details
Main author: Afanasev, Ilia
Format: Dataset
Language: hrv
Description
Summary: Modern standard Slavic lects (Croatian, Slovak, Slovenian)

General description
The dataset consists of texts written in three modern standard Slavic lects: Croatian, Slovak, and Slovenian. The texts are parallel, to compensate for possible genre influences: each is the Gospel of John in the given language.

Sources
The Croatian text is from Ivan Šarić's translation of the New Testament. The Slovenian text is from the standard Slovenian translation of the New Testament. The Slovak text is from the modern Catholic translation of the New Testament. The data statement is available among the downloadable files.

How-to
This section contains tutorials for using the data with the intended pipelines.

Corpus-based distance measurement package
The source code for the package is available in its repository; the manual is in the repository's README. To use this dataset to measure the distances between the Slovak, Croatian, and Slovenian lects and then cluster them, complete the following steps:
1. Download the Jupyter notebook that streamlines the package use.
2. Download the dataset.
3. Put the dataset into a folder of your choice on your computer (make sure the folder contains no other files).
4. Insert the path to that directory into the CONTENT_DIR variable in the Jupyter notebook.
5. Run the notebook, adjusting the parameters if necessary.
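The folder requirement in the steps above (no other files next to the dataset) can be checked before the notebook is run. The following is a minimal sketch and not part of the package: the helper name `check_content_dir` and the assumption that the dataset files carry a `.txt` suffix are hypothetical, and only the `CONTENT_DIR` variable name comes from the description.

```python
from pathlib import Path


def check_content_dir(path, allowed_suffixes=(".txt",)):
    """Return the sorted entries of `path`, or raise if anything unexpected is there.

    Hypothetical helper: the allowed suffixes are an assumption about the
    dataset files, not something the dataset description states.
    """
    content_dir = Path(path).expanduser()
    if not content_dir.is_dir():
        raise NotADirectoryError(f"{content_dir} is not a directory")

    entries = sorted(content_dir.iterdir())
    # Subfolders, hidden files, and files with other suffixes all count as stray.
    stray = [
        p.name
        for p in entries
        if p.is_dir() or p.suffix.lower() not in allowed_suffixes
    ]
    if stray:
        raise ValueError(f"Unexpected entries in {content_dir}: {stray}")
    return entries
```

In the notebook, this could be called as `check_content_dir(CONTENT_DIR)` right after `CONTENT_DIR` is set, so that a stray file is reported before any distance is computed.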
DOI: 10.5281/zenodo.14148561