Opening a Free Path to Analyze the Discourse Shift in the Soviet Belarusian Newspaper Zviazda after the Molotov-Ribbentrop Pact
Published in: | Journal of Open Humanities Data 2023-11, Vol.9 (3), p.23-23 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | This paper attempts to develop a pipeline designed to convert graphical PDF files of the newspaper Zviazda into usable text data in the Belarusian language with search and visualization options. Apart from ad hoc conversion scripts for moving between formats, the pipeline relies on freely available resources in order to process this relatively under-resourced language (at least as far as freely available resources are concerned). The pipeline was designed to include a graph database and to be compatible with data visualization tools. The ultimate goal is to develop a resource for analyzing political discourse in the Soviet Belarusian press during the Second World War. To validate the pipeline, a pilot study was carried out: it aims to visualize some simple manifestations of the Soviet rhetorical shift about Nazi Germany after the signing of the Molotov-Ribbentrop Pact, in order to show that useful phenomena can be revealed even with quite noisy data. Keywords: NLP, Belarusian language, Graph databases, Discourse, Soviet press |
ISSN: | 2059-481X |
DOI: | 10.5334/johd.133 |