NLP and machine learning to measure peace from news media


Bibliographic details
Main authors: Liebovitch, Larry; Powers, William; Shi, Lin; Chen-Carrel, Allegra; Loustaunau, Philippe; Coleman, Peter
Format: Dataset
Language: English
Description
Summary: “Hate speech” can mobilize violence and destruction. What are the characteristics of “peace speech” that reflect and support the social processes that maintain peace? In this study we used a data-driven, machine learning approach to identify the words most associated with lower-peace versus higher-peace countries. Logistic regression and random forest classifiers were trained using five respected, traditional peace indices: Global Peace Index, Positive Peace Index, World Happiness Index, Fragile States Index, and Human Development Index. The input features of the machine learning model were the word frequencies from the news media in each country, and the output classification was the level of peace in that country. The model correctly classified the level of peace in a country from its news media (accuracy and F1 score both between 96% and 100%). We also used the trained machine learning model to create a machine learning peace index that measured the level of peace in countries, including countries not in the training set; this index correlated with the average of the five traditional peace indices (r-squared = 0.8349). Using the random forest feature importance method, we found that news media in lower-peace countries were characterized by words related to government, order, control and fear (such as government, state, law, security and court), while news media in higher-peace countries were characterized by an increased prevalence of words related to optimism for the future and fun (such as time, like, home, believe and game).
DOI: 10.5061/dryad.2v6wwpzv6
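
The classification step described in the summary (word frequencies in, peace level out) can be illustrated with a minimal sketch. This is not the authors' code. It assumes a vocabulary restricted to the ten example words quoted in the summary, uses four invented toy sentences in place of real news text, and fits a small logistic regression by hand with gradient descent so that it runs on the Python standard library alone. The random forest classifier and its feature importances are not reproduced here, and treating the predicted probability as a peace score is an assumption made only for illustration. All function and variable names are hypothetical.

```python
import math
from collections import Counter

# Example words quoted in the summary for each group of countries.
LOWER_PEACE_WORDS = ["government", "state", "law", "security", "court"]
HIGHER_PEACE_WORDS = ["time", "like", "home", "believe", "game"]
VOCABULARY = LOWER_PEACE_WORDS + HIGHER_PEACE_WORDS  # assumption: tiny vocabulary


def word_frequencies(text):
    """Return the relative frequency of each vocabulary word in `text`."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[word] / total for word in VOCABULARY]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_higher_peace(features, weights, bias):
    """Estimated probability that the features belong to the higher-peace class."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_logistic_regression(X, y, learning_rate=1.0, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(X, y):
            error = predict_higher_peace(features, weights, bias) - label
            for j, value in enumerate(features):
                grad_w[j] += error * value
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Invented toy texts (hypothetical, NOT taken from the dataset).
# Label 1 = higher-peace, label 0 = lower-peace.
TRAINING_TEXTS = [
    ("the government said the state court upheld the security law", 0),
    ("security forces and the government enforce the law", 0),
    ("we believe it is time to watch the game at home", 1),
    ("people like to spend time at home and believe in the future", 1),
]

X = [word_frequencies(text) for text, _ in TRAINING_TEXTS]
y = [label for _, label in TRAINING_TEXTS]
weights, bias = train_logistic_regression(X, y)

# Score a text that was not in the training set.
new_text = "the court and the state debate a security law"
score = predict_higher_peace(word_frequencies(new_text), weights, bias)
print(f"estimated higher-peace probability: {score:.2f}")

# The sign of each learned weight shows which class a word pulls toward.
for word, weight in zip(VOCABULARY, weights):
    print(f"{word:>10s}  {weight:+.2f}")
```

In this toy setting, words that appear only in the lower-peace texts receive negative weights and words that appear only in the higher-peace texts receive positive weights. That loosely mirrors the word-level contrast the summary reports, although the study derived its word rankings from random forest feature importance rather than from regression coefficients.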