RweetMiner: Automatic identification and categorization of help requests on twitter during disasters


Full description

Bibliographic details
Published in: Expert Systems with Applications, 2021-08, Vol. 176, p. 114787, Article 114787
Main authors: Ullah, Irfan; Khan, Sharifullah; Imran, Muhammad; Lee, Young-Koo
Format: Article
Language: English
Online access: Full text
Description
Summary:
• Redefining requests under the term “rweet” in the context of social networking sites, and defining its primary types and subtypes.
• Proposing an optimized and effective preprocessing strategy.
• Generating n-grams (bag of words) with n = 1, 2, and 3, and combining them with each other and with rule-based features to learn the subtle differences between request and non-request tweets, as well as among six different types of request tweets.
• Storing intermediate data to speed up the machine learning development life cycle.
• Improving performance on request identification and request categorization on Twitter.

Catastrophic events create uncertain situations for humanitarian organizations that locate and provide aid to affected people. Many people turn to social media during disasters to request help and/or provide relief to others. However, the majority of social media posts seeking help cannot be detected properly and remain concealed, because they are often noisy and ill-formed. Existing systems lack an effective strategy for tweet preprocessing and for grasping the context of tweets. This research first formally defines request tweets in the context of social networking sites, hereafter rweets, along with their primary types and sub-types. Our main contributions are the identification and categorization of rweets. For rweet identification, we employ two approaches, a rule-based one and logistic regression, and show that both achieve high precision and F1 scores. Classifying rweets into sub-types such as medical, food, and shelter using logistic regression shows promising results and outperforms existing work. Finally, we introduce an architecture that stores intermediate data to accelerate the development of machine learning classifiers.
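The feature-generation step named in the summary (n-grams with n = 1, 2, and 3, combined with rule-based features) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The tokenizer, the regular expressions in `REQUEST_PATTERNS`, and the function names are all assumptions introduced for illustration, since this record does not give the paper's preprocessing steps or rules. The logistic regression stage is left out; the resulting feature counts would be the input to such a classifier.

```python
import re
from collections import Counter

# Hypothetical request cues, chosen only for illustration.
# The paper's actual rules are not given in this record.
REQUEST_PATTERNS = [
    re.compile(r"\b(need|needs|needed)\b"),
    re.compile(r"\bplease\s+(help|send|donate)\b"),
    re.compile(r"\b(looking|searching)\s+for\b"),
]


def tokenize(tweet):
    """Lowercase the tweet, drop URLs and @mentions, and split into word tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return re.findall(r"[a-z0-9']+", text)


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(tweet):
    """Bag of 1-, 2-, and 3-grams plus one binary flag per matching rule."""
    tokens = tokenize(tweet)
    features = Counter()
    for n in (1, 2, 3):
        features.update(ngrams(tokens, n))
    cleaned = " ".join(tokens)
    for idx, pattern in enumerate(REQUEST_PATTERNS):
        if pattern.search(cleaned):
            features[f"__rule_{idx}__"] = 1
    return features


def looks_like_request(tweet):
    """Rule-only baseline: True if any hypothetical request pattern fires."""
    return any(key.startswith("__rule_") for key in extract_features(tweet))


if __name__ == "__main__":
    example = "We need water http://t.co/x @redcross"
    print(sorted(extract_features(example)))
    print(looks_like_request(example))
```

Keeping the rule flags in the same feature dictionary as the n-grams lets a downstream classifier weigh both kinds of evidence together, which is one plausible reading of "combining them with each other and rule-based features".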
ISSN: 0957-4174, 1873-6793
DOI: 10.1016/j.eswa.2021.114787