By the numbers: The magic of numerical intelligence in text analytic systems

Bibliographic details
Published in: Decision Support Systems 2018-09, Vol. 113, pp. 86-98
Main authors: Gruss, Richard; Abrahams, Alan S.; Fan, Weiguo; Wang, G. Alan
Format: Article
Language: English
Description
Abstract: There is a growing recognition among MIS researchers and practitioners that social media provide a valuable source of business intelligence. Unearthing relevant and useful information among the voluminous postings remains a challenge, however. Automated methods based on text mining have made significant progress in recent years by discovering a variety of new methods and features. This study adds to this stream by introducing a novel text mining procedure centered around numerical expressions contained in text documents. In this method, numerical expressions are extracted, categorized, and binned, and their presence and magnitude are stored as document features. We demonstrate, using a case study from the automotive industry, that numerical expressions can be reliably identified, and that these numerical features enable improvements in document classification. As an extension to this case study, we contribute a decision support system for managing product quality using both textual and numerical attributes.
• Numerical tokens are often underutilized in text analytic systems.
• We propose a procedure for finding and classifying numerical tokens, using postings in an online forum as a case study.
• We demonstrate that the numbers can be reliably classified and discretized using supervised machine learning.
• We further show that the number features can enhance product defect discovery.
• A Post Market Quality Surveillance decision support system that leverages numerical tokens is designed and recommended.
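The extract, categorize, and bin steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract describes supervised machine learning for classifying numbers, whereas this sketch swaps in a simple unit lookup. The category names, bin edges, and the function `extract_numeric_features` are all hypothetical and chosen only for illustration.

```python
import bisect
import re
from collections import Counter

# Hypothetical unit-to-category lookup (assumption; the study uses a
# supervised classifier rather than a fixed table).
UNIT_CATEGORIES = {
    "miles": "distance", "mi": "distance", "km": "distance",
    "mph": "speed",
    "psi": "pressure",
}

# Illustrative bin edges per category (assumed, not taken from the study).
BIN_EDGES = {
    "distance": [1_000, 50_000, 100_000],
    "speed": [30, 60, 90],
    "pressure": [20, 35, 50],
}
DEFAULT_EDGES = [10, 100, 1_000]

# A number (optionally with thousands separators or decimals),
# followed by an optional alphabetic token treated as its unit.
NUMBER_WITH_UNIT = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*([A-Za-z]+)?")


def extract_numeric_features(text):
    """Return presence and binned-magnitude features for numbers in text."""
    features = Counter()
    for match in NUMBER_WITH_UNIT.finditer(text):
        value = float(match.group(1).replace(",", ""))
        unit = (match.group(2) or "").lower()
        category = UNIT_CATEGORIES.get(unit, "other")
        edges = BIN_EDGES.get(category, DEFAULT_EDGES)
        bin_index = bisect.bisect_right(edges, value)
        features[f"has_{category}"] = 1               # presence feature
        features[f"{category}_bin{bin_index}"] += 1   # magnitude feature
    return dict(features)


post = "At 85,000 miles the transmission slipped at 70 mph."
print(extract_numeric_features(post))
```

The resulting dictionary could be appended to a bag-of-words vector before training a document classifier, which is the general role the abstract assigns to numerical features.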
ISSN: 0167-9236, 1873-5797
DOI: 10.1016/j.dss.2018.07.004