Assessing the reliability of automatic sentiment analysis tools on rating the sentiment of reviews of NHS dental practices in England

Online reviews may act as a rich source of data to assess the quality of dental practices. Assessing the content and sentiment of reviews on a large scale is time consuming and expensive. Automation of the process of assigning sentiment to big data samples of reviews may allow for reviews to be used...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:PloS one 2021-12, Vol.16 (12), p.e0259797-e0259797
Hauptverfasser: Byrne, Matthew, O'Malley, Lucy, Glenny, Anne-Marie, Pretty, Iain, Tickle, Martin
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Online reviews may act as a rich source of data to assess the quality of dental practices. Assessing the content and sentiment of reviews on a large scale is time consuming and expensive. Automation of the process of assigning sentiment to big data samples of reviews may allow for reviews to be used as Patient Reported Experience Measures for primary care dentistry. To assess the reliability of three different online sentiment analysis tools (Amazon Comprehend DetectSentiment API (ACDAPI), Google and Monkeylearn) at assessing the sentiment of reviews of dental practices working on National Health Service contracts in the United Kingdom. A Python 3 script was used to mine 15800 reviews from 4803 unique dental practices on the NHS.uk websites between April 2018 -March 2019. A random sample of 270 reviews were rated by the three sentiment analysis tools. These reviews were rated by 3 blinded independent human reviewers and a pooled sentiment score was assigned. Kappa statistics and polychoric evalutaiton were used to assess the level of agreement. Disagreements between the automated and human reviewers were qualitatively assessed. There was good agreement between the sentiment assigned to reviews by the human reviews and ACDAPI (k = 0.660). The Google (k = 0.706) and Monkeylearn (k = 0.728) showed slightly better agreement at the expense of usability on a massive dataset. There were 33 disagreements in rating between ACDAPI and human reviewers, of which n = 16 were due to syntax errors, n = 10 were due to misappropriation of the strength of conflicting emotions and n = 7 were due to a lack of overtly emotive language in the text. There is good agreement between the sentiment of an online review assigned by a group of humans and by cloud-based sentiment analysis. This may allow the use of automated sentiment analysis for quality assessment of dental service provision in the NHS.
ISSN:1932-6203
1932-6203
DOI:10.1371/journal.pone.0259797