Assessing the reliability of automatic sentiment analysis tools on rating the sentiment of reviews of NHS dental practices in England
Online reviews may act as a rich source of data to assess the quality of dental practices. Assessing the content and sentiment of reviews on a large scale is time consuming and expensive. Automation of the process of assigning sentiment to big data samples of reviews may allow for reviews to be used...
Gespeichert in:
Veröffentlicht in: | PloS one 2021-12, Vol.16 (12), p.e0259797-e0259797 |
---|---|
Hauptverfasser: | , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Online reviews may act as a rich source of data to assess the quality of dental practices. Assessing the content and sentiment of reviews on a large scale is time consuming and expensive. Automation of the process of assigning sentiment to big data samples of reviews may allow for reviews to be used as Patient Reported Experience Measures for primary care dentistry.
To assess the reliability of three different online sentiment analysis tools (Amazon Comprehend DetectSentiment API (ACDAPI), Google and Monkeylearn) at assessing the sentiment of reviews of dental practices working on National Health Service contracts in the United Kingdom.
A Python 3 script was used to mine 15800 reviews from 4803 unique dental practices on the NHS.uk websites between April 2018 -March 2019. A random sample of 270 reviews were rated by the three sentiment analysis tools. These reviews were rated by 3 blinded independent human reviewers and a pooled sentiment score was assigned. Kappa statistics and polychoric evalutaiton were used to assess the level of agreement. Disagreements between the automated and human reviewers were qualitatively assessed.
There was good agreement between the sentiment assigned to reviews by the human reviews and ACDAPI (k = 0.660). The Google (k = 0.706) and Monkeylearn (k = 0.728) showed slightly better agreement at the expense of usability on a massive dataset. There were 33 disagreements in rating between ACDAPI and human reviewers, of which n = 16 were due to syntax errors, n = 10 were due to misappropriation of the strength of conflicting emotions and n = 7 were due to a lack of overtly emotive language in the text.
There is good agreement between the sentiment of an online review assigned by a group of humans and by cloud-based sentiment analysis. This may allow the use of automated sentiment analysis for quality assessment of dental service provision in the NHS. |
---|---|
ISSN: | 1932-6203 1932-6203 |
DOI: | 10.1371/journal.pone.0259797 |