Optimizing Cross Domain Sentiment Analysis Using Hidden Markov Continual Progression

Full description

Bibliographic details
Published in:	Wangji Wanglu Jishu Xuekan = Journal of Internet Technology 2019-01, Vol.20 (3), p.781-788
Main authors:	Manivannan, P, Selvi, C S Kanimozhi
Format:	Article
Language:	chi ; eng
Subjects:	Classification; Data mining; Digital media; Error reduction; Feature extraction; Markov processes; Optimization; Sentiment analysis; Tags; Transition probabilities; Trigonometric functions
Online access:	Full text

Description
Summary:	The rapid increase in internet users and the major role that customer reviews play in social media have given rise to sentiment analysis. Pre-processing the input text during sentiment analysis eliminates incomplete and noisy data. Typically, sentiment is manifested separately, and a pre-processing model for optimizing cross-domain sentiment classification is highly desirable. In this paper, a method called Hidden Markov Continual Progression Cosine Similar (HM-CPCS) is proposed to explore the impact of pre-processing and to optimize sentiment analysis. First, the subsequent and antecedent probabilities of tags are measured with the HM-POS Tagger for the given input dataset. These probabilities are obtained by measuring the transition probabilities between states and observations, which ensures feature extraction accuracy. Next, the Continual Progression Stemmer continuously stems the text by adding prefixes and suffixes to form structured words for the given shortcuts, and thereby reduces the Error Rate Relative to Truncation (ERRT). Finally, a Cosine Similarity function is applied to remove stop words for cross-domain sentiment analysis and classification. Experimental analysis shows that the HM-CPCS method reduces the time to extract opinions from reviewers by 46% and improves accuracy by 9% compared with state-of-the-art works.
ISSN:	1607-9264 2079-4029
DOI:	10.3966/160792642019052003011
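
Illustration:	Two building blocks named in the summary, transition probabilities between tags and cosine similarity, can be shown in a minimal Python sketch. This is not the authors' HM-CPCS implementation. The function names, the maximum-likelihood bigram estimate, and the whitespace tokenization are assumptions chosen only for illustration, and the sketch compares two texts rather than reproducing the stop-word removal step described in the paper.

```python
import math
from collections import Counter, defaultdict


def transition_probabilities(tag_sequences):
    """Estimate P(next_tag | tag) by counting adjacent tag pairs.

    Hypothetical helper: a plain maximum-likelihood estimate, assumed
    here for illustration; the paper's HM-POS Tagger may differ.
    """
    counts = defaultdict(Counter)
    for tags in tag_sequences:
        for prev_tag, next_tag in zip(tags, tags[1:]):
            counts[prev_tag][next_tag] += 1
    return {
        prev_tag: {
            next_tag: n / sum(followers.values())
            for next_tag, n in followers.items()
        }
        for prev_tag, followers in counts.items()
    }


def cosine_similarity(text_a, text_b):
    """Cosine similarity of two texts as bag-of-words count vectors.

    Tokenization is lowercase whitespace splitting (an assumption).
    Returns 0.0 when either text has no tokens.
    """
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy tag sequences, invented for the example.
    tagged = [["DT", "NN", "VB"], ["DT", "JJ", "NN"]]
    print(transition_probabilities(tagged))
    print(cosine_similarity("the battery life is great", "great battery"))
```

Count-based vectors keep the sketch dependency-free. A weighted representation such as TF-IDF could be substituted without changing the similarity formula.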