A Hybrid CNN-LSTM Model for Improving Accuracy of Movie Reviews Sentiment Analysis

Nowadays, social media has become a tremendous source of acquiring user’s opinions. With the advancement of technology and sophistication of the internet, a huge amount of data is generated from various sources like social blogs, websites, etc. In recent times, the blogs and websites are the real-ti...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Multimedia tools and applications 2019-09, Vol.78 (18), p.26597-26613
Hauptverfasser: Rehman, Anwar Ur, Malik, Ahmad Kamran, Raza, Basit, Ali, Waqar
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Nowadays, social media has become a tremendous source of acquiring user’s opinions. With the advancement of technology and sophistication of the internet, a huge amount of data is generated from various sources like social blogs, websites, etc. In recent times, the blogs and websites are the real-time means of gathering product reviews. However, excessive number of blogs on the cloud has enabled the generation of huge volume of information in different forms like attitudes, opinions, and reviews. Therefore, a dire need emerges to find a method to extract meaningful information from big data, classify it into different categories and predict end user’s behaviors or sentiments. Long Short-Term Memory (LSTM) model and Convolutional Neural Network (CNN) model have been applied to different Natural Language Processing (NLP) tasks with remarkable and effective results. The CNN model efficiently extracts higher level features using convolutional layers and max-pooling layers. The LSTM model is capable to capture long-term dependencies between word sequences. In this study, we propose a hybrid model using LSTM and very deep CNN model named as Hybrid CNN-LSTM Model to overcome the sentiment analysis problem. First, we use Word to Vector (Word2Vc) approach to train initial word embeddings. The Word2Vc translates the text strings into a vector of numeric values, computes distance between words, and makes groups of similar words based on their meanings. Afterword embedding is performed in which the proposed model combines set of features that are extracted by convolution and global max-pooling layers with long term dependencies. The proposed model also uses dropout technology, normalization and a rectified linear unit for accuracy improvement. Our results show that the proposed Hybrid CNN-LSTM Model outperforms traditional deep learning and machine learning techniques in terms of precision, recall, f-measure, and accuracy. Our approach achieved competitive results using state-of-the-art techniques on the IMDB movie review dataset and Amazon movie reviews dataset.
ISSN:1380-7501
1573-7721
DOI:10.1007/s11042-019-07788-7