LLM GPT-3.5 study for sentiment analysis across Utkarsh server, Ohio supercomputer, Google Colab and PC
Published in: | Results in engineering 2024-12, Vol.24, p.103218, Article 103218 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | • Study of sentiment analysis models on Twitter using GPT-3.5.
• Implemented BiLSTM, CNN, GRU and RNN models on various platforms.
• Assessed models using precision, recall, F1-score and resource metrics.
• Optimized models for cost-effective performance in research environments with available resources.
The main objective of this research is to examine sentiment analysis models trained on a Twitter corpus using the gpt-3.5-turbo-16k version of the Generative Pretrained Transformer (GPT-3.5) Large Language Model (LLM). The trained models include the Bidirectional Long Short-Term Memory neural network (BiLSTM), Convolutional Neural Network (CNN), Gated Recurrent Unit (GRU) and Recurrent Neural Network (RNN), which were run on the Ohio Supercomputer and the Utkarsh Server and compared with runs on Google Colab and a Personal Computer (PC). The research also examines the performance and computational aspects of these models in terms of accuracy, recall, F1-score, time/memory complexity and resource requirements (CPU/GPU) throughout the training and testing phases of each model. Training accuracies ranged from 49.91 % to 99.98 %, while testing accuracies ranged from about 50.00 % to 75.00 %. Models such as BiLSTM and RNN usually exhibit higher time complexity because of their sequential computation; CNNs, by contrast, are less time-consuming and more storage-efficient owing to their layered architecture. The use of supercomputers and specialized servers reduces training time, but resource constraints on platforms such as personal computers or Colab cause considerable divergence. |
---|---|
ISSN: | 2590-1230 |
DOI: | 10.1016/j.rineng.2024.103218 |
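
The abstract names accuracy, precision, recall, F1-score and time/memory cost as the evaluation criteria. As a rough illustration of how such figures can be computed for a binary sentiment classifier, here is a minimal sketch in plain Python. It is not the authors' code: the function names `binary_metrics` and `profile` and the toy labels are assumptions made for this example, and `tracemalloc` only tracks Python heap allocations, not GPU memory.

```python
import time
import tracemalloc


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def profile(fn, *args, **kwargs):
    """Run fn and return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


# Toy labels, invented for illustration (1 = positive, 0 = negative sentiment).
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

metrics, elapsed, peak = profile(binary_metrics, y_true, y_pred)
print(metrics)
print(f"elapsed: {elapsed:.6f} s, peak traced memory: {peak} bytes")
```

In a real comparison across platforms, the same wrapper would be placed around each model's training and inference calls, and GPU usage would need a framework-specific tool.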