SRL-ACO: A text augmentation framework based on semantic role labeling and ant colony optimization

Bibliographic details
Published in: Journal of King Saud University. Computer and information sciences 2023-07, Vol.35 (7), p.101611, Article 101611
Main author: Onan, Aytuğ
Format: Article
Language: English
Description
Summary: Creating high-quality labeled data is crucial for training machine-learning models, but it is time-consuming and labor-intensive. Moreover, human annotators differ in competency, training, and experience, which can result in inconsistent labeling and arbitrary standards. To address these challenges, researchers have been exploring automated methods for enhancing training and testing datasets. This paper proposes SRL-ACO, a novel text augmentation framework that combines Semantic Role Labeling (SRL) and Ant Colony Optimization (ACO) to generate additional training data for natural language processing (NLP) models. The framework uses SRL to identify the semantic roles of words in a sentence and ACO to generate new sentences that preserve these roles. By generating additional data without manual annotation, SRL-ACO can improve the accuracy of NLP models. The paper presents experimental results on seven text classification datasets covering sentiment analysis, toxic text detection, and sarcasm identification. The results show that SRL-ACO improves classifier performance across these tasks and indicate that it has the potential to enhance both the quality and the quantity of training data for various NLP tasks.
ISSN: 1319-1578, 2213-1248
DOI: 10.1016/j.jksuci.2023.101611
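
Illustrative sketch (not taken from the paper): the summary describes the two stages only at a high level, so the following toy Python example shows one way the idea could be wired together. Everything in it is an assumption made for illustration: the role labels are hand-written stand-ins for the output of an SRL tagger, the `SYNONYMS` table is hypothetical, the predicate is simply frozen as a crude proxy for preserving the role structure, and the fitness function (fraction of changed tokens) is a placeholder for whatever objective the paper actually optimizes.

```python
import random

# Hand-written role labels standing in for the output of an SRL tagger (assumption).
TOKENS = ["The", "critic", "praised", "the", "film"]
ROLES = ["ARG0", "ARG0", "V", "ARG1", "ARG1"]

# Hypothetical substitution table; a real system would use a lexical resource.
SYNONYMS = {
    "critic": ["reviewer", "commentator"],
    "film": ["movie", "picture"],
    "praised": ["lauded"],
}


def candidates(token, role):
    """Options for one slot. The predicate (role "V") is kept fixed."""
    if role == "V":
        return [token]
    return [token] + SYNONYMS.get(token, [])


def fitness(sentence, original):
    """Toy objective: fraction of tokens that differ from the original."""
    changed = sum(a != b for a, b in zip(sentence, original))
    return changed / len(original)


def aco_augment(tokens, roles, n_ants=8, n_iters=20, rho=0.3, alpha=1.0, seed=0):
    """Search for a rewritten sentence with a basic ant colony loop."""
    rng = random.Random(seed)
    slots = [candidates(t, r) for t, r in zip(tokens, roles)]
    # One pheromone value per (slot, candidate) pair, initialised uniformly.
    pheromone = [[1.0] * len(opts) for opts in slots]
    best, best_score = list(tokens), 0.0
    for _ in range(n_iters):
        tours = []
        for _ in range(n_ants):
            # Each ant picks one candidate per slot, biased by pheromone.
            choice = [
                rng.choices(range(len(opts)), weights=[p ** alpha for p in ph])[0]
                for opts, ph in zip(slots, pheromone)
            ]
            sentence = [opts[i] for opts, i in zip(slots, choice)]
            tours.append((choice, sentence, fitness(sentence, tokens)))
        # Evaporate, then let every ant deposit pheromone in proportion to its score.
        pheromone = [[(1 - rho) * p for p in ph] for ph in pheromone]
        for choice, sentence, score in tours:
            for slot, i in enumerate(choice):
                pheromone[slot][i] += score
            if score > best_score:
                best, best_score = sentence, score
    return best, best_score


if __name__ == "__main__":
    sentence, score = aco_augment(TOKENS, ROLES)
    print(" ".join(sentence), score)
```

In this sketch the label of the original sentence would be reused for the generated one, which is how the augmented example could be added to a training set without further annotation. A realistic objective would also need to check fluency and meaning preservation, which the toy fitness function does not attempt.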