Fine-Tuned T5 Transformer with LSTM and Spider Monkey Optimizer for Redundancy Reduction in Automatic Question Generation
Published in: SN Computer Science, 2024-06, Vol. 5 (5), p. 475, Article 475
Main authors:
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: The significance of Automatic Question Generation (AQG) lies in its potential to support educators and streamline assessment processes. Notable improvements in AQG have come from the use of language models, ranging from LSTMs to Transformers. However, the probabilistic scoring technique employed for next-word generation in the target question still leaves room for improvement. In this regard, template-based methods offer potential for enhancement, although they may generate fewer or redundant questions because they rely on fixed templates. This research addresses this gap by proposing a hybrid model that combines the advantages of template-based and Transformer-based AQG approaches. The template-based LSTM approach is explored to learn adaptable question templates, while the Transformer model is explored to reduce redundancy in the auto-generated questions. The proposed work fine-tunes the pipelined T5 Transformer model using the Spider Monkey Optimizer over the LSTM-generated templates. The choice of the Spider Monkey Optimizer enhances the selection of the named entity in the question tail (tail entity) through dynamic sub-search-space division for efficient exploration and exploitation, and through self-organization based on local and global scoring. This ensures that the named entity in the question tail is non-redundant (diverse) and maintains both structural and contextual coherence with the auto-generated question. Experimental findings highlight improvements in how well the diversely selected named entities fit the generated questions, reflected in higher precision, recall, and F1-scores in the pipelining phase. Moreover, the study shows that the Spider Monkey Optimizer performs better at selecting tail entities, consistently outperforming other algorithms in F1-score and convergence time across all datasets, with its time complexity growing linearly with dataset size. The fine-tuned pipelined T5 model (the proposed model) exhibits improved ROUGE scores over baselines, with reduced computational overhead and shorter inference time in the generative phase across datasets, converging in linear time. (An illustrative sketch of the tail-entity selection step follows this record.)
ISSN: 2662-995X; 2661-8907
DOI: 10.1007/s42979-024-02826-0
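The abstract describes the tail-entity selection step only at a high level. The minimal Python sketch below illustrates the general idea of a Spider-Monkey-Optimizer-style search over candidate tail entities (grouped sub-search spaces with local and global leaders), followed by filling a question template. The fitness function, candidate list, template, and all identifiers are hypothetical placeholders for illustration, not the authors' implementation or data.

```python
# Illustrative sketch only: a simplified Spider-Monkey-Optimizer-style search
# over candidate tail entities, followed by filling a question template.
# All names, the fitness function, and the data below are hypothetical.
import random


def fitness(entity, already_used, context_terms):
    """Toy score: reward overlap with passage keywords, penalise reuse."""
    relevance = sum(term in entity.lower() for term in context_terms)
    redundancy = 1.0 if entity in already_used else 0.0
    return relevance - redundancy


def smo_select_tail_entity(candidates, already_used, context_terms,
                           n_groups=2, iterations=20, seed=0):
    """Pick a non-redundant tail entity via grouped local/global search."""
    rng = random.Random(seed)
    score = lambda e: fitness(e, already_used, context_terms)

    # Dynamic sub-search-space division: split candidates into groups.
    pool = candidates[:]
    rng.shuffle(pool)
    groups = [g for g in (pool[i::n_groups] for i in range(n_groups)) if g]

    global_best = max(pool, key=score)          # global leader
    for _ in range(iterations):
        for group in groups:
            local_best = max(group, key=score)  # local leader of this group
            # Exploration: occasionally exchange members between groups.
            if len(groups) > 1 and rng.random() < 0.3:
                other = rng.choice([h for h in groups if h is not group])
                i, j = rng.randrange(len(group)), rng.randrange(len(other))
                group[i], other[j] = other[j], group[i]
            # Exploitation: promote the best local leader to global leader.
            if score(local_best) > score(global_best):
                global_best = local_best
    return global_best


if __name__ == "__main__":
    candidates = ["the Eiffel Tower", "Gustave Eiffel", "Paris", "1889"]
    used = {"Paris"}                          # entities already asked about
    context = ["eiffel", "tower", "design"]   # keywords from the passage
    tail = smo_select_tail_entity(candidates, used, context)
    template = "Who designed {tail}?"         # stand-in for a learned template
    print(template.format(tail=tail))         # -> Who designed the Eiffel Tower?
```

In the paper's pipeline, the selected tail entity would be passed to the fine-tuned, pipelined T5 model rather than a fixed string template; the template fill above only stands in for that generative step.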