Transformer based Answer-Aware Bengali Question Generation
Saved in:
Published in: | International journal of cognitive computing in engineering 2023-06, Vol.4, p.314-326 |
Authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | • Explored answer-aware question generation (QG) for a low-resource language, Bengali. • Trained multilingual and monolingual T5 models to generate questions. • Compared the impact of various decoding algorithms when generating questions. • Conducted extensive error analysis and human evaluation of the generated questions. • Developed an online question generation user interface combining models and their hyperparameters. |
Question generation (QG), the task of generating questions from text or other forms of data, is a significant and challenging problem that has recently attracted growing attention in natural language processing (NLP) owing to its wide range of applications in business, healthcare, and education, such as creating quizzes, Frequently Asked Questions (FAQs), and documentation. Most QG research has been conducted in resource-rich languages such as English; because of the dearth of training data in low-resource languages such as Bengali, thorough research on Bengali question generation has yet to be conducted. In this article, we propose a system for producing varied and pertinent natural-language Bengali questions from context passages, using an answer-aware input format and a series of fine-tuned text-to-text transformer (T5) based models. In our studies with various transformer-based encoder-decoder models and decoding strategies, our fine-tuned BanglaT5 model achieved the highest RougeL F-score of 35.77 and a BLEU-1 score of 38.57 with beam search, while delivering 98% grammatically accurate questions. Our automated and human evaluation results show that our answer-aware QG models can create realistic, human-like questions relevant to the context passage and answer. We also release our code, generated questions, dataset, and models to enable broader question generation research for the Bengali-speaking community. |
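The answer-aware QG pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' released code: the input template (`answer: ... context: ...`), the `num_beams` setting, and the use of the public `csebuetnlp/banglat5` checkpoint via Hugging Face transformers are assumptions for demonstration; the paper's own fine-tuned models and input formatting may differ.

```python
def make_qg_input(context: str, answer: str) -> str:
    # Answer-aware format: the target answer is prepended to the passage
    # so the model knows which span the generated question should ask about.
    # The exact template string here is an assumption, not the paper's.
    return f"answer: {answer} context: {context}"

def generate_question(model, tokenizer, context: str, answer: str,
                      num_beams: int = 4, max_length: int = 64) -> str:
    # Beam search decoding, one of the decoding strategies the paper compares.
    inputs = tokenizer(make_qg_input(context, answer),
                       return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs,
                                num_beams=num_beams,
                                max_length=max_length,
                                early_stopping=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Hypothetical usage with the public base checkpoint; a QG fine-tune
    # (as trained in the paper) would be loaded in its place.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    tok = AutoTokenizer.from_pretrained("csebuetnlp/banglat5")
    mdl = AutoModelForSeq2SeqLM.from_pretrained("csebuetnlp/banglat5")
    print(generate_question(mdl, tok,
                            "ঢাকা বাংলাদেশের রাজধানী।", "ঢাকা"))
```

The heavy model loading is kept behind the `__main__` guard so that the formatting helper can be reused or tested without downloading a checkpoint.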
ISSN: | 2666-3074 |
DOI: | 10.1016/j.ijcce.2023.09.003 |