PersianRAG: A Retrieval-Augmented Generation System for Persian Language
Retrieval augmented generation (RAG) models, which integrate large-scale pre-trained generative models with external retrieval mechanisms, have shown significant success in various natural language processing (NLP) tasks. However, applying RAG models in Persian language as a low-resource language, p...
Gespeichert in:
Hauptverfasser: | , , , , , |
---|---|
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Retrieval augmented generation (RAG) models, which integrate large-scale
pre-trained generative models with external retrieval mechanisms, have shown
significant success in various natural language processing (NLP) tasks.
However, applying RAG models in Persian language as a low-resource language,
poses distinct challenges. These challenges primarily involve the
preprocessing, embedding, retrieval, prompt construction, language modeling,
and response evaluation of the system. In this paper, we address the challenges
towards implementing a real-world RAG system for Persian language called
PersianRAG. We propose novel solutions to overcome these obstacles and evaluate
our approach using several Persian benchmark datasets. Our experimental results
demonstrate the capability of the PersianRAG framework to enhance question
answering task in Persian. |
---|---|
DOI: | 10.48550/arxiv.2411.02832 |