Enhancing classroom teaching with LLMs and RAG

Large Language Models (LLMs) have become a valuable source of information for everyday inquiries. However, once a model is trained, its training data quickly becomes out of date, which makes retrieval-augmented generation (RAG) a useful tool for supplying more recent or pertinent data. In this work, we investigate how RAG pipelines, with the course...

Bibliographic details
Main authors: Mullins, Elizabeth A; Portillo, Adrian; Ruiz-Rohena, Kristalys; Piplai, Aritran
Format: Article
Language: eng
creator Mullins, Elizabeth A
Portillo, Adrian
Ruiz-Rohena, Kristalys
Piplai, Aritran
description Large Language Models (LLMs) have become a valuable source of information for everyday inquiries. However, once a model is trained, its training data quickly becomes out of date, which makes retrieval-augmented generation (RAG) a useful tool for supplying more recent or pertinent data. In this work, we investigate how RAG pipelines that use course materials as a data source might help students in K-12 education. The initial research uses Reddit as a data source for up-to-date cybersecurity information. Chunk size is varied to determine the amount of context needed to generate accurate answers. Answer correctness was evaluated for each chunk size using RAGAs, and the average answer correctness did not exceed 50 percent for any chunk size. This suggests that Reddit is not a good source to mine for answers to questions about cybersecurity threats. The methodology was successful in evaluating the data source, which has implications for its use in evaluating the effectiveness of educational resources.
doi_str_mv 10.48550/arxiv.2411.04341
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2411_04341</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2411_04341</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2411_043413</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DMwMTYx5GTQc83LSMxLzsxLV0jOSSwuLsrPz1UoSU1MzgAJlWeWZCj4-PgWKyTmpSgEObrzMLCmJeYUp_JCaW4GeTfXEGcPXbDJ8QVFmbmJRZXxIBviwTYYE1YBAJL4LyA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Enhancing classroom teaching with LLMs and RAG</title><source>arXiv.org</source><creator>Mullins, Elizabeth A ; Portillo, Adrian ; Ruiz-Rohena, Kristalys ; Piplai, Aritran</creator><creatorcontrib>Mullins, Elizabeth A ; Portillo, Adrian ; Ruiz-Rohena, Kristalys ; Piplai, Aritran</creatorcontrib><description>Large Language Models have become a valuable source of information for our daily inquiries. However, after training, its data source quickly becomes out-of-date, making RAG a useful tool for providing even more recent or pertinent data. In this work, we investigate how RAG pipelines, with the course materials serving as a data source, might help students in K-12 education. The initial research utilizes Reddit as a data source for up-to-date cybersecurity information. Chunk size is evaluated to determine the optimal amount of context needed to generate accurate answers. After running the experiment for different chunk sizes, answer correctness was evaluated using RAGAs with average answer correctness not exceeding 50 percent for any chunk size. This suggests that Reddit is not a good source to mine for data for questions about cybersecurity threats. 
The methodology was successful in evaluating the data source, which has implications for its use to evaluate educational resources for effectiveness.</description><identifier>DOI: 10.48550/arxiv.2411.04341</identifier><language>eng</language><subject>Computer Science - Learning</subject><creationdate>2024-11</creationdate><rights>http://creativecommons.org/licenses/by-nc-sa/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2411.04341$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2411.04341$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Mullins, Elizabeth A</creatorcontrib><creatorcontrib>Portillo, Adrian</creatorcontrib><creatorcontrib>Ruiz-Rohena, Kristalys</creatorcontrib><creatorcontrib>Piplai, Aritran</creatorcontrib><title>Enhancing classroom teaching with LLMs and RAG</title><description>Large Language Models have become a valuable source of information for our daily inquiries. However, after training, its data source quickly becomes out-of-date, making RAG a useful tool for providing even more recent or pertinent data. In this work, we investigate how RAG pipelines, with the course materials serving as a data source, might help students in K-12 education. The initial research utilizes Reddit as a data source for up-to-date cybersecurity information. Chunk size is evaluated to determine the optimal amount of context needed to generate accurate answers. After running the experiment for different chunk sizes, answer correctness was evaluated using RAGAs with average answer correctness not exceeding 50 percent for any chunk size. 
This suggests that Reddit is not a good source to mine for data for questions about cybersecurity threats. The methodology was successful in evaluating the data source, which has implications for its use to evaluate educational resources for effectiveness.</description><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjE01DMwMTYx5GTQc83LSMxLzsxLV0jOSSwuLsrPz1UoSU1MzgAJlWeWZCj4-PgWKyTmpSgEObrzMLCmJeYUp_JCaW4GeTfXEGcPXbDJ8QVFmbmJRZXxIBviwTYYE1YBAJL4LyA</recordid><startdate>20241106</startdate><enddate>20241106</enddate><creator>Mullins, Elizabeth A</creator><creator>Portillo, Adrian</creator><creator>Ruiz-Rohena, Kristalys</creator><creator>Piplai, Aritran</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20241106</creationdate><title>Enhancing classroom teaching with LLMs and RAG</title><author>Mullins, Elizabeth A ; Portillo, Adrian ; Ruiz-Rohena, Kristalys ; Piplai, Aritran</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2411_043413</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Mullins, Elizabeth A</creatorcontrib><creatorcontrib>Portillo, Adrian</creatorcontrib><creatorcontrib>Ruiz-Rohena, Kristalys</creatorcontrib><creatorcontrib>Piplai, Aritran</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mullins, Elizabeth A</au><au>Portillo, Adrian</au><au>Ruiz-Rohena, Kristalys</au><au>Piplai, 
Aritran</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Enhancing classroom teaching with LLMs and RAG</atitle><date>2024-11-06</date><risdate>2024</risdate><abstract>Large Language Models have become a valuable source of information for our daily inquiries. However, after training, its data source quickly becomes out-of-date, making RAG a useful tool for providing even more recent or pertinent data. In this work, we investigate how RAG pipelines, with the course materials serving as a data source, might help students in K-12 education. The initial research utilizes Reddit as a data source for up-to-date cybersecurity information. Chunk size is evaluated to determine the optimal amount of context needed to generate accurate answers. After running the experiment for different chunk sizes, answer correctness was evaluated using RAGAs with average answer correctness not exceeding 50 percent for any chunk size. This suggests that Reddit is not a good source to mine for data for questions about cybersecurity threats. The methodology was successful in evaluating the data source, which has implications for its use to evaluate educational resources for effectiveness.</abstract><doi>10.48550/arxiv.2411.04341</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2411.04341
language eng
recordid cdi_arxiv_primary_2411_04341
source arXiv.org
subjects Computer Science - Learning
title Enhancing classroom teaching with LLMs and RAG
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T03%3A18%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Enhancing%20classroom%20teaching%20with%20LLMs%20and%20RAG&rft.au=Mullins,%20Elizabeth%20A&rft.date=2024-11-06&rft_id=info:doi/10.48550/arxiv.2411.04341&rft_dat=%3Carxiv_GOX%3E2411_04341%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true