Enhancing classroom teaching with LLMs and RAG
Large Language Models have become a valuable source of information for our daily inquiries. However, after training, their data source quickly becomes out-of-date, making RAG a useful tool for providing even more recent or pertinent data. In this work, we investigate how RAG pipelines, with the course...
Main authors: | Mullins, Elizabeth A; Portillo, Adrian; Ruiz-Rohena, Kristalys; Piplai, Aritran |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Learning |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Mullins, Elizabeth A; Portillo, Adrian; Ruiz-Rohena, Kristalys; Piplai, Aritran |
description | Large Language Models have become a valuable source of information for our
daily inquiries. However, after training, their data source quickly becomes
out-of-date, making RAG a useful tool for providing even more recent or
pertinent data. In this work, we investigate how RAG pipelines, with the course
materials serving as a data source, might help students in K-12 education. The
initial research utilizes Reddit as a data source for up-to-date cybersecurity
information. Chunk size is evaluated to determine the optimal amount of context
needed to generate accurate answers. After running the experiment for different
chunk sizes, answer correctness was evaluated using RAGAs with average answer
correctness not exceeding 50 percent for any chunk size. This suggests that
Reddit is not a good source to mine for data for questions about cybersecurity
threats. The methodology was successful in evaluating the data source, which
has implications for its use to evaluate educational resources for
effectiveness. |
doi_str_mv | 10.48550/arxiv.2411.04341 |
format | Article |
creationdate | 2024-11-06 |
rights | http://creativecommons.org/licenses/by-nc-sa/4.0 |
oa | free_for_read |
collection | arXiv Computer Science; arXiv.org |
linktorsrc | https://arxiv.org/abs/2411.04341 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2411.04341 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2411_04341 |
source | arXiv.org |
subjects | Computer Science - Learning |
title | Enhancing classroom teaching with LLMs and RAG |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T03%3A18%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Enhancing%20classroom%20teaching%20with%20LLMs%20and%20RAG&rft.au=Mullins,%20Elizabeth%20A&rft.date=2024-11-06&rft_id=info:doi/10.48550/arxiv.2411.04341&rft_dat=%3Carxiv_GOX%3E2411_04341%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |