Enhancing classroom teaching with LLMs and RAG
Large Language Models (LLMs) have become a valuable source of information for everyday inquiries. However, their training data quickly becomes out of date after training, which makes retrieval-augmented generation (RAG) a useful tool for supplying more recent or more pertinent data. In this work, we investigate how RAG pipelines, with the course...
Published in: | arXiv.org 2024-11 |
---|---|
Main authors: | Mullins, Elizabeth A; Portillo, Adrian; Ruiz-Rohena, Kristalys; Piplai, Aritran |
Format: | Article |
Language: | eng |
Subjects: | Cybersecurity; Data sources; Large language models; Threat evaluation |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Mullins, Elizabeth A; Portillo, Adrian; Ruiz-Rohena, Kristalys; Piplai, Aritran |
description | Large Language Models (LLMs) have become a valuable source of information for everyday inquiries. However, their training data quickly becomes out of date after training, which makes retrieval-augmented generation (RAG) a useful tool for supplying more recent or more pertinent data. In this work, we investigate how RAG pipelines that use course materials as a data source might help students in K-12 education. The initial research uses Reddit as a data source for up-to-date cybersecurity information. Chunk size is varied to determine the amount of context needed to generate accurate answers. After running the experiment with different chunk sizes, answer correctness was evaluated using RAGAs; average answer correctness did not exceed 50 percent for any chunk size. This suggests that Reddit is not a good source to mine for answers to questions about cybersecurity threats. The methodology succeeded in evaluating the data source, which suggests it could also be used to evaluate the effectiveness of educational resources. |
format | Article |
fullrecord | ProQuest record 3126153549 (record type: article). Publisher: Ithaca: Cornell University Library, arXiv.org. Publication date: 2024-11-07. Rights: 2024. This work is published under http://creativecommons.org/licenses/by-nc-sa/4.0/ (the “License”). Open access: free_for_read. |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-11 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3126153549 |
source | Freely Accessible Journals |
subjects | Cybersecurity; Data sources; Large language models; Threat evaluation |
title | Enhancing classroom teaching with LLMs and RAG |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T18%3A11%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Enhancing%20classroom%20teaching%20with%20LLMs%20and%20RAG&rft.jtitle=arXiv.org&rft.au=Mullins,%20Elizabeth%20A&rft.date=2024-11-07&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3126153549%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3126153549&rft_id=info:pmid/&rfr_iscdi=true |
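
The chunk-size evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a toy setup in which the "answer" is simply the retrieved chunk (no LLM is called), retrieval is plain token overlap, and a token-level F1 score stands in for the RAGAs answer-correctness metric. All function names (`chunk_words`, `token_f1`, `retrieve`, `evaluate_chunk_sizes`) are hypothetical and chosen only for illustration.

```python
from statistics import mean


def chunk_words(text, chunk_size):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def token_f1(predicted, reference):
    """Toy stand-in for answer correctness: F1 over unique lowercase tokens."""
    pred = set(predicted.lower().split())
    ref = set(reference.lower().split())
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def retrieve(chunks, question):
    """Return the chunk that shares the most unique tokens with the question."""
    q_tokens = set(question.lower().split())
    return max(chunks, key=lambda c: len(q_tokens & set(c.lower().split())))


def evaluate_chunk_sizes(corpus, qa_pairs, chunk_sizes):
    """Map each chunk size to the mean toy score over all (question, answer) pairs.

    Assumption: the retrieved chunk itself is scored against the reference
    answer; a real pipeline would pass it to an LLM and score the generation.
    """
    results = {}
    for size in chunk_sizes:
        chunks = chunk_words(corpus, size)
        results[size] = mean(
            token_f1(retrieve(chunks, question), answer)
            for question, answer in qa_pairs
        )
    return results
```

Calling `evaluate_chunk_sizes(corpus, qa_pairs, [64, 128, 256])` on a non-empty corpus returns one average score per chunk size, which is the shape of comparison the abstract reports; the scores themselves say nothing about the paper's results.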