MATHWELL: Generating Educational Math Word Problems Using Teacher Annotations

Math word problems are critical K-8 educational tools, but writing them is time consuming and requires extensive expertise. To be educational, problems must be solvable, have accurate answers, and, most importantly, be educationally appropriate. We propose that language models have potential to supp...

Bibliographic details
Main authors:	Christ, Bryan R; Kropko, Jonathan; Hartvigsen, Thomas
Format:	Article
Language:	eng
Subjects:	Computer Science - Computation and Language

creator	Christ, Bryan R; Kropko, Jonathan; Hartvigsen, Thomas
description	Math word problems are critical K-8 educational tools, but writing them is time-consuming and requires extensive expertise. To be educational, problems must be solvable, have accurate answers, and, most importantly, be educationally appropriate. We propose that language models have the potential to support K-8 math education by automatically generating word problems. However, educational appropriateness is hard to quantify. We fill this gap by having teachers evaluate problems generated by LLMs; the teachers find that existing models and data often fail to be educationally appropriate. We then explore automatically generating educational word problems, ultimately using our expert annotations to finetune a 70B language model. Our model, MATHWELL, is the first K-8 word problem generator targeted at educational appropriateness. Further expert studies find that MATHWELL generates problems far more solvable, accurate, and appropriate than public models. MATHWELL also matches GPT-4's problem quality while attaining more appropriate reading levels for K-8 students and avoiding generating harmful questions.
doi_str_mv	10.48550/arxiv.2402.15861
format	Article
creationdate	2024-02-24
rights	http://creativecommons.org/licenses/by-sa/4.0
oa	free_for_read
linktorsrc	https://arxiv.org/abs/2402.15861
collection	arXiv Computer Science; arXiv.org
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2402.15861
language	eng
recordid	cdi_arxiv_primary_2402_15861
source	arXiv.org
subjects	Computer Science - Computation and Language
title	MATHWELL: Generating Educational Math Word Problems Using Teacher Annotations
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T20%3A13%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MATHWELL:%20Generating%20Educational%20Math%20Word%20Problems%20Using%20Teacher%20Annotations&rft.au=Christ,%20Bryan%20R&rft.date=2024-02-24&rft_id=info:doi/10.48550/arxiv.2402.15861&rft_dat=%3Carxiv_GOX%3E2402_15861%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true