Specialized Mathematical Solving by a Step-By-Step Expression Chain Generation


Full Description

Bibliographic Details
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024, Vol. 32, pp. 3128-3140
Main Authors: Zhang, Wenqi; Shen, Yongliang; Hou, Guiyang; Wang, Kuangyi; Lu, Weiming
Format: Article
Language: English
Description
Summary: Math solving requires both semantic understanding and relational reasoning. Most current approaches treat it as a translation task from natural language to mathematical symbols, generating tokens one by one. However, token-level generation is usually vulnerable when confronted with diverse annotations and complex reasoning. We consider an equation to be an ordered combination of multiple sub-expressions, and argue that math reasoning should be performed at the sub-expression level rather than the token level. We treat a sub-expression as both the minimum generative unit and the minimum reasoning node. At each step, candidate sub-expression nodes are generated in parallel, and the whole reasoning chain is deduced by combining multiple nodes in order. In addition, we can obtain multiple valid reasoning chains by sub-expression searching, further improving interpretability and precision. Experiments on multilingual datasets show our method significantly outperforms the baselines. Our approach is also more stable and efficient when faced with the challenges of diverse annotations and complex reasoning under limited resources. Moreover, our approach can be seamlessly integrated with large language models (LLMs), enhancing LLMs' mathematical reasoning capabilities at minimal cost. Experiments show that a synergistic collaboration between a general-purpose LLM and our specialized model yields superior performance.
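To make the abstract's core idea concrete, here is a minimal illustrative sketch (not the paper's implementation) of representing a solution as an ordered chain of sub-expression nodes, where each node may reference the results of earlier nodes. The placeholder conventions (`nK` for the K-th problem quantity, `#K` for the K-th node's result) and the function name are hypothetical choices for this example only.

```python
# Illustrative sketch: a solution is an ordered chain of sub-expressions,
# each a minimal reasoning node. "nK" refers to the K-th quantity extracted
# from the problem text; "#K" refers to the result of the K-th earlier node.
# These conventions are assumptions for illustration, not the paper's format.

def evaluate_chain(chain, quantities):
    """Evaluate an ordered chain of sub-expression nodes and return
    the list of intermediate results; the last entry is the answer."""
    results = []
    for expr in chain:
        # Substitute problem quantities into the sub-expression.
        for i, q in enumerate(quantities):
            expr = expr.replace(f"n{i}", str(q))
        # Substitute results of earlier reasoning nodes.
        for i, r in enumerate(results):
            expr = expr.replace(f"#{i}", f"({r})")
        results.append(eval(expr))  # acceptable for this toy sketch
    return results

# "Tom buys 3 bags of 4 apples and eats 2": the chain (3 * 4), then (#0 - 2)
chain = ["n0 * n1", "#0 - n2"]
print(evaluate_chain(chain, [3, 4, 2]))  # prints [12, 10]
```

Because each node is a self-contained sub-expression rather than a token stream, candidate nodes for a step can in principle be generated and scored independently, and alternative chains (e.g., reordering commutative steps) can be searched over, which is the interpretability and robustness benefit the abstract describes.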
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/TASLP.2024.3410028