Do the Math: Making Mathematics in Wikipedia Computable


Bibliographic details
Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023-04, Vol. 45 (4), pp. 4384-4395
Main authors: Greiner-Petter, Andre; Schubotz, Moritz; Breitinger, Corinna; Scharpf, Philipp; Aizawa, Akiko; Gipp, Bela
Format: Article
Language: English
Description
Abstract: Wikipedia combines the power of AI solutions and human reviewers to safeguard article quality. Quality control objectives include detecting malicious edits, fixing typos, and spotting inconsistent formatting. However, no automated quality control mechanisms currently exist for mathematical formulae. Spell checkers are widely used to highlight textual errors, yet no equivalent tool exists to detect algebraically incorrect formulae. Our paper addresses this shortcoming by making mathematical formulae computable. We present a method that (1) gathers the semantic information surrounding the context of each mathematical formula, (2) provides access to the information in a graph-structured dependency hierarchy, and (3) performs automatic plausibility checks on equations. We evaluate the performance of our approach on 6,337 mathematical expressions contained in 104 Wikipedia articles on the topic of orthogonal polynomials and special functions. Our system, LACAST, verified 358 out of 1,516 equations as error-free. LACAST successfully translated 27% of the mathematical expressions and outperformed existing translation approaches by 16%. Additionally, LACAST achieved an F1 score of 0.495 for annotating mathematical expressions with relevant textual descriptions, which is a significant step towards advancing searchability, readability, and accessibility of mathematical formulae in Wikipedia. A prototype of LACAST ...
ISSN: 0162-8828, 1939-3539, 2160-9292
DOI: 10.1109/TPAMI.2022.3195261
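
Illustrative note on step (3) of the abstract: the record does not say how the automatic plausibility checks are carried out, so the following is only a minimal sketch of one possible idea, namely testing an equation numerically at random sample points. It is not the LACAST implementation. The function names (`chebyshev_t`, `numerically_plausible`, `sample_point`), the tolerance, and the choice of the Chebyshev identity T_n(cos θ) = cos(nθ) as the test equation are assumptions made for illustration.

```python
import math
import random


def chebyshev_t(n, x):
    """Evaluate the Chebyshev polynomial T_n(x) with the recurrence
    T_{k+1}(x) = 2x T_k(x) - T_{k-1}(x), T_0 = 1, T_1 = x."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, x
    for _ in range(n - 1):
        prev, curr = curr, 2.0 * x * curr - prev
    return curr


def numerically_plausible(lhs, rhs, sample, trials=200, tol=1e-9, seed=0):
    """Return True if lhs and rhs agree (within tol) at every sampled point.

    Agreement does not prove the equation; a single disagreement
    flags it as suspicious. Hypothetical helper, not part of LACAST.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        args = sample(rng)
        if not math.isclose(lhs(*args), rhs(*args), rel_tol=tol, abs_tol=tol):
            return False
    return True


def sample_point(rng):
    # Assumed test domain: degree n in 0..12, angle theta in [0, pi].
    return rng.randint(0, 12), rng.uniform(0.0, math.pi)


# A correct identity: T_n(cos(theta)) = cos(n * theta).
correct = numerically_plausible(
    lambda n, theta: chebyshev_t(n, math.cos(theta)),
    lambda n, theta: math.cos(n * theta),
    sample_point,
)

# A deliberately wrong variant with sin in place of cos.
wrong = numerically_plausible(
    lambda n, theta: chebyshev_t(n, math.cos(theta)),
    lambda n, theta: math.sin(n * theta),
    sample_point,
)

print("correct identity passes:", correct)
print("wrong identity passes:", wrong)
```

A check of this kind can only reject an equation or leave it standing; it cannot certify one as correct, which is why the sketch reports "plausible" rather than "verified".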