Estimating Rock Composition from Replicate Geochemical Analyses: Theory and Application to Magmatic Rocks of the GeoPT Database


Bibliographic details
Published in: Mathematical Geosciences 2024-10, Vol. 56 (7), p. 1539-1604
Main authors: De Greef, Maxime Keutgen, Weltje, Gert Jan, Gijbels, Irène
Format: Article
Language: English
Online access: Full text
Description
Summary: Chemical analyses of powdered rocks by different laboratories often yield varying results, requiring estimation of the rock’s true composition and associated uncertainty. Challenges arise from the peculiar nature of geochemical data. Traditionally, major and trace elements have been measured using different methods, resulting in chemical analyses where the sum of the parts fluctuates around 1 rather than precisely totaling 1. Additionally, all chemical analyses contain an undisclosed mass fraction representing undetected chemical elements. Because of this undisclosed and unknown mass fraction, geochemical data represent a particular kind of compositional data in which closure to unity is not guaranteed. We argue that chemical analyses exist in the hypercube while being sampled from a true composition residing in the simplex. Therefore, we propose an algorithm that generates random chemical analyses by simulating the data acquisition protocol in geochemistry. Using the algorithm’s output, we measure the bias and mean squared error (MSE) of various estimators of the true mean composition. Additionally, we explore the impact of missing values on estimator performance. Our findings reveal that the optimized binary log-ratio mean, a new estimator, exhibits the lowest MSE and bias. It performs well even with up to 70% missing values, in contrast to other classical estimators such as the arithmetic mean or the geometric mean. Applying our approach to the GeoPT database, which contains replicate analyses of igneous rocks from numerous geochemical laboratories, we introduce an outlier detection technique based on the Mahalanobis distance between a laboratory’s logit coordinates and the optimized mean estimate. This enables a probabilistic ranking of laboratories based on the atypicality of their performance. Finally, we offer an accessible R implementation of our findings through the GitHub repository linked to this paper [subject classification numbers: 10 (compositions) 85 (statistics)].
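Illustrative note: the outlier screening mentioned in the summary (Mahalanobis distance between a laboratory's logit coordinates and a mean estimate) can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' R implementation. The plain average of logit coordinates stands in for the paper's optimized binary log-ratio mean, which the abstract does not specify. The five two-element analyses are made-up numbers, and every function name is hypothetical.

```python
import math

def logit(x):
    """Map a mass fraction in (0, 1) to the real line: log(x / (1 - x))."""
    return math.log(x / (1.0 - x))

def inv_logit(z):
    """Map a logit coordinate back to a mass fraction in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def column_means(rows):
    """Arithmetic mean of each column."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance(rows, means):
    """Sample covariance matrix (divisor n - 1)."""
    n, d = len(rows), len(means)
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (r[i] - means[i]) * (r[j] - means[j])
    return [[c / (n - 1) for c in row] for row in cov]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(m[r][k]))
        m[k], m[p] = m[p], m[k]
        for r in range(k + 1, n):
            f = m[r][k] / m[k][k]
            for c in range(k, n + 1):
                m[r][c] -= f * m[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        tail = sum(m[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (m[k][n] - tail) / m[k][k]
    return x

def mahalanobis(v, center, cov):
    """sqrt((v - center)^T cov^{-1} (v - center))."""
    diff = [a - b for a, b in zip(v, center)]
    y = solve(cov, diff)
    return math.sqrt(sum(d * w for d, w in zip(diff, y)))

# Hypothetical replicate analyses: one row per laboratory, columns are
# mass fractions of two oxides. The values are invented for illustration.
analyses = [
    [0.601, 0.152],
    [0.598, 0.149],
    [0.603, 0.151],
    [0.600, 0.150],
    [0.650, 0.120],  # deliberately atypical laboratory
]

# Work in logit coordinates, where the hypercube (0, 1)^D becomes R^D.
coords = [[logit(x) for x in row] for row in analyses]
center = column_means(coords)      # stand-in for the optimized mean (assumption)
cov = covariance(coords, center)
distances = [mahalanobis(c, center, cov) for c in coords]

# Back-transform the center to mass fractions, then rank laboratories
# from most to least atypical.
mean_composition = [inv_logit(z) for z in center]
ranking = sorted(range(len(distances)), key=lambda i: distances[i], reverse=True)
print(mean_composition, ranking)
```

Because the atypical laboratory is included when the center and covariance are estimated, it inflates both, and with only a handful of laboratories the classical distance is bounded above. A practical screening would likely need a robust or optimized center, as the summary indicates.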
ISSN: 1874-8961, 1874-8953
DOI: 10.1007/s11004-024-10138-5