Discriminating the origin of fish from closely related water bodies by combining NMR spectroscopy with statistical analysis and machine learning
Published in: | Ecological Informatics 2024-11, Vol. 83, p. 102753, Article 102753 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | English |
Online access: | Full text |
Abstract: | Pikeperch, perch and bream are among the most traded and valued fish species in North-Eastern Europe. It is therefore necessary to be able to distinguish fish from different lakes and coastal sea regions, both to ensure good traceability of products in the fish market and to protect consumers and fish stocks. Untargeted metabolomics using nuclear magnetic resonance (NMR) spectroscopy is a suitable tool for this purpose. It is an established method for determining properties of biological and living systems such as health, origin and type. Statistical methods including principal component analysis (PCA) and linear discriminant analysis (LDA) are typically applied to NMR data to correlate spectra with a particular research question.
Herein we examine fish from three closely related water bodies and demonstrate that traditional statistical analysis (PCA and LDA) of fish NMR spectra cannot reliably determine the water body from which a particular fish originates. In contrast, determining the fish species is possible. We proceed to show that machine learning methods perform better and that a combination of statistical analysis (LDA) and random forest (RF), a supervised machine learning technique, allows reliable determination of the originating water body while also being tolerant to seasonal variations. This is an improvement over prior work, which has dealt with more clearly distinguished origins of fish. Exceptional accuracy was achieved in correctly assigning fish to their origin even in a scenario where two of the water bodies are connected by a river through which the fish are known to migrate. Since determining the origin of fish is important in environmental protection, we recommend following up on this approach and using it as the basis of a robust tool for environmental protection and other monitoring purposes.
•NMR spectroscopy is an excellent tool for metabolic profiling of fish samples.
•Statistical analysis clusters NMR spectra according to species and water body.
•Statistical analysis lacks predictive power and cannot classify unknown samples.
•Incorporating machine learning classifies new samples with 100 % accuracy.
•The origin is correctly determined even for fish from interconnected water bodies. |
---|---|
ISSN: | 1574-9541 |
DOI: | 10.1016/j.ecoinf.2024.102753 |
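
The abstract names a combination of LDA and random forest but does not say how the two are coupled. The following is a minimal sketch of one plausible arrangement, assuming LDA is used as a supervised dimensionality reduction step whose output feeds a random forest classifier. It uses scikit-learn, and the data are synthetic: `make_synthetic_spectra` and `build_lda_rf_pipeline` are hypothetical helper names, and the random matrix only stands in for binned NMR spectra (one row per fish, one column per spectral bin). Nothing here reproduces the authors' data, preprocessing or results.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_spectra(n_per_class=60, n_bins=50, n_classes=3, seed=0):
    """Hypothetical stand-in for binned NMR spectra (rows: fish, columns: bins)."""
    rng = np.random.default_rng(seed)
    # Each class (water body) gets its own slightly shifted mean profile.
    class_means = rng.normal(0.0, 0.5, size=(n_classes, n_bins))
    X = np.vstack([
        class_means[k] + rng.normal(0.0, 1.0, size=(n_per_class, n_bins))
        for k in range(n_classes)
    ])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


def build_lda_rf_pipeline(seed=0):
    """Assumed coupling: scale, project onto LDA axes, classify with a random forest."""
    return make_pipeline(
        StandardScaler(),
        # With three classes, LDA yields at most two discriminant axes.
        LinearDiscriminantAnalysis(n_components=2),
        RandomForestClassifier(n_estimators=200, random_state=seed),
    )


X, y = make_synthetic_spectra()
# Hold out a stratified test set so the score reflects unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_lda_rf_pipeline()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Wrapping all steps in a single pipeline means the scaler and the LDA projection are fitted on the training split only, so the held-out score is not inflated by information from the test samples. Real spectra would additionally need the usual NMR preprocessing (phasing, baseline correction, binning, normalisation), which this sketch leaves out.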