Recognition of gasoline in fire debris using machine learning: Part I, application of random forest, gradient boosting, support vector machine, and naïve bayes
Published in: | Forensic science international 2022-02, Vol.331, p.111146-111146, Article 111146 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | • Machine learning algorithms are developed to recognize gasoline in fire debris samples. • Four methods (random forest, gradient boosting, support vector machine, and naïve Bayes) are applied and compared. • The training and validation dataset consists of fire debris samples with and without gasoline. • The samples with gasoline included fire debris spiked with weathered gasoline (up to 99.6%). • Three of the methods classify all test samples correctly, without any false positive or false negative allocation.
The detection and identification of ignitable liquid (IL) residues in fire debris are two very challenging tasks in a fire investigation. To this day, the recognition of IL in fire debris involves chemical analysis of the fire debris composition, followed by examination and interpretation of the analysis result by a trained forensic examiner. Over the last decade, chemometrics and artificial intelligence have become increasingly important. In the present study, machine learning algorithms capable of recognizing gasoline residues in fire debris based on GC-MS data have been developed. Four methods (random forest, gradient boosting, support vector machine, and naïve Bayes) are used to classify fire debris samples into the two categories “with gasoline” or “without gasoline”. A fifth method (logistic regression) did not converge because the classes were well separated. A database comprising 360 measurements, including fire debris samples from real cases as well as fire debris samples spiked with known amounts of weathered gasoline (up to 99.6%), was available to train the machine learning algorithms (using 85% of the data) and subsequently to test the performance of the methods when classifying unknown samples (using 15% of the data). In general, the methods perform very well: three of them classified all test samples correctly without any false positive or false negative allocations. The fourth (naïve Bayes) was not trained sufficiently to classify other (non-gasoline) ILs correctly as “no gasoline”. Furthermore, the random forest method reveals which chemical compounds are most relevant for the algorithm to classify the samples. Overall, the presented approach is highly promising and could easily be extended or adapted to other types of IL. Similar to the neural network presented in the accompanying paper, such methods have the potential to serve as a fast screening technique for fire debris samples, thus supporting |
---|---|
ISSN: | 0379-0738 1872-6283 |
DOI: | 10.1016/j.forsciint.2021.111146 |
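
The evaluation protocol described in the summary (360 measurements, 85% for training, 15% for testing, performance reported as false positive and false negative allocations) can be illustrated with a minimal Python sketch. This is not the authors' code. The record does not say how the split was drawn, so the seeded random shuffle here is an assumption, and the function names `split_train_test` and `count_errors` and the placeholder measurement IDs are hypothetical.

```python
import random


def split_train_test(samples, test_fraction=0.15, seed=0):
    """Shuffle a copy of `samples` and return (train, test).

    Assumption: a plain random split; the record does not state
    whether the original split was random or stratified.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def count_errors(true_labels, predicted_labels, positive="with gasoline"):
    """Return (false_positives, false_negatives) for a two-class task."""
    pairs = list(zip(true_labels, predicted_labels))
    false_pos = sum(1 for t, p in pairs if p == positive and t != positive)
    false_neg = sum(1 for t, p in pairs if p != positive and t == positive)
    return false_pos, false_neg


# Hypothetical placeholder IDs standing in for the 360 GC-MS measurements.
measurements = list(range(360))
train, test = split_train_test(measurements)
print(len(train), len(test))  # 306 54
```

Under these assumptions, a classifier with no false positive or false negative allocations on the 54 held-out measurements would give `count_errors(...) == (0, 0)`.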