Cellphone picture-based, genus-level automated identification of Chagas disease vectors: Effects of picture orientation on the performance of five machine-learning algorithms
Published in: | Ecological Informatics 2024-03, Vol.79, p.102430, Article 102430 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Chagas disease (CD) is a public-health concern across Latin America. It is caused by Trypanosoma cruzi, a parasite transmitted by blood-sucking triatomine bugs. Automated identification of triatomine bugs is a potential means to strengthen CD vector surveillance. To be broadly useful, however, automated systems must draw on algorithms capable of correctly identifying bugs from images taken with ordinary cellphone cameras at varying angles or positions. Here, we assess the performance of five machine-learning algorithms at identifying the main CD vector genera (Triatoma, Panstrongylus, and Rhodnius) based on bugs photographed at different angles/positions with a 72-dpi cellphone camera. Each bug (N = 730; 13 species) was photographed at nine angles representing three positions: dorsal-flat, dorsal-oblique, and front/back-oblique. We randomly split the 6570-picture database into training (80%) and testing sets (20%), and then trained and tested a convolutional neural network (AlexNet, AN); three boosting-based classifiers (AdaBoost, AB; Gradient Boosting, GB; and Histogram-based Gradient Boosting, HB); and a linear discriminant model (LD). We assessed identification accuracy and specificity with logit-binomial generalized linear mixed models fit in a Bayesian framework. Differences in performance across algorithms were mainly driven by AN's essentially perfect accuracy and specificity, irrespective of picture angle or bug position. HB predicted accuracies ranged from ∼0.987 (Panstrongylus, dorsal-oblique) to >0.999 (Triatoma, dorsal-flat). AB accuracy was poor for Rhodnius (∼0.224–0.282) and Panstrongylus (∼0.664–0.729), but high for Triatoma (∼0.988–0.991). For Panstrongylus, LD and GB had predicted accuracies in the ∼0.970–0.984 range. AB misclassified ∼57% of Rhodnius and Panstrongylus as Triatoma, whereas specificity ranged from ∼0.92 to ∼1.0 for the remaining algorithm-genus combinations. Dorsal-flat pictures appeared to improve algorithm performance slightly, but angle/position effects were overall weak-to-negligible. We conclude that, when high-performance algorithms such as AN are used, the angles or positions at which bugs are photographed seem unlikely to hinder cellphone picture-based automated identification of CD vectors, at least at the genus level. Future research should focus on combining mixed-quality pictures and state-of-the-art algorithms to (i) identify triatomine adults to the species level and (ii) distinguish triatomine nymphs (i.e., |
---|---|
ISSN: | 1574-9541 |
DOI: | 10.1016/j.ecoinf.2023.102430 |
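
The dataset arithmetic and the two per-genus metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `split_sizes` and `per_genus_metrics` are hypothetical, the confusion-matrix counts are invented toy values (not results from the article), and "accuracy" is assumed here to mean the share of a genus's pictures that received the correct label, which the abstract does not spell out. The split sizes are what an exact 80/20 division of 6570 pictures would give; the abstract does not report the actual counts.

```python
# Illustrative sketch only; names and toy counts are assumptions, not from the article.
GENERA = ["Triatoma", "Panstrongylus", "Rhodnius"]


def split_sizes(n_bugs, angles_per_bug, train_fraction):
    """Return (total pictures, training pictures, testing pictures)."""
    total = n_bugs * angles_per_bug
    n_train = round(total * train_fraction)
    return total, n_train, total - n_train


def per_genus_metrics(confusion):
    """One-vs-rest metrics from a square confusion matrix.

    confusion[i][j] is the number of pictures of true genus i
    that were predicted as genus j.
    """
    n = len(confusion)
    grand_total = sum(sum(row) for row in confusion)
    metrics = {}
    for k in range(n):
        tp = confusion[k][k]                               # genus k, labelled k
        fn = sum(confusion[k]) - tp                        # genus k, labelled otherwise
        fp = sum(confusion[i][k] for i in range(n)) - tp   # other genera, labelled k
        tn = grand_total - tp - fn - fp                    # other genera, not labelled k
        metrics[GENERA[k]] = {
            "accuracy": tp / (tp + fn),      # assumed meaning: correct share of genus k
            "specificity": tn / (tn + fp),   # share of non-k pictures not labelled k
        }
    return metrics


# 730 bugs x 9 angles, split 80/20 as described in the abstract.
total, n_train, n_test = split_sizes(730, 9, 0.8)

# Toy counts for illustration only (rows: true genus, columns: predicted genus).
toy_confusion = [
    [90, 5, 5],
    [10, 80, 10],
    [0, 20, 80],
]
toy_metrics = per_genus_metrics(toy_confusion)
```

With these toy counts, pictures of other genera that are wrongly labelled as a given genus lower that genus's specificity, which is the kind of error the abstract reports for AB misclassifying Rhodnius and Panstrongylus as Triatoma.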