Image‐based taxonomic classification of bulk insect biodiversity samples using deep learning and domain adaptation
Published in: | Systematic entomology, 2023-07, Vol. 48 (3), p. 387-401 |
---|---|
Main authors: | Fujisawa, Tomochika; Noguerales, Víctor; Meramveliotakis, Emmanouil; Papadopoulou, Anna; Vogler, Alfried P. |
Format: | Article |
Language: | English |
Subjects: | Adaptation; Algorithms; Biodiversity; biodiversity assessment; bulk sample; Classification; Coleoptera; convolutional neural network; Deep learning; domain adaptation; Genera; image classification; machine learning; Neural networks; Predictions; Standardization; Taxonomy; Transfer learning |
Online access: | Full text |
container_end_page | 401 |
---|---|
container_issue | 3 |
container_start_page | 387 |
container_title | Systematic entomology |
container_volume | 48 |
creator | Fujisawa, Tomochika; Noguerales, Víctor; Meramveliotakis, Emmanouil; Papadopoulou, Anna; Vogler, Alfried P. |
description | Complex bulk samples of insects from biodiversity surveys present a challenge for taxonomic identification, which could be overcome by high‐throughput imaging combined with machine learning for rapid classification of specimens. These procedures require that taxonomic labels from an existing source data set are used for model training and prediction of an unknown target sample. However, such transfer learning may be problematic for the study of new samples not previously encountered in an image set, for example, from unexplored ecosystems, and require methods of domain adaptation that reduce the differences in the feature distribution of the source and target domains (training and test sets). We assessed the efficiency of domain adaptation for family‐level classification of bulk samples of Coleoptera, as a critical first step in the characterization of biodiversity samples. Neural network models trained with images from a global database of Coleoptera were applied to a biodiversity sample from understudied forests in Cyprus as the target. Within‐dataset classification accuracy reached 98% and depended on the number and quality of training images, and on dataset complexity. The accuracy of between‐datasets predictions (across disparate source–target pairs that do not share any species or genera) was at most 82% and depended greatly on the standardization of the imaging procedure. An algorithm for domain adaptation, domain adversarial training of neural networks (DANN), significantly improved the prediction performance of models trained by non‐standardized, low‐quality images. Our findings demonstrate that existing databases can be used to train models and successfully classify images from unexplored biota, but the imaging conditions and classification algorithms need careful consideration.
We evaluated deep learning models for image classification under a realistic setting of biodiversity surveys, where models trained with global image sources predict samples from an unstudied target area.
The accuracy of between‐datasets predictions (across disparate source–target pairs that do not share any species) was 82% and depended greatly on the standardization of the imaging procedure.
An algorithm for domain adaptation, domain adversarial training of neural networks, significantly improved the prediction performance of models trained by non‐standardized, low‐quality images. |
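Domain adversarial training of neural networks (DANN), the adaptation algorithm named in the abstract, couples a family-level classifier with a domain discriminator through a gradient reversal layer, so the shared feature extractor learns representations that are informative for the taxonomic labels yet indistinguishable between the source image database and the unlabelled target sample. The following is a minimal sketch of that idea, not the authors' implementation: it assumes PyTorch with a torchvision ResNet-50 backbone, and the `DANN` module, `train_step` helper and hyperparameters are hypothetical illustrations.

```python
# Minimal sketch of domain adversarial training (DANN, Ganin et al. 2016).
# Assumes PyTorch and torchvision >= 0.13; all names below are illustrative,
# not the architecture or hyperparameters used in the paper.
import torch
import torch.nn as nn
from torchvision import models


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class DANN(nn.Module):
    def __init__(self, n_families, lambd=1.0):
        super().__init__()
        backbone = models.resnet50(weights="IMAGENET1K_V2")        # pretrained feature extractor
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.label_head = nn.Linear(2048, n_families)              # family-level classifier
        self.domain_head = nn.Linear(2048, 2)                      # source-vs-target discriminator
        self.lambd = lambd

    def forward(self, x):
        f = self.features(x).flatten(1)
        class_logits = self.label_head(f)
        # The reversal layer pushes the shared features towards domain invariance.
        domain_logits = self.domain_head(GradientReversal.apply(f, self.lambd))
        return class_logits, domain_logits


def train_step(model, opt, src_x, src_y, tgt_x, ce=nn.CrossEntropyLoss()):
    """One update: labelled source images (src_x, src_y) plus unlabelled target images (tgt_x)."""
    opt.zero_grad()
    src_logits, src_dom = model(src_x)
    _, tgt_dom = model(tgt_x)                                      # target family labels are never used
    dom_labels = torch.cat([
        torch.zeros(src_x.size(0), dtype=torch.long),
        torch.ones(tgt_x.size(0), dtype=torch.long),
    ]).to(src_x.device)
    loss = ce(src_logits, src_y) + ce(torch.cat([src_dom, tgt_dom]), dom_labels)
    loss.backward()
    opt.step()
    return loss.item()
```

Only the source batch contributes to the classification loss; the target batch enters only the domain loss, which is what lets a model trained on a global Coleoptera database adapt its features to images of an unstudied fauna, as described in the abstract.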
doi_str_mv | 10.1111/syen.12583 |
format | Article |
fulltext | fulltext |
identifier | ISSN: 0307-6970 |
ispartof | Systematic entomology, 2023-07, Vol.48 (3), p.387-401 |
issn | 0307-6970 1365-3113 |
language | eng |
recordid | cdi_proquest_journals_2822770955 |
source | Wiley Online Library Journals Frontfile Complete |
subjects | Adaptation; Algorithms; Biodiversity; biodiversity assessment; bulk sample; Classification; Coleoptera; convolutional neural network; Deep learning; domain adaptation; Genera; image classification; machine learning; Neural networks; Predictions; Standardization; Taxonomy; Transfer learning |
title | Image‐based taxonomic classification of bulk insect biodiversity samples using deep learning and domain adaptation |