Metrics to guide development of machine learning algorithms for malaria diagnosis
Automated malaria diagnosis is a difficult but high-value target for machine learning (ML), and effective algorithms could save many thousands of children's lives. However, current ML efforts largely neglect crucial use case constraints and are thus not clinically useful. Two factors in particu...
Main authors: | Delahunt, Charles B ; Gachuhi, Noni ; Horning, Matthew P |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Delahunt, Charles B ; Gachuhi, Noni ; Horning, Matthew P |
description | Automated malaria diagnosis is a difficult but high-value target for machine
learning (ML), and effective algorithms could save many thousands of children's
lives. However, current ML efforts largely neglect crucial use case constraints
and are thus not clinically useful. Two factors in particular are crucial to
developing algorithms translatable to clinical field settings: (i) Clear
understanding of the clinical needs that ML solutions must accommodate; and
(ii) task-relevant metrics for guiding and evaluating ML models. Neglect of
these factors has seriously hampered past ML work on malaria, because the
resulting algorithms do not align with clinical needs.
In this paper we address these two issues in the context of automated malaria
diagnosis via microscopy on Giemsa-stained blood films. First, we describe why
domain expertise is crucial to effectively apply ML to malaria, and list
technical documents and other resources that provide this domain knowledge.
Second, we detail performance metrics tailored to the clinical requirements of
malaria diagnosis, to guide development of ML models and evaluate model
performance through the lens of clinical needs (versus a generic ML lens). We
highlight the importance of a patient-level perspective, interpatient
variability, false positive rates, limit of detection, and different types of
error. We also discuss reasons why ROC curves, AUC, and F1, as commonly used in
ML work, are poorly suited to this context. These findings also apply to other
diseases involving parasite loads, including neglected tropical diseases (NTDs)
such as schistosomiasis. |
doi_str_mv | 10.48550/arxiv.2209.06947 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2209_06947</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2209_06947</sourcerecordid><originalsourceid>FETCH-LOGICAL-a677-c9c621b78258125019b0976a034cb2c2ae04f69d7d6052f525cee3d8cc90659f3</originalsourceid><addsrcrecordid>eNotz81KxDAUhuFsXMjoBbgyN9B6mjZJs5TBPxgRYfblNDnpBNJkSOugd6-Orr7FCx88jN00UHe9lHCH5TOcaiHA1KBMpy_Z-yutJdiFr5lPH8ERd3SimI8zpZVnz2e0h5CIR8KSQpo4ximXsB7mhftcfnrEEpC7gFPKS1iu2IXHuND1_27Y_vFhv32udm9PL9v7XYVK68oaq0Qz6l7IvhESGjOC0Qqh7eworECCzivjtFMghZdCWqLW9dYaUNL4dsNu_27PpuFYwozla_i1DWdb-w3c10od</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Metrics to guide development of machine learning algorithms for malaria diagnosis</title><source>arXiv.org</source><creator>Delahunt, Charles B ; Gachuhi, Noni ; Horning, Matthew P</creator><creatorcontrib>Delahunt, Charles B ; Gachuhi, Noni ; Horning, Matthew P</creatorcontrib><description>Automated malaria diagnosis is a difficult but high-value target for machine
learning (ML), and effective algorithms could save many thousands of children's
lives. However, current ML efforts largely neglect crucial use case constraints
and are thus not clinically useful. Two factors in particular are crucial to
developing algorithms translatable to clinical field settings: (i) Clear
understanding of the clinical needs that ML solutions must accommodate; and
(ii) task-relevant metrics for guiding and evaluating ML models. Neglect of
these factors has seriously hampered past ML work on malaria, because the
resulting algorithms do not align with clinical needs.
In this paper we address these two issues in the context of automated malaria
diagnosis via microscopy on Giemsa-stained blood films. First, we describe why
domain expertise is crucial to effectively apply ML to malaria, and list
technical documents and other resources that provide this domain knowledge.
Second, we detail performance metrics tailored to the clinical requirements of
malaria diagnosis, to guide development of ML models and evaluate model
performance through the lens of clinical needs (versus a generic ML lens). We
highlight the importance of a patient-level perspective, interpatient
variability, false positive rates, limit of detection, and different types of
error. We also discuss reasons why ROC curves, AUC, and F1, as commonly used in
ML work, are poorly suited to this context. These findings also apply to other
diseases involving parasite loads, including neglected tropical diseases (NTDs)
such as schistosomiasis.</description><identifier>DOI: 10.48550/arxiv.2209.06947</identifier><language>eng</language><subject>Computer Science - Learning</subject><creationdate>2022-09</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2209.06947$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2209.06947$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Delahunt, Charles B</creatorcontrib><creatorcontrib>Gachuhi, Noni</creatorcontrib><creatorcontrib>Horning, Matthew P</creatorcontrib><title>Metrics to guide development of machine learning algorithms for malaria diagnosis</title><description>Automated malaria diagnosis is a difficult but high-value target for machine
learning (ML), and effective algorithms could save many thousands of children's
lives. However, current ML efforts largely neglect crucial use case constraints
and are thus not clinically useful. Two factors in particular are crucial to
developing algorithms translatable to clinical field settings: (i) Clear
understanding of the clinical needs that ML solutions must accommodate; and
(ii) task-relevant metrics for guiding and evaluating ML models. Neglect of
these factors has seriously hampered past ML work on malaria, because the
resulting algorithms do not align with clinical needs.
In this paper we address these two issues in the context of automated malaria
diagnosis via microscopy on Giemsa-stained blood films. First, we describe why
domain expertise is crucial to effectively apply ML to malaria, and list
technical documents and other resources that provide this domain knowledge.
Second, we detail performance metrics tailored to the clinical requirements of
malaria diagnosis, to guide development of ML models and evaluate model
performance through the lens of clinical needs (versus a generic ML lens). We
highlight the importance of a patient-level perspective, interpatient
variability, false positive rates, limit of detection, and different types of
error. We also discuss reasons why ROC curves, AUC, and F1, as commonly used in
ML work, are poorly suited to this context. These findings also apply to other
diseases involving parasite loads, including neglected tropical diseases (NTDs)
such as schistosomiasis.</description><subject>Computer Science - Learning</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz81KxDAUhuFsXMjoBbgyN9B6mjZJs5TBPxgRYfblNDnpBNJkSOugd6-Orr7FCx88jN00UHe9lHCH5TOcaiHA1KBMpy_Z-yutJdiFr5lPH8ERd3SimI8zpZVnz2e0h5CIR8KSQpo4ximXsB7mhftcfnrEEpC7gFPKS1iu2IXHuND1_27Y_vFhv32udm9PL9v7XYVK68oaq0Qz6l7IvhESGjOC0Qqh7eworECCzivjtFMghZdCWqLW9dYaUNL4dsNu_27PpuFYwozla_i1DWdb-w3c10od</recordid><startdate>20220914</startdate><enddate>20220914</enddate><creator>Delahunt, Charles B</creator><creator>Gachuhi, Noni</creator><creator>Horning, Matthew P</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20220914</creationdate><title>Metrics to guide development of machine learning algorithms for malaria diagnosis</title><author>Delahunt, Charles B ; Gachuhi, Noni ; Horning, Matthew P</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a677-c9c621b78258125019b0976a034cb2c2ae04f69d7d6052f525cee3d8cc90659f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Computer Science - Learning</topic><toplevel>online_resources</toplevel><creatorcontrib>Delahunt, Charles B</creatorcontrib><creatorcontrib>Gachuhi, Noni</creatorcontrib><creatorcontrib>Horning, Matthew P</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Delahunt, Charles B</au><au>Gachuhi, Noni</au><au>Horning, Matthew P</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Metrics to guide development of machine learning algorithms for malaria diagnosis</atitle><date>2022-09-14</date><risdate>2022</risdate><abstract>Automated malaria diagnosis is a 
difficult but high-value target for machine
learning (ML), and effective algorithms could save many thousands of children's
lives. However, current ML efforts largely neglect crucial use case constraints
and are thus not clinically useful. Two factors in particular are crucial to
developing algorithms translatable to clinical field settings: (i) Clear
understanding of the clinical needs that ML solutions must accommodate; and
(ii) task-relevant metrics for guiding and evaluating ML models. Neglect of
these factors has seriously hampered past ML work on malaria, because the
resulting algorithms do not align with clinical needs.
In this paper we address these two issues in the context of automated malaria
diagnosis via microscopy on Giemsa-stained blood films. First, we describe why
domain expertise is crucial to effectively apply ML to malaria, and list
technical documents and other resources that provide this domain knowledge.
Second, we detail performance metrics tailored to the clinical requirements of
malaria diagnosis, to guide development of ML models and evaluate model
performance through the lens of clinical needs (versus a generic ML lens). We
highlight the importance of a patient-level perspective, interpatient
variability, false positive rates, limit of detection, and different types of
error. We also discuss reasons why ROC curves, AUC, and F1, as commonly used in
ML work, are poorly suited to this context. These findings also apply to other
diseases involving parasite loads, including neglected tropical diseases (NTDs)
such as schistosomiasis.</abstract><doi>10.48550/arxiv.2209.06947</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2209.06947 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2209_06947 |
source | arXiv.org |
subjects | Computer Science - Learning |
title | Metrics to guide development of machine learning algorithms for malaria diagnosis |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-09T20%3A55%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Metrics%20to%20guide%20development%20of%20machine%20learning%20algorithms%20for%20malaria%20diagnosis&rft.au=Delahunt,%20Charles%20B&rft.date=2022-09-14&rft_id=info:doi/10.48550/arxiv.2209.06947&rft_dat=%3Carxiv_GOX%3E2209_06947%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |