The effects of small sample size and sample bias on threshold selection and accuracy assessment of species distribution models

Species distribution models are used for a range of ecological and evolutionary questions, but often are constructed from few and/or biased species occurrence records. Recent work has shown that the presence-only model Maxent performs well with small sample sizes. While the apparent accuracy of such...

Detailed description

Bibliographic details
Published in: Ecography (Copenhagen) 2012-03, Vol.35 (3), p.250-258
Main authors: Bean, William T., Stafford, Robert, Brashares, Justin S.
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 258
container_issue 3
container_start_page 250
container_title Ecography (Copenhagen)
container_volume 35
creator Bean, William T.
Stafford, Robert
Brashares, Justin S.
description Species distribution models are used for a range of ecological and evolutionary questions, but often are constructed from few and/or biased species occurrence records. Recent work has shown that the presence-only model Maxent performs well with small sample sizes. While the apparent accuracy of such models with small samples has been studied, less emphasis has been placed on the effect of small or biased species records on the secondary modeling steps, specifically accuracy assessment and threshold selection, particularly with profile (presence-only) modeling techniques. When testing the effects of small sample sizes on distribution models, accuracy assessment has generally been conducted with complete species occurrence data, rather than similarly limited (e.g. few or biased) test data. Likewise, selection of a probability threshold - a selection of probability that classifies a model into discrete areas of presences and absences - has also generally been conducted with complete data. In this study we subsampled distribution data for an endangered rodent across multiple years to assess the effects of different sample sizes and types of bias on threshold selection, and examine the differences between apparent and actual accuracy of the models. Although some previously recommended threshold selection techniques showed little difference in threshold selection, the most commonly used methods performed poorly. Apparent model accuracy calculated from limited data was much higher than true model accuracy, but the true model accuracy was lower than it could have been with a more optimal threshold. That is, models with thresholds and accuracy calculated from biased and limited data had inflated reported accuracy, but were less accurate than they could have been if better data on species distribution were available and an optimal threshold were used.
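The distinction the abstract draws between apparent and actual accuracy can be illustrated with a small sketch. Everything in it is an assumption made for illustration only: the scores are invented, the function names are hypothetical, and the "lowest presence" rule is just one simple way to pick a threshold, not necessarily a method evaluated in the article.

```python
def classify(scores, threshold):
    """Turn continuous model scores into predicted presence (True) or absence (False)."""
    return [s >= threshold for s in scores]


def accuracy(presence_scores, absence_scores, threshold):
    """Fraction of test points classified correctly at the given threshold."""
    true_pos = sum(classify(presence_scores, threshold))
    true_neg = len(absence_scores) - sum(classify(absence_scores, threshold))
    return (true_pos + true_neg) / (len(presence_scores) + len(absence_scores))


def lowest_presence_threshold(presence_scores):
    """Illustrative rule (an assumption): the lowest score at any known presence."""
    return min(presence_scores)


# Invented model scores at surveyed sites (hypothetical data).
full_presences = [0.40, 0.52, 0.55, 0.61, 0.70, 0.78, 0.84, 0.90]
full_absences = [0.05, 0.10, 0.15, 0.22, 0.30, 0.35, 0.45, 0.58]
# A small subsample biased toward high-scoring sites.
limited_presences = [0.70, 0.84, 0.90]

# Threshold and accuracy computed from the limited presence-only sample.
limited_threshold = lowest_presence_threshold(limited_presences)
apparent = accuracy(limited_presences, [], limited_threshold)

# The same threshold evaluated against the complete data.
actual = accuracy(full_presences, full_absences, limited_threshold)

# Threshold that maximizes accuracy on the complete data (first maximum wins).
candidates = sorted(full_presences + full_absences)
best_threshold = max(
    candidates, key=lambda t: accuracy(full_presences, full_absences, t)
)
best = accuracy(full_presences, full_absences, best_threshold)

print(apparent, actual, best)
```

With these invented numbers the apparent accuracy exceeds the actual accuracy, which in turn falls short of what a threshold chosen from complete data achieves. That mirrors the pattern the abstract reports, but only as a toy example.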
doi_str_mv 10.1111/j.1600-0587.2011.06545.x
format Article
publisher Oxford, UK: Blackwell Publishing Ltd
rights Copyright © 2012 Ecography
2011 The Authors
2014 INIST-CNRS
toplevel peer_reviewed
online_resources
oa free_for_read
tpages 9
jstor_id 41418661
pqid 1517177943
linktohtml https://www.jstor.org/stable/41418661
linktopdf https://www.jstor.org/stable/pdf/41418661
fulltext fulltext
identifier ISSN: 0906-7590
ispartof Ecography (Copenhagen), 2012-03, Vol.35 (3), p.250-258
issn 0906-7590
1600-0587
language eng
recordid cdi_proquest_miscellaneous_968178879
source Wiley Online Library Journals Frontfile Complete; Jstor Complete Legacy; EZB-FREE-00999 freely available EZB journals
subjects Accuracy
Animal and plant ecology
Animal, plant and microbial ecology
Bias
Biological and medical sciences
Data models
Data processing
Ecological modeling
Environmental conservation
Fundamental and applied biological sciences. Psychology
General aspects
Pixels
Population distributions
Sample size
Sampling bias
Selection bias
Spatial models
Wildlife conservation
title The effects of small sample size and sample bias on threshold selection and accuracy assessment of species distribution models
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T14%3A38%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20effects%20of%20small%20sample%20size%20and%20sample%20bias%20on%20threshold%20selection%20and%20accuracy%20assessment%20of%20species%20distribution%20models&rft.jtitle=Ecography%20(Copenhagen)&rft.au=Bean,%20William%20T.&rft.date=2012-03&rft.volume=35&rft.issue=3&rft.spage=250&rft.epage=258&rft.pages=250-258&rft.issn=0906-7590&rft.eissn=1600-0587&rft_id=info:doi/10.1111/j.1600-0587.2011.06545.x&rft_dat=%3Cjstor_proqu%3E41418661%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1517177943&rft_id=info:pmid/&rft_jstor_id=41418661&rfr_iscdi=true