The effects of small sample size and sample bias on threshold selection and accuracy assessment of species distribution models

Species distribution models are used for a range of ecological and evolutionary questions, but often are constructed from few and/or biased species occurrence records. Recent work has shown that the presence-only model Maxent performs well with small sample sizes. While the apparent accuracy of such...

Bibliographic details
Published in:	Ecography (Copenhagen) 2012-03, Vol.35 (3), p.250-258
Main authors:	Bean, William T.; Stafford, Robert; Brashares, Justin S.
Format:	Article
Language:	English
Subjects:	Accuracy; Animal and plant ecology; Animal, plant and microbial ecology; Bias; Biological and medical sciences; Data models; Data processing; Ecological modeling; Environmental conservation; Fundamental and applied biological sciences. Psychology; General aspects; Pixels; Population distributions; Sample size; Sampling bias; Selection bias; Spatial models; Wildlife conservation
Online access:	Full text

container_end_page	258
container_issue	3
container_start_page	250
container_title	Ecography (Copenhagen)
container_volume	35
creator	Bean, William T.; Stafford, Robert; Brashares, Justin S.
description	Species distribution models are used for a range of ecological and evolutionary questions, but often are constructed from few and/or biased species occurrence records. Recent work has shown that the presence-only model Maxent performs well with small sample sizes. While the apparent accuracy of such models with small samples has been studied, less emphasis has been placed on the effect of small or biased species records on the secondary modeling steps, specifically accuracy assessment and threshold selection, particularly with profile (presence-only) modeling techniques. When testing the effects of small sample sizes on distribution models, accuracy assessment has generally been conducted with complete species occurrence data, rather than similarly limited (e.g. few or biased) test data. Likewise, selection of a probability threshold (a probability value that classifies a model into discrete areas of presences and absences) has also generally been conducted with complete data. In this study we subsampled distribution data for an endangered rodent across multiple years to assess the effects of different sample sizes and types of bias on threshold selection, and examine the differences between apparent and actual accuracy of the models. Although some previously recommended threshold selection techniques showed little difference in threshold selection, the most commonly used methods performed poorly. Apparent model accuracy calculated from limited data was much higher than true model accuracy, but the true model accuracy was lower than it could have been with a more optimal threshold. That is, models with thresholds and accuracy calculated from biased and limited data had inflated reported accuracy, but were less accurate than they could have been if better data on species distribution were available and an optimal threshold were used.
doi_str_mv	10.1111/j.1600-0587.2011.06545.x
format	Article
publisher	Oxford, UK: Blackwell Publishing Ltd
rights	Copyright © 2012 Ecography; 2011 The Authors; 2014 INIST-CNRS
lds50	peer_reviewed
oa	free_for_read
tpages	9
jstor_id	41418661
linktohtml	https://www.jstor.org/stable/41418661
linktopdf	https://www.jstor.org/stable/pdf/41418661
backlink	http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=25549029 (View record in Pascal Francis)
fulltext	fulltext
identifier	ISSN: 0906-7590
ispartof	Ecography (Copenhagen), 2012-03, Vol.35 (3), p.250-258
issn	0906-7590 1600-0587
language	eng
recordid	cdi_proquest_miscellaneous_968178879
source	Wiley Online Library Journals Frontfile Complete; Jstor Complete Legacy; EZB-FREE-00999 freely available EZB journals
subjects	Accuracy; Animal and plant ecology; Animal, plant and microbial ecology; Bias; Biological and medical sciences; Data models; Data processing; Ecological modeling; Environmental conservation; Fundamental and applied biological sciences. Psychology; General aspects; Pixels; Population distributions; Sample size; Sampling bias; Selection bias; Spatial models; Wildlife conservation
title	The effects of small sample size and sample bias on threshold selection and accuracy assessment of species distribution models
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T14%3A38%3A45IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20effects%20of%20small%20sample%20size%20and%20sample%20bias%20on%20threshold%20selection%20and%20accuracy%20assessment%20of%20species%20distribution%20models&rft.jtitle=Ecography%20(Copenhagen)&rft.au=Bean,%20William%20T.&rft.date=2012-03&rft.volume=35&rft.issue=3&rft.spage=250&rft.epage=258&rft.pages=250-258&rft.issn=0906-7590&rft.eissn=1600-0587&rft_id=info:doi/10.1111/j.1600-0587.2011.06545.x&rft_dat=%3Cjstor_proqu%3E41418661%3C/jstor_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1517177943&rft_id=info:pmid/&rft_jstor_id=41418661&rfr_iscdi=true