Explainable machine learning improves interpretability in the predictive modeling of biological stream conditions in the Chesapeake Bay Watershed, USA
Anthropogenic alterations have resulted in widespread degradation of stream conditions. To aid in stream restoration and management, baseline estimates of conditions and improved explanation of factors driving their degradation are needed. We used random forests to model biological conditions using...
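Published in: | Journal of environmental management 2022-11, Vol.322, p.116068-116068, Article 116068 |
---|---|
Main authors: | Maloney, Kelly O.; Buchanan, Claire; Jepsen, Rikke D.; Krause, Kevin P.; Cashman, Matthew J.; Gressler, Benjamin P.; Young, John A.; Schmid, Matthias |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |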
container_end_page | 116068 |
---|---|
container_issue | |
container_start_page | 116068 |
container_title | Journal of environmental management |
container_volume | 322 |
creator | Maloney, Kelly O. ; Buchanan, Claire ; Jepsen, Rikke D. ; Krause, Kevin P. ; Cashman, Matthew J. ; Gressler, Benjamin P. ; Young, John A. ; Schmid, Matthias |
description | Anthropogenic alterations have resulted in widespread degradation of stream conditions. To aid stream restoration and management, baseline estimates of conditions and better explanations of the factors driving their degradation are needed. We used random forests to model biological conditions using a benthic macroinvertebrate index of biotic integrity for small, non-tidal streams (upstream area ≤200 km²) in the Chesapeake Bay watershed (CBW) of the mid-Atlantic coast of North America. We used several global and local model interpretation tools to improve average and site-specific model inferences, respectively. The model was used to predict condition for 95,867 individual catchments for eight periods (2001, 2004, 2006, 2008, 2011, 2013, 2016, 2019). Predicted conditions were classified as Poor, FairGood, or Uncertain to align with management needs, and individual reach lengths and catchment areas were summed by condition class for the CBW for each period. Global permutation and local Shapley importance values indicated that the percentages of forest, development, and agriculture in upstream catchments had strong impacts on predictions. Development and agriculture negatively influenced stream condition at both the model-average level (partial dependence [PD] and accumulated local effect [ALE] plots) and the local level (individual conditional expectation and Shapley value plots). Friedman's H-statistic indicated large overall interactions for these three land covers, and bivariate global plots (PD and ALE) supported interactions between agriculture and development. Total stream length and catchment area predicted to be in FairGood condition decreased and then increased over the 19 years (length/area: 66.6/65.4% in 2001, 66.3/65.2% in 2011, and 66.6/65.4% in 2019). Examination of individual catchment predictions between 2001 and 2019 showed that those predicted to have the largest decreases in condition had large increases in development, whereas catchments predicted to exhibit the largest increases in condition showed moderate increases in forest cover. Use of global and local interpretive methods together with watershed-wide and individual catchment predictions supports conservation practitioners who need to identify widespread and localized patterns, especially given that management actions typically take place at individual-reach scales.
Highlights: • Restoration and management of streams need improved models and interpretability. • Random forests were used to predict stream condition |
doi_str_mv | 10.1016/j.jenvman.2022.116068 |
format | Article |
highlights | • Restoration and management of streams need improved models and interpretability. • Random forests were used to predict stream condition for multiple time periods. • Multiple global and local methods were used to improve interpretability. • Global and local methods supported each other but gave different insight. • Extent matters: little predicted change watershed-wide, but large in some catchments. |
publisher | Elsevier Ltd |
date | 2022-11-15 |
rights | 2022 |
orcid | https://orcid.org/0000-0003-2132-2044 https://orcid.org/0000-0002-6635-4309 https://orcid.org/0000-0001-5627-448X https://orcid.org/0000-0001-6639-8558 |
toplevel | peer_reviewed online_resources |
oa | free_for_read |
collection | ScienceDirect Open Access Titles; Elsevier:ScienceDirect:Open Access; CrossRef; MEDLINE - Academic; AGRICOLA; AGRICOLA - Academic |
linktohtml | https://www.sciencedirect.com/science/article/pii/S0301479722016413 |
els_id | S0301479722016413 |
pqid | 2709915128 |
genre | article |
fulltext | fulltext |
identifier | ISSN: 0301-4797 |
ispartof | Journal of environmental management, 2022-11, Vol.322, p.116068-116068, Article 116068 |
issn | 0301-4797 1095-8630 |
language | eng |
recordid | cdi_proquest_miscellaneous_2718378145 |
source | Elsevier ScienceDirect Journals Complete |
subjects | Benthic macroinvertebrates ; Chesapeake Bay ; coasts ; environmental management ; forests ; Individual conditional expectation plots ; Interpretable machine learning ; macroinvertebrates ; North America ; Partial dependence and accumulated local effect plots ; Random forests ; Shapley values ; streams ; watersheds |
title | Explainable machine learning improves interpretability in the predictive modeling of biological stream conditions in the Chesapeake Bay Watershed, USA |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-12T02%3A27%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Explainable%20machine%20learning%20improves%20interpretability%20in%20the%20predictive%20modeling%20of%20biological%20stream%20conditions%20in%20the%20Chesapeake%20Bay%20Watershed,%20USA&rft.jtitle=Journal%20of%20environmental%20management&rft.au=Maloney,%20Kelly%20O.&rft.date=2022-11-15&rft.volume=322&rft.spage=116068&rft.epage=116068&rft.pages=116068-116068&rft.artnum=116068&rft.issn=0301-4797&rft.eissn=1095-8630&rft_id=info:doi/10.1016/j.jenvman.2022.116068&rft_dat=%3Cproquest_cross%3E2709915128%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2709915128&rft_id=info:pmid/&rft_els_id=S0301479722016413&rfr_iscdi=true |