Coupling mobile phone data with machine learning: How misclassification errors in ambient PM2.5 exposure estimates are produced?

Most studies relying on time-activity diaries or traditional air pollution modelling approaches cannot show how ignoring individual mobility and air pollution variations affects misclassification errors in exposure estimates. Moreover, very few studies have examined whether such im...

Bibliographic details
Published in: The Science of the total environment 2020-11, Vol.745, p.141034-141034, Article 141034
Main authors: Guo, Huagui, Zhan, Qingming, Ho, Hung Chak, Yao, Fei, Zhou, Xingang, Wu, Jiansheng, Li, Weifeng
Format: Article
Language: eng
Online access: Full text
container_end_page 141034
container_issue
container_start_page 141034
container_title The Science of the total environment
container_volume 745
creator Guo, Huagui
Zhan, Qingming
Ho, Hung Chak
Yao, Fei
Zhou, Xingang
Wu, Jiansheng
Li, Weifeng
description Most studies relying on time-activity diaries or traditional air pollution modelling approaches cannot show how ignoring individual mobility and air pollution variations affects misclassification errors in exposure estimates. Moreover, very few studies have examined whether such impacts differ across socioeconomic groups. We aim to examine how ignoring individual mobility and PM2.5 variations produces misclassification errors in ambient PM2.5 exposure estimates. We developed a geo-informed backward propagation neural network model to estimate hourly PM2.5 concentrations using remote sensing and geospatial big data. Combining the estimated PM2.5 concentrations and individual trajectories derived from 755,468 mobile phone users on a weekday in Shenzhen, China, we estimated four types of individual total PM2.5 exposures during weekdays at multi-temporal scales. The estimates ignoring individual mobility, PM2.5 variations or both were each compared with the hypothetical error-free estimate using a paired-sample t-test. We then quantified the exposure misclassification error using Pearson correlation analysis. Moreover, we examined whether the misclassification error differs across socioeconomic groups. Taking the findings on ignoring individual mobility as an example, we further investigated whether they are robust to different selections of time. We found that the estimates ignoring PM2.5 variations, individual mobility or both were statistically different from the hypothetical error-free estimate. Ignoring both factors produced the largest exposure misclassification error. The misclassification error was larger in the estimate ignoring PM2.5 variations than in the estimate ignoring individual mobility. People with high economic status suffered from a larger exposure misclassification error. The findings were robust to different selections of time. Ignoring individual mobility, PM2.5 variations or both leads to misclassification errors in ambient PM2.5 exposure estimates. A larger misclassification error occurs in the estimate neglecting PM2.5 variations than in the estimate ignoring individual mobility, a finding that has seldom been reported before.
•First study coupling mobile phone location data and a machine learning approach to examine exposure misclassification errors.
•Ignoring individual mobility and PM2.5 variations leads to misclassification errors.
•A larger misclassification error occurs in the estimate neglecting PM2.5 variations than in the estimate ignoring individual mobility.
•The high economic-status group suffers from a larger misclassification error.
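The comparison described in the abstract (four exposure estimates per person, each simplified estimate tested against the reference with a paired t-test and a Pearson correlation) can be illustrated with a small sketch. This is not the authors' implementation: the synthetic concentration grid, the home/work trajectories, the 07:00-22:00 window, and every function name below are assumptions made only for illustration, and "ignoring PM2.5 variations" is taken here to mean replacing hourly values with each cell's daily mean.

```python
import math
import random
import statistics

HOURS = 24
WINDOW = range(7, 22)  # hypothetical exposure window (hours of the day)
N_CELLS = 5            # hypothetical number of grid cells
N_PEOPLE = 200         # hypothetical sample size


def simulate(seed=0):
    """Build synthetic hourly PM2.5 per cell and home/work trajectories."""
    rng = random.Random(seed)
    conc = [[30 + 10 * c + 8 * math.sin(2 * math.pi * h / HOURS) + rng.gauss(0, 2)
             for h in range(HOURS)] for c in range(N_CELLS)]
    people = []
    for _ in range(N_PEOPLE):
        home, work = rng.randrange(N_CELLS), rng.randrange(N_CELLS)
        # At the work cell from 09:00 to 18:00, at the home cell otherwise.
        people.append((home, [work if 9 <= h < 18 else home for h in range(HOURS)]))
    return conc, people


def exposures(conc, home, traj):
    """Return (reference, no_mobility, no_variation, neither) totals over WINDOW."""
    daily_mean = [statistics.fmean(row) for row in conc]
    reference = sum(conc[traj[h]][h] for h in WINDOW)        # mobility + hourly values
    no_mobility = sum(conc[home][h] for h in WINDOW)         # person fixed at home cell
    no_variation = sum(daily_mean[traj[h]] for h in WINDOW)  # hourly values -> daily mean
    neither = len(WINDOW) * daily_mean[home]                 # both simplifications
    return reference, no_mobility, no_variation, neither


def paired_t(x, y):
    """Paired-sample t statistic for two equally long sequences."""
    d = [a - b for a, b in zip(x, y)]
    return statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def compare(seed=0):
    """Map each simplified estimate to (t statistic, Pearson r) against the reference."""
    conc, people = simulate(seed)
    cols = list(zip(*(exposures(conc, home, traj) for home, traj in people)))
    names = ["no_mobility", "no_variation", "neither"]
    return {n: (paired_t(col, cols[0]), pearson(col, cols[0]))
            for n, col in zip(names, cols[1:])}


if __name__ == "__main__":
    for name, (t, r) in compare().items():
        print(f"{name}: t={t:.2f}, r={r:.3f}")
```

In this sketch a lower correlation with the reference estimate is read as a larger misclassification error; the numbers it prints come from synthetic data and say nothing about the study's actual results.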
doi_str_mv 10.1016/j.scitotenv.2020.141034
format Article
fulltext fulltext
identifier ISSN: 0048-9697
ispartof The Science of the total environment, 2020-11, Vol.745, p.141034-141034, Article 141034
issn 0048-9697
1879-1026
language eng
recordid cdi_proquest_miscellaneous_2431814171
source Elsevier ScienceDirect Journals
subjects Machine learning
Misclassification errors
Mobile phone location data
PM2.5 exposure estimate
title Coupling mobile phone data with machine learning: How misclassification errors in ambient PM2.5 exposure estimates are produced?
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T07%3A02%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Coupling%20mobile%20phone%20data%20with%20machine%20learning:%20How%20misclassification%20errors%20in%20ambient%20PM2.5%20exposure%20estimates%20are%20produced?&rft.jtitle=The%20Science%20of%20the%20total%20environment&rft.au=Guo,%20Huagui&rft.date=2020-11-25&rft.volume=745&rft.spage=141034&rft.epage=141034&rft.pages=141034-141034&rft.artnum=141034&rft.issn=0048-9697&rft.eissn=1879-1026&rft_id=info:doi/10.1016/j.scitotenv.2020.141034&rft_dat=%3Cproquest_cross%3E2431814171%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2431814171&rft_id=info:pmid/&rft_els_id=S0048969720345630&rfr_iscdi=true