Seoul bike trip duration prediction using data mining techniques
Trip duration is the most fundamental measure in all modes of transportation. Hence, it is crucial to predict the trip-time precisely for the advancement of Intelligent Transport Systems and traveller information systems. To predict the trip duration, data mining techniques are employed in this stud...
Published in: | IET intelligent transport systems 2020-11, Vol.14 (11), p.1465-1474 |
---|---|
Main authors: | V E, Sathishkumar ; Park, Jangwoo ; Cho, Yongyun |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 1474 |
---|---|
container_issue | 11 |
container_start_page | 1465 |
container_title | IET intelligent transport systems |
container_volume | 14 |
creator | V E, Sathishkumar ; Park, Jangwoo ; Cho, Yongyun |
description | Trip duration is the most fundamental measure in all modes of transportation, so predicting trip time precisely is crucial for the advancement of Intelligent Transport Systems and traveller information systems. This study employs data mining techniques to predict the trip duration of rental bikes in the Seoul Bike sharing system. The prediction combines Seoul Bike data with weather data. The data used include trip duration, trip distance, pickup and dropoff latitude and longitude, temperature, precipitation, wind speed, humidity, solar radiation, snowfall, ground temperature and 1-hour average dust concentration. Feature engineering is done to extract additional features from the data. Four statistical models are used to predict the trip duration: (a) linear regression, (b) gradient boosting machines, (c) k nearest neighbour and (d) Random Forest (RF). Four performance metrics (root mean squared error, coefficient of variance, mean absolute error and median absolute error) are used to determine the efficiency of the models. Compared with the other models, the best model, RF, explains 93% of the variance (R²) in the testing set and 98% in the training set. The outcome indicates that RF is effective for predicting trip duration. |
doi_str_mv | 10.1049/iet-its.2019.0796 |
format | Article |
fullrecord | <record><control><sourceid>wiley_24P</sourceid><recordid>TN_cdi_webofscience_primary_000591879700016</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_1612ac844d5d406cbbc09c731b855b56</doaj_id><sourcerecordid>ITR2BF00955</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4005-4b198088f81c9e44fe1e1fc4355aa7850c64ec4733fb9b0d009f1f68f789e403</originalsourceid><addsrcrecordid>eNqNkc1uEzEUhUcIJErhAdjNHk3wnfEvKyBqIVIlJMiCnWV7rotDOg62R1XfHk9SZVlY-cg63_G9x03zFsgKCFXvA5YulLzqCagVEYo_ay5AMOgUE_L5WfOfL5tXOe8IYbzv4aL5-APjvG9t-I1tSeHQjnMyJcSpPSQcgzvKOYfpth1NMe1dmBZd0P2awp8Z8-vmhTf7jG8ez8tme321XX_tbr592aw_3XSO1sc6akFJIqWX4BRS6hEQvKMDY8YIyYjjFB0Vw-CtsmQkRHnwXHohq50Ml83mFDtGs9OHFO5MetDRBH28iOlWm1SC26MGDr1xktKRjZRwZ60jyokBrGTMMl6z4JTlUsw5oT_nAdFLm7q2qWubemlTL21WRp6Ye7TRZxdwcnjmSF1RgRRKVAV8Hcqxw3Wcp1LRd_-PVveHR3fY48O_J9Ob7ff-83Xti7EKdyd4se3inKb6JU9s9RcKsqtS</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Seoul bike trip duration prediction using data mining techniques</title><source>Wiley-Blackwell Open Access Titles</source><creator>V E, Sathishkumar ; Park, Jangwoo ; Cho, Yongyun</creator><creatorcontrib>V E, Sathishkumar ; Park, Jangwoo ; Cho, Yongyun</creatorcontrib><description>Trip duration is the most fundamental measure in all modes of transportation. Hence, it is crucial to predict the trip-time precisely for the advancement of Intelligent Transport Systems and traveller information systems. To predict the trip duration, data mining techniques are employed in this study to predict the trip duration of rental bikes in Seoul Bike sharing system. The prediction is carried out with the combination of Seoul Bike data and weather data. 
The data used include trip duration, trip distance, pickup and dropoff latitude and longitude, temperature, precipitation, wind speed, humidity, solar radiation, snowfall, ground temperature and 1-hour average dust concentration. Feature engineering is done to extract additional features from the data. Four statistical models are used to predict the trip duration. (a) Linear regression, (b) Gradient boosting machines, (c) k nearest neighbour and (d) Random Forest (RF). Four performance metrics root mean squared error, coefficient of variance, mean absolute error and median absolute error is used to determine the efficiency of the models. In comparison with the other models, the best model RF can explain the variance of 93% in the testing set and 98% (R2) in the training set. The outcome proves that RF is effective to be employed for the prediction of trip duration.</description><identifier>ISSN: 1751-956X</identifier><identifier>EISSN: 1751-9578</identifier><identifier>DOI: 10.1049/iet-its.2019.0796</identifier><language>eng</language><publisher>HOBOKEN: The Institution of Engineering and Technology</publisher><subject>coefficient of variance ; data mining ; data mining techniques ; Engineering ; Engineering, Electrical & Electronic ; feature engineering ; feature extraction ; gradient boosting machines ; intelligent transport systems ; intelligent transportation systems ; k nearest neighbour ; linear regression ; mean absolute error ; mean square error methods ; median absolute error ; nearest neighbour methods ; Random Forest ; random forests ; regression analysis ; rental bikes ; root mean squared error ; Science & Technology ; Seoul bike data ; Seoul bike sharing system ; Seoul bike trip duration prediction ; Special Issue: Intelligent Transportation Systems in Smart Cities for Sustainable Environments ; statistical models ; Technology ; traffic information systems ; Transportation ; Transportation Science & Technology ; traveller information systems ; trip 
distance ; trip‐time prediction</subject><ispartof>IET intelligent transport systems, 2020-11, Vol.14 (11), p.1465-1474</ispartof><rights>The Institution of Engineering and Technology</rights><rights>2020 The Institution of Engineering and Technology</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>true</woscitedreferencessubscribed><woscitedreferencescount>16</woscitedreferencescount><woscitedreferencesoriginalsourcerecordid>wos000591879700016</woscitedreferencesoriginalsourcerecordid><citedby>FETCH-LOGICAL-c4005-4b198088f81c9e44fe1e1fc4355aa7850c64ec4733fb9b0d009f1f68f789e403</citedby><cites>FETCH-LOGICAL-c4005-4b198088f81c9e44fe1e1fc4355aa7850c64ec4733fb9b0d009f1f68f789e403</cites><orcidid>0000-0002-8271-2022</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1049%2Fiet-its.2019.0796$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1049%2Fiet-its.2019.0796$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,780,784,1416,11561,27923,27924,45573,45574,46051,46475</link.rule.ids><linktorsrc>$$Uhttps://onlinelibrary.wiley.com/doi/abs/10.1049%2Fiet-its.2019.0796$$EView_record_in_Wiley-Blackwell$$FView_record_in_$$GWiley-Blackwell</linktorsrc></links><search><creatorcontrib>V E, Sathishkumar</creatorcontrib><creatorcontrib>Park, Jangwoo</creatorcontrib><creatorcontrib>Cho, Yongyun</creatorcontrib><title>Seoul bike trip duration prediction using data mining techniques</title><title>IET intelligent transport systems</title><addtitle>IET INTELL TRANSP SY</addtitle><description>Trip duration is the most fundamental measure in all modes of transportation. Hence, it is crucial to predict the trip-time precisely for the advancement of Intelligent Transport Systems and traveller information systems. 
To predict the trip duration, data mining techniques are employed in this study to predict the trip duration of rental bikes in Seoul Bike sharing system. The prediction is carried out with the combination of Seoul Bike data and weather data. The data used include trip duration, trip distance, pickup and dropoff latitude and longitude, temperature, precipitation, wind speed, humidity, solar radiation, snowfall, ground temperature and 1-hour average dust concentration. Feature engineering is done to extract additional features from the data. Four statistical models are used to predict the trip duration. (a) Linear regression, (b) Gradient boosting machines, (c) k nearest neighbour and (d) Random Forest (RF). Four performance metrics root mean squared error, coefficient of variance, mean absolute error and median absolute error is used to determine the efficiency of the models. In comparison with the other models, the best model RF can explain the variance of 93% in the testing set and 98% (R2) in the training set. 
The outcome proves that RF is effective to be employed for the prediction of trip duration.</description><subject>coefficient of variance</subject><subject>data mining</subject><subject>data mining techniques</subject><subject>Engineering</subject><subject>Engineering, Electrical & Electronic</subject><subject>feature engineering</subject><subject>feature extraction</subject><subject>gradient boosting machines</subject><subject>intelligent transport systems</subject><subject>intelligent transportation systems</subject><subject>k nearest neighbour</subject><subject>linear regression</subject><subject>mean absolute error</subject><subject>mean square error methods</subject><subject>median absolute error</subject><subject>nearest neighbour methods</subject><subject>Random Forest</subject><subject>random forests</subject><subject>regression analysis</subject><subject>rental bikes</subject><subject>root mean squared error</subject><subject>Science & Technology</subject><subject>Seoul bike data</subject><subject>Seoul bike sharing system</subject><subject>Seoul bike trip duration prediction</subject><subject>Special Issue: Intelligent Transportation Systems in Smart Cities for Sustainable Environments</subject><subject>statistical models</subject><subject>Technology</subject><subject>traffic information systems</subject><subject>Transportation</subject><subject>Transportation Science & Technology</subject><subject>traveller information systems</subject><subject>trip distance</subject><subject>trip‐time 
prediction</subject><issn>1751-956X</issn><issn>1751-9578</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>AOWDO</sourceid><sourceid>DOA</sourceid><recordid>eNqNkc1uEzEUhUcIJErhAdjNHk3wnfEvKyBqIVIlJMiCnWV7rotDOg62R1XfHk9SZVlY-cg63_G9x03zFsgKCFXvA5YulLzqCagVEYo_ay5AMOgUE_L5WfOfL5tXOe8IYbzv4aL5-APjvG9t-I1tSeHQjnMyJcSpPSQcgzvKOYfpth1NMe1dmBZd0P2awp8Z8-vmhTf7jG8ez8tme321XX_tbr592aw_3XSO1sc6akFJIqWX4BRS6hEQvKMDY8YIyYjjFB0Vw-CtsmQkRHnwXHohq50Ml83mFDtGs9OHFO5MetDRBH28iOlWm1SC26MGDr1xktKRjZRwZ60jyokBrGTMMl6z4JTlUsw5oT_nAdFLm7q2qWubemlTL21WRp6Ye7TRZxdwcnjmSF1RgRRKVAV8Hcqxw3Wcp1LRd_-PVveHR3fY48O_J9Ob7ff-83Xti7EKdyd4se3inKb6JU9s9RcKsqtS</recordid><startdate>202011</startdate><enddate>202011</enddate><creator>V E, Sathishkumar</creator><creator>Park, Jangwoo</creator><creator>Cho, Yongyun</creator><general>The Institution of Engineering and Technology</general><general>Wiley</general><scope>AOWDO</scope><scope>BLEPL</scope><scope>DTL</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-8271-2022</orcidid></search><sort><creationdate>202011</creationdate><title>Seoul bike trip duration prediction using data mining techniques</title><author>V E, Sathishkumar ; Park, Jangwoo ; Cho, Yongyun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4005-4b198088f81c9e44fe1e1fc4355aa7850c64ec4733fb9b0d009f1f68f789e403</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>coefficient of variance</topic><topic>data mining</topic><topic>data mining techniques</topic><topic>Engineering</topic><topic>Engineering, Electrical & Electronic</topic><topic>feature engineering</topic><topic>feature extraction</topic><topic>gradient boosting machines</topic><topic>intelligent transport systems</topic><topic>intelligent transportation systems</topic><topic>k 
nearest neighbour</topic><topic>linear regression</topic><topic>mean absolute error</topic><topic>mean square error methods</topic><topic>median absolute error</topic><topic>nearest neighbour methods</topic><topic>Random Forest</topic><topic>random forests</topic><topic>regression analysis</topic><topic>rental bikes</topic><topic>root mean squared error</topic><topic>Science & Technology</topic><topic>Seoul bike data</topic><topic>Seoul bike sharing system</topic><topic>Seoul bike trip duration prediction</topic><topic>Special Issue: Intelligent Transportation Systems in Smart Cities for Sustainable Environments</topic><topic>statistical models</topic><topic>Technology</topic><topic>traffic information systems</topic><topic>Transportation</topic><topic>Transportation Science & Technology</topic><topic>traveller information systems</topic><topic>trip distance</topic><topic>trip‐time prediction</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>V E, Sathishkumar</creatorcontrib><creatorcontrib>Park, Jangwoo</creatorcontrib><creatorcontrib>Cho, Yongyun</creatorcontrib><collection>Web of Science - Science Citation Index Expanded - 2020</collection><collection>Web of Science Core Collection</collection><collection>Science Citation Index Expanded</collection><collection>CrossRef</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IET intelligent transport systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>V E, Sathishkumar</au><au>Park, Jangwoo</au><au>Cho, Yongyun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Seoul bike trip duration prediction using data mining techniques</atitle><jtitle>IET intelligent transport systems</jtitle><stitle>IET INTELL TRANSP 
SY</stitle><date>2020-11</date><risdate>2020</risdate><volume>14</volume><issue>11</issue><spage>1465</spage><epage>1474</epage><pages>1465-1474</pages><issn>1751-956X</issn><eissn>1751-9578</eissn><abstract>Trip duration is the most fundamental measure in all modes of transportation. Hence, it is crucial to predict the trip-time precisely for the advancement of Intelligent Transport Systems and traveller information systems. To predict the trip duration, data mining techniques are employed in this study to predict the trip duration of rental bikes in Seoul Bike sharing system. The prediction is carried out with the combination of Seoul Bike data and weather data. The data used include trip duration, trip distance, pickup and dropoff latitude and longitude, temperature, precipitation, wind speed, humidity, solar radiation, snowfall, ground temperature and 1-hour average dust concentration. Feature engineering is done to extract additional features from the data. Four statistical models are used to predict the trip duration. (a) Linear regression, (b) Gradient boosting machines, (c) k nearest neighbour and (d) Random Forest (RF). Four performance metrics root mean squared error, coefficient of variance, mean absolute error and median absolute error is used to determine the efficiency of the models. In comparison with the other models, the best model RF can explain the variance of 93% in the testing set and 98% (R2) in the training set. The outcome proves that RF is effective to be employed for the prediction of trip duration.</abstract><cop>HOBOKEN</cop><pub>The Institution of Engineering and Technology</pub><doi>10.1049/iet-its.2019.0796</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-8271-2022</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1751-956X |
ispartof | IET intelligent transport systems, 2020-11, Vol.14 (11), p.1465-1474 |
issn | 1751-956X ; 1751-9578 |
language | eng |
recordid | cdi_webofscience_primary_000591879700016 |
source | Wiley-Blackwell Open Access Titles |
subjects | coefficient of variance ; data mining ; data mining techniques ; Engineering ; Engineering, Electrical & Electronic ; feature engineering ; feature extraction ; gradient boosting machines ; intelligent transport systems ; intelligent transportation systems ; k nearest neighbour ; linear regression ; mean absolute error ; mean square error methods ; median absolute error ; nearest neighbour methods ; Random Forest ; random forests ; regression analysis ; rental bikes ; root mean squared error ; Science & Technology ; Seoul bike data ; Seoul bike sharing system ; Seoul bike trip duration prediction ; Special Issue: Intelligent Transportation Systems in Smart Cities for Sustainable Environments ; statistical models ; Technology ; traffic information systems ; Transportation ; Transportation Science & Technology ; traveller information systems ; trip distance ; trip‐time prediction |
title | Seoul bike trip duration prediction using data mining techniques |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T05%3A14%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-wiley_24P&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Seoul%20bike%20trip%20duration%20prediction%20using%20data%20mining%20techniques&rft.jtitle=IET%20intelligent%20transport%20systems&rft.au=V%20E,%20Sathishkumar&rft.date=2020-11&rft.volume=14&rft.issue=11&rft.spage=1465&rft.epage=1474&rft.pages=1465-1474&rft.issn=1751-956X&rft.eissn=1751-9578&rft_id=info:doi/10.1049/iet-its.2019.0796&rft_dat=%3Cwiley_24P%3EITR2BF00955%3C/wiley_24P%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_doaj_id=oai_doaj_org_article_1612ac844d5d406cbbc09c731b855b56&rfr_iscdi=true |
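
The description names four error metrics plus R² but the record does not define them. As a rough illustration only, the sketch below computes them in plain Python for a pair of observed and predicted duration lists. The function name `regression_metrics`, the sample numbers, and the definition of the coefficient of variance as RMSE divided by the mean observed value (in percent) are assumptions made for this example; none of them are taken from the article.

```python
import math
import statistics


def regression_metrics(y_true, y_pred):
    """Return RMSE, CV (%), MAE, MedAE and R2 for paired observations.

    Hypothetical helper for illustration. It assumes the observed values
    are not all identical and do not average to zero, otherwise the
    divisions below fail.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    mean_true = statistics.fmean(y_true)

    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    rmse = math.sqrt(ss_res / len(errors))

    return {
        "rmse": rmse,
        # Assumed definition: RMSE relative to the mean observation, in percent.
        "cv_percent": 100.0 * rmse / mean_true,
        "mae": statistics.fmean(abs_errors),
        "medae": statistics.median(abs_errors),
        "r2": 1.0 - ss_res / ss_tot,
    }


# Made-up trip durations in minutes, purely illustrative (not from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(observed, predicted)
```

For these made-up values the mean and median absolute errors are both 2.0 minutes and R² is 0.964. Reporting R² separately on the training and testing sets, as the description does, means calling such a function once per split.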