Seoul bike trip duration prediction using data mining techniques

Trip duration is the most fundamental measure in all modes of transportation. Hence, it is crucial to predict the trip-time precisely for the advancement of Intelligent Transport Systems and traveller information systems. To predict the trip duration, data mining techniques are employed in this stud...

Bibliographic details
Published in: IET intelligent transport systems 2020-11, Vol.14 (11), p.1465-1474
Main authors: V E, Sathishkumar; Park, Jangwoo; Cho, Yongyun
Format: Article
Language: English
container_end_page 1474
container_issue 11
container_start_page 1465
container_title IET intelligent transport systems
container_volume 14
creator V E, Sathishkumar
Park, Jangwoo
Cho, Yongyun
description Trip duration is the most fundamental measure in all modes of transportation. Hence, it is crucial to predict trip time precisely for the advancement of Intelligent Transport Systems and traveller information systems. In this study, data mining techniques are employed to predict the trip duration of rental bikes in the Seoul Bike sharing system. The prediction is carried out using a combination of Seoul Bike data and weather data. The data used include trip duration, trip distance, pickup and dropoff latitude and longitude, temperature, precipitation, wind speed, humidity, solar radiation, snowfall, ground temperature and 1-hour average dust concentration. Feature engineering is done to extract additional features from the data. Four statistical models are used to predict the trip duration: (a) linear regression, (b) gradient boosting machines, (c) k nearest neighbour and (d) Random Forest (RF). Four performance metrics (root mean squared error, coefficient of variance, mean absolute error and median absolute error) are used to determine the efficiency of the models. Compared with the other models, the best model, RF, explains 93% of the variance (R2) in the testing set and 98% in the training set. The outcome shows that RF is effective for predicting trip duration.
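The description above names four error metrics plus R2 but does not define them. The following is a minimal sketch of how such metrics are commonly computed, not the authors' implementation. The function name `regression_metrics` and the sample values are hypothetical, and the "coefficient of variance" is assumed here to mean RMSE divided by the mean of the observed values, expressed as a percentage; the record itself gives no formula.

```python
import math
import statistics


def regression_metrics(actual, predicted):
    """Return RMSE, MAE, MedAE, CV (%) and R2 for two equal-length sequences.

    Assumes the observed values have a non-zero mean and are not all equal,
    otherwise CV or R2 would divide by zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    abs_errors = [abs(e) for e in errors]
    mean_actual = statistics.fmean(actual)
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    rmse = math.sqrt(ss_res / len(errors))
    return {
        "rmse": rmse,
        "mae": statistics.fmean(abs_errors),
        "medae": statistics.median(abs_errors),
        # Assumption: CV is RMSE relative to the mean observed value, in percent.
        "cv_percent": 100.0 * rmse / mean_actual,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Made-up trip durations in minutes, purely for illustration (not from the study).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
metrics = regression_metrics(actual, predicted)
```

Under this reading, an R2 of 0.93 on the testing set means the residual sum of squares is 7% of the total sum of squares of the observed durations.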
doi_str_mv 10.1049/iet-its.2019.0796
format Article
fullrecord <record><control><sourceid>wiley_24P</sourceid><recordid>TN_cdi_webofscience_primary_000591879700016</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_1612ac844d5d406cbbc09c731b855b56</doaj_id><sourcerecordid>ITR2BF00955</sourcerecordid><originalsourceid>FETCH-LOGICAL-c4005-4b198088f81c9e44fe1e1fc4355aa7850c64ec4733fb9b0d009f1f68f789e403</originalsourceid><addsrcrecordid>eNqNkc1uEzEUhUcIJErhAdjNHk3wnfEvKyBqIVIlJMiCnWV7rotDOg62R1XfHk9SZVlY-cg63_G9x03zFsgKCFXvA5YulLzqCagVEYo_ay5AMOgUE_L5WfOfL5tXOe8IYbzv4aL5-APjvG9t-I1tSeHQjnMyJcSpPSQcgzvKOYfpth1NMe1dmBZd0P2awp8Z8-vmhTf7jG8ez8tme321XX_tbr592aw_3XSO1sc6akFJIqWX4BRS6hEQvKMDY8YIyYjjFB0Vw-CtsmQkRHnwXHohq50Ml83mFDtGs9OHFO5MetDRBH28iOlWm1SC26MGDr1xktKRjZRwZ60jyokBrGTMMl6z4JTlUsw5oT_nAdFLm7q2qWubemlTL21WRp6Ye7TRZxdwcnjmSF1RgRRKVAV8Hcqxw3Wcp1LRd_-PVveHR3fY48O_J9Ob7ff-83Xti7EKdyd4se3inKb6JU9s9RcKsqtS</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Seoul bike trip duration prediction using data mining techniques</title><source>Wiley-Blackwell Open Access Titles</source><creator>V E, Sathishkumar ; Park, Jangwoo ; Cho, Yongyun</creator><creatorcontrib>V E, Sathishkumar ; Park, Jangwoo ; Cho, Yongyun</creatorcontrib><description>Trip duration is the most fundamental measure in all modes of transportation. Hence, it is crucial to predict the trip-time precisely for the advancement of Intelligent Transport Systems and traveller information systems. To predict the trip duration, data mining techniques are employed in this study to predict the trip duration of rental bikes in Seoul Bike sharing system. The prediction is carried out with the combination of Seoul Bike data and weather data. 
The data used include trip duration, trip distance, pickup and dropoff latitude and longitude, temperature, precipitation, wind speed, humidity, solar radiation, snowfall, ground temperature and 1-hour average dust concentration. Feature engineering is done to extract additional features from the data. Four statistical models are used to predict the trip duration. (a) Linear regression, (b) Gradient boosting machines, (c) k nearest neighbour and (d) Random Forest (RF). Four performance metrics root mean squared error, coefficient of variance, mean absolute error and median absolute error is used to determine the efficiency of the models. In comparison with the other models, the best model RF can explain the variance of 93% in the testing set and 98% (R2) in the training set. The outcome proves that RF is effective to be employed for the prediction of trip duration.</description><identifier>ISSN: 1751-956X</identifier><identifier>EISSN: 1751-9578</identifier><identifier>DOI: 10.1049/iet-its.2019.0796</identifier><language>eng</language><publisher>HOBOKEN: The Institution of Engineering and Technology</publisher><subject>coefficient of variance ; data mining ; data mining techniques ; Engineering ; Engineering, Electrical &amp; Electronic ; feature engineering ; feature extraction ; gradient boosting machines ; intelligent transport systems ; intelligent transportation systems ; k nearest neighbour ; linear regression ; mean absolute error ; mean square error methods ; median absolute error ; nearest neighbour methods ; Random Forest ; random forests ; regression analysis ; rental bikes ; root mean squared error ; Science &amp; Technology ; Seoul bike data ; Seoul bike sharing system ; Seoul bike trip duration prediction ; Special Issue: Intelligent Transportation Systems in Smart Cities for Sustainable Environments ; statistical models ; Technology ; traffic information systems ; Transportation ; Transportation Science &amp; Technology ; traveller information 
systems ; trip distance ; trip‐time prediction</subject><ispartof>IET intelligent transport systems, 2020-11, Vol.14 (11), p.1465-1474</ispartof><rights>The Institution of Engineering and Technology</rights><rights>2020 The Institution of Engineering and Technology</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>true</woscitedreferencessubscribed><woscitedreferencescount>16</woscitedreferencescount><woscitedreferencesoriginalsourcerecordid>wos000591879700016</woscitedreferencesoriginalsourcerecordid><citedby>FETCH-LOGICAL-c4005-4b198088f81c9e44fe1e1fc4355aa7850c64ec4733fb9b0d009f1f68f789e403</citedby><cites>FETCH-LOGICAL-c4005-4b198088f81c9e44fe1e1fc4355aa7850c64ec4733fb9b0d009f1f68f789e403</cites><orcidid>0000-0002-8271-2022</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1049%2Fiet-its.2019.0796$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1049%2Fiet-its.2019.0796$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,780,784,1416,11561,27923,27924,45573,45574,46051,46475</link.rule.ids><linktorsrc>$$Uhttps://onlinelibrary.wiley.com/doi/abs/10.1049%2Fiet-its.2019.0796$$EView_record_in_Wiley-Blackwell$$FView_record_in_$$GWiley-Blackwell</linktorsrc></links><search><creatorcontrib>V E, Sathishkumar</creatorcontrib><creatorcontrib>Park, Jangwoo</creatorcontrib><creatorcontrib>Cho, Yongyun</creatorcontrib><title>Seoul bike trip duration prediction using data mining techniques</title><title>IET intelligent transport systems</title><addtitle>IET INTELL TRANSP SY</addtitle><description>Trip duration is the most fundamental measure in all modes of transportation. Hence, it is crucial to predict the trip-time precisely for the advancement of Intelligent Transport Systems and traveller information systems. 
To predict the trip duration, data mining techniques are employed in this study to predict the trip duration of rental bikes in Seoul Bike sharing system. The prediction is carried out with the combination of Seoul Bike data and weather data. The data used include trip duration, trip distance, pickup and dropoff latitude and longitude, temperature, precipitation, wind speed, humidity, solar radiation, snowfall, ground temperature and 1-hour average dust concentration. Feature engineering is done to extract additional features from the data. Four statistical models are used to predict the trip duration. (a) Linear regression, (b) Gradient boosting machines, (c) k nearest neighbour and (d) Random Forest (RF). Four performance metrics root mean squared error, coefficient of variance, mean absolute error and median absolute error is used to determine the efficiency of the models. In comparison with the other models, the best model RF can explain the variance of 93% in the testing set and 98% (R2) in the training set. 
The outcome proves that RF is effective to be employed for the prediction of trip duration.</description><subject>coefficient of variance</subject><subject>data mining</subject><subject>data mining techniques</subject><subject>Engineering</subject><subject>Engineering, Electrical &amp; Electronic</subject><subject>feature engineering</subject><subject>feature extraction</subject><subject>gradient boosting machines</subject><subject>intelligent transport systems</subject><subject>intelligent transportation systems</subject><subject>k nearest neighbour</subject><subject>linear regression</subject><subject>mean absolute error</subject><subject>mean square error methods</subject><subject>median absolute error</subject><subject>nearest neighbour methods</subject><subject>Random Forest</subject><subject>random forests</subject><subject>regression analysis</subject><subject>rental bikes</subject><subject>root mean squared error</subject><subject>Science &amp; Technology</subject><subject>Seoul bike data</subject><subject>Seoul bike sharing system</subject><subject>Seoul bike trip duration prediction</subject><subject>Special Issue: Intelligent Transportation Systems in Smart Cities for Sustainable Environments</subject><subject>statistical models</subject><subject>Technology</subject><subject>traffic information systems</subject><subject>Transportation</subject><subject>Transportation Science &amp; Technology</subject><subject>traveller information systems</subject><subject>trip distance</subject><subject>trip‐time 
prediction</subject><issn>1751-956X</issn><issn>1751-9578</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>AOWDO</sourceid><sourceid>DOA</sourceid><recordid>eNqNkc1uEzEUhUcIJErhAdjNHk3wnfEvKyBqIVIlJMiCnWV7rotDOg62R1XfHk9SZVlY-cg63_G9x03zFsgKCFXvA5YulLzqCagVEYo_ay5AMOgUE_L5WfOfL5tXOe8IYbzv4aL5-APjvG9t-I1tSeHQjnMyJcSpPSQcgzvKOYfpth1NMe1dmBZd0P2awp8Z8-vmhTf7jG8ez8tme321XX_tbr592aw_3XSO1sc6akFJIqWX4BRS6hEQvKMDY8YIyYjjFB0Vw-CtsmQkRHnwXHohq50Ml83mFDtGs9OHFO5MetDRBH28iOlWm1SC26MGDr1xktKRjZRwZ60jyokBrGTMMl6z4JTlUsw5oT_nAdFLm7q2qWubemlTL21WRp6Ye7TRZxdwcnjmSF1RgRRKVAV8Hcqxw3Wcp1LRd_-PVveHR3fY48O_J9Ob7ff-83Xti7EKdyd4se3inKb6JU9s9RcKsqtS</recordid><startdate>202011</startdate><enddate>202011</enddate><creator>V E, Sathishkumar</creator><creator>Park, Jangwoo</creator><creator>Cho, Yongyun</creator><general>The Institution of Engineering and Technology</general><general>Wiley</general><scope>AOWDO</scope><scope>BLEPL</scope><scope>DTL</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-8271-2022</orcidid></search><sort><creationdate>202011</creationdate><title>Seoul bike trip duration prediction using data mining techniques</title><author>V E, Sathishkumar ; Park, Jangwoo ; Cho, Yongyun</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c4005-4b198088f81c9e44fe1e1fc4355aa7850c64ec4733fb9b0d009f1f68f789e403</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>coefficient of variance</topic><topic>data mining</topic><topic>data mining techniques</topic><topic>Engineering</topic><topic>Engineering, Electrical &amp; Electronic</topic><topic>feature engineering</topic><topic>feature extraction</topic><topic>gradient boosting machines</topic><topic>intelligent transport systems</topic><topic>intelligent transportation systems</topic><topic>k 
nearest neighbour</topic><topic>linear regression</topic><topic>mean absolute error</topic><topic>mean square error methods</topic><topic>median absolute error</topic><topic>nearest neighbour methods</topic><topic>Random Forest</topic><topic>random forests</topic><topic>regression analysis</topic><topic>rental bikes</topic><topic>root mean squared error</topic><topic>Science &amp; Technology</topic><topic>Seoul bike data</topic><topic>Seoul bike sharing system</topic><topic>Seoul bike trip duration prediction</topic><topic>Special Issue: Intelligent Transportation Systems in Smart Cities for Sustainable Environments</topic><topic>statistical models</topic><topic>Technology</topic><topic>traffic information systems</topic><topic>Transportation</topic><topic>Transportation Science &amp; Technology</topic><topic>traveller information systems</topic><topic>trip distance</topic><topic>trip‐time prediction</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>V E, Sathishkumar</creatorcontrib><creatorcontrib>Park, Jangwoo</creatorcontrib><creatorcontrib>Cho, Yongyun</creatorcontrib><collection>Web of Science - Science Citation Index Expanded - 2020</collection><collection>Web of Science Core Collection</collection><collection>Science Citation Index Expanded</collection><collection>CrossRef</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IET intelligent transport systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>V E, Sathishkumar</au><au>Park, Jangwoo</au><au>Cho, Yongyun</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Seoul bike trip duration prediction using data mining techniques</atitle><jtitle>IET intelligent transport systems</jtitle><stitle>IET INTELL TRANSP 
SY</stitle><date>2020-11</date><risdate>2020</risdate><volume>14</volume><issue>11</issue><spage>1465</spage><epage>1474</epage><pages>1465-1474</pages><issn>1751-956X</issn><eissn>1751-9578</eissn><abstract>Trip duration is the most fundamental measure in all modes of transportation. Hence, it is crucial to predict the trip-time precisely for the advancement of Intelligent Transport Systems and traveller information systems. To predict the trip duration, data mining techniques are employed in this study to predict the trip duration of rental bikes in Seoul Bike sharing system. The prediction is carried out with the combination of Seoul Bike data and weather data. The data used include trip duration, trip distance, pickup and dropoff latitude and longitude, temperature, precipitation, wind speed, humidity, solar radiation, snowfall, ground temperature and 1-hour average dust concentration. Feature engineering is done to extract additional features from the data. Four statistical models are used to predict the trip duration. (a) Linear regression, (b) Gradient boosting machines, (c) k nearest neighbour and (d) Random Forest (RF). Four performance metrics root mean squared error, coefficient of variance, mean absolute error and median absolute error is used to determine the efficiency of the models. In comparison with the other models, the best model RF can explain the variance of 93% in the testing set and 98% (R2) in the training set. The outcome proves that RF is effective to be employed for the prediction of trip duration.</abstract><cop>HOBOKEN</cop><pub>The Institution of Engineering and Technology</pub><doi>10.1049/iet-its.2019.0796</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-8271-2022</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1751-956X
ispartof IET intelligent transport systems, 2020-11, Vol.14 (11), p.1465-1474
issn 1751-956X
1751-9578
language eng
recordid cdi_webofscience_primary_000591879700016
source Wiley-Blackwell Open Access Titles
subjects coefficient of variance
data mining
data mining techniques
Engineering
Engineering, Electrical & Electronic
feature engineering
feature extraction
gradient boosting machines
intelligent transport systems
intelligent transportation systems
k nearest neighbour
linear regression
mean absolute error
mean square error methods
median absolute error
nearest neighbour methods
Random Forest
random forests
regression analysis
rental bikes
root mean squared error
Science & Technology
Seoul bike data
Seoul bike sharing system
Seoul bike trip duration prediction
Special Issue: Intelligent Transportation Systems in Smart Cities for Sustainable Environments
statistical models
Technology
traffic information systems
Transportation
Transportation Science & Technology
traveller information systems
trip distance
trip‐time prediction
title Seoul bike trip duration prediction using data mining techniques
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T05%3A14%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-wiley_24P&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Seoul%20bike%20trip%20duration%20prediction%20using%20data%20mining%20techniques&rft.jtitle=IET%20intelligent%20transport%20systems&rft.au=V%20E,%20Sathishkumar&rft.date=2020-11&rft.volume=14&rft.issue=11&rft.spage=1465&rft.epage=1474&rft.pages=1465-1474&rft.issn=1751-956X&rft.eissn=1751-9578&rft_id=info:doi/10.1049/iet-its.2019.0796&rft_dat=%3Cwiley_24P%3EITR2BF00955%3C/wiley_24P%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_doaj_id=oai_doaj_org_article_1612ac844d5d406cbbc09c731b855b56&rfr_iscdi=true