Unique in the metro system: The likelihood to re-identify a metro user with limited trajectory points

Bibliographic details
Published in: Physica A 2023-10, Vol.628, p.129176, Article 129176
Main authors: Yang, Hongtai; Ping, An; Wei, Hongmin; Zhai, Guocong
Format: Article
Language: English
Online access: Full text
Description
Abstract: Though the collection of metro smart card data could help improve the operations of the metro system, the release of such data might lead to privacy issues. Few studies have quantified the probability of re-identifying a user from smart card data using very limited trajectory points. Thus, this study investigates this topic by analyzing eight-day metro smart card data of Chengdu, China. Results reveal that, on the macro level, three random trajectory points with a temporal resolution of one minute and one hour are enough to identify over 90% and 67% of the users, respectively. Even when the resolution is reduced to one day, 20% of the users could still be identified by three points. On the individual level, three carefully selected points with a temporal resolution of one minute, one hour, and one day could lead to a re-identification risk of no less than 0.5 for 99%, 89%, and 52% of the users, respectively. The effects of the number of points, the number of users, and other temporal resolutions are also thoroughly evaluated. These findings emphasize the great privacy issues involved in the release of metro smart card data and remind metro operators to take proactive measures to enhance privacy protection.

Highlights:
• Uniqueness and re-identification risk of metro users in trip data are quantified.
• Three random trajectory points could identify over 90% of the users.
• Three points could raise the re-identification risk of 99% of the users up to 0.5.
• Effects of number of points, number of users, and temporal resolutions are evaluated.
• Results reveal the privacy issues involved in the release of metro smart card data.
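
The macro-level uniqueness described in the abstract (the share of users singled out by a few random trajectory points at a given temporal resolution) can be illustrated with a minimal sketch. This is not the authors' implementation: the data layout (each point as a `(station, minute)` pair), the function names `coarsen` and `uniqueness`, and the toy trajectories are all assumptions made for illustration.

```python
import random

def coarsen(point, resolution_minutes):
    """Round a hypothetical (station, minute) point down to the given resolution."""
    station, minute = point
    return (station, minute // resolution_minutes)

def uniqueness(trajectories, n_points, resolution_minutes, seed=0):
    """Fraction of users whose n_points randomly drawn points match only themselves."""
    rng = random.Random(seed)
    # Coarsen each trajectory once; sets make the subset check cheap.
    coarse = {
        user: {coarsen(p, resolution_minutes) for p in points}
        for user, points in trajectories.items()
    }
    # Only users with enough distinct points can be sampled.
    eligible = [u for u, pts in coarse.items() if len(pts) >= n_points]
    unique = 0
    for user in eligible:
        known = set(rng.sample(sorted(coarse[user]), n_points))
        # Count every user whose trajectory contains all the known points.
        matches = sum(1 for pts in coarse.values() if known <= pts)
        if matches == 1:
            unique += 1
    return unique / len(eligible) if eligible else 0.0

# Toy data (invented): minutes are counted from the start of the observation window.
toy = {
    "u1": [("A", 480), ("B", 1080)],
    "u2": [("A", 485), ("B", 1085)],
    "u3": [("A", 480), ("C", 1080)],
}

fine = uniqueness(toy, n_points=2, resolution_minutes=1)     # one-minute resolution
coarse_hour = uniqueness(toy, n_points=2, resolution_minutes=60)  # one-hour resolution
```

In this toy example all three users are unique at one-minute resolution, while at one-hour resolution `u1` and `u2` become indistinguishable, which mirrors the qualitative trend the abstract reports: coarser timestamps lower uniqueness but do not eliminate it.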
ISSN: 0378-4371, 1873-2119
DOI: 10.1016/j.physa.2023.129176