High‐performance implementation of a two‐bit geohash coding technique for nearest neighbor search

Summary Insights from geohash coding algorithms introduce significant opportunities for various spatial applications. However, these algorithms require massive storage, complex bit manipulation, and extensive code modification when scaled to higher dimensions. In this article, we have developed a tw...

Detailed description


Bibliographic details
Published in:	Concurrency and computation 2021-03, Vol.33 (5), p.n/a
Main authors:	M, Varalakshmi, Kesarkar, Amit P., Lopez, Daphne
Format:	Article
Language:	eng
Subjects:	Algorithms climate data assimilation Coding Data points many‐core processors Massive data points Microprocessors nearest neighbor parallel geohash Partitions Performance evaluation spatial applications Spatial data Statistical methods Storage two‐bit geohash coding
Online access:	Full text

container_end_page	n/a
container_issue	5
container_start_page
container_title	Concurrency and computation
container_volume	33
creator	M, Varalakshmi Kesarkar, Amit P. Lopez, Daphne
description	Summary Insights from geohash coding algorithms introduce significant opportunities for various spatial applications. However, these algorithms require massive storage, complex bit manipulation, and extensive code modification when scaled to higher dimensions. In this article, we have developed a two‐bit geohash coding algorithm that divides the search space into four equal partitions where each partition is assigned a two‐bit label as 00, 01, 10, and 11, which helps to uniquely identify a chosen data point and the two neighbors on its either side, taken along a particular dimension. This salient feature of the algorithm simplifies the generation of geohash code for the neighboring grid cells. In addition, it achieves efficient memory utilization by storing the geohash values of the training points as integers. Demonstrated by experiments for climate data assimilation, model‐to‐observation space mapping with a geohash code length of 24 bits for Lat‐Lon extent of India has shown favorable results with an accuracy of 85%. Performance and scalability evaluation of the proposed algorithm, optimized for multicore and many‐core processors has shown significant speedups outperforming a tree‐based approach. This algorithm provides a foundation for new spatial statistical methods that can be used for pattern discovery and detection in spatial big data.
doi_str_mv	10.1002/cpe.6029
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2488767616</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2488767616</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2939-2d075a2208d50be69a160b49f4544658532e99f106fee79b6b5910f31624a3963</originalsourceid><addsrcrecordid>eNp1kEFOwzAQRS0EEqUgcQRLbNikjJ3EiZeoKhSpEixgbTnpuHHVxMFOVXXHETgjJ8GliB2rGX29-TPzCblmMGEA_K7ucSKAyxMyYnnKExBpdvrXc3FOLkJYAzAGKRsRnNtV8_Xx2aM3zre6q5Hatt9gi92gB-s66gzVdNi5SFV2oCt0jQ4Nrd3Sdis6YN109n2LNM7TDrXHMMQabasohCjUzSU5M3oT8Oq3jsnbw-x1Ok8Wz49P0_tFUnOZyoQvocg151Auc6hQSM0EVJk0WZ5lIi_jDyilYSAMYiErUeWSgUmZ4JlOpUjH5Obo23sXTwqDWrut7-JKxbOyLEQh2IG6PVK1dyF4NKr3ttV-rxioQ4gqhqgOIUY0OaI7u8H9v5yavsx--G9uX3RS</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2488767616</pqid></control><display><type>article</type><title>High‐performance implementation of a two‐bit geohash coding technique for nearest neighbor search</title><source>Wiley Online Library Journals Frontfile Complete</source><creator>M, Varalakshmi ; Kesarkar, Amit P. ; Lopez, Daphne</creator><creatorcontrib>M, Varalakshmi ; Kesarkar, Amit P. ; Lopez, Daphne</creatorcontrib><description>Summary Insights from geohash coding algorithms introduce significant opportunities for various spatial applications. However, these algorithms require massive storage, complex bit manipulation, and extensive code modification when scaled to higher dimensions. In this article, we have developed a two‐bit geohash coding algorithm that divides the search space into four equal partitions where each partition is assigned a two‐bit label as 00, 01, 10, and 11, which helps to uniquely identify a chosen data point and the two neighbors on its either side, taken along a particular dimension. This salient feature of the algorithm simplifies the generation of geohash code for the neighboring grid cells. 
In addition, it achieves efficient memory utilization by storing the geohash values of the training points as integers. Demonstrated by experiments for climate data assimilation, model‐to‐observation space mapping with a geohash code length of 24 bits for Lat‐Lon extent of India has shown favorable results with an accuracy of 85%. Performance and scalability evaluation of the proposed algorithm, optimized for multicore and many‐core processors has shown significant speedups outperforming a tree‐based approach. This algorithm provides a foundation for new spatial statistical methods that can be used for pattern discovery and detection in spatial big data.</description><identifier>ISSN: 1532-0626</identifier><identifier>EISSN: 1532-0634</identifier><identifier>DOI: 10.1002/cpe.6029</identifier><language>eng</language><publisher>Hoboken: Wiley Subscription Services, Inc</publisher><subject>Algorithms ; climate data assimilation ; Coding ; Data points ; many‐core processors ; Massive data points ; Microprocessors ; nearest neighbor ; parallel geohash ; Partitions ; Performance evaluation ; spatial applications ; Spatial data ; Statistical methods ; Storage ; two‐bit geohash coding</subject><ispartof>Concurrency and computation, 2021-03, Vol.33 (5), p.n/a</ispartof><rights>2020 John Wiley & Sons Ltd</rights><rights>2021 John Wiley & Sons, 
Ltd.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c2939-2d075a2208d50be69a160b49f4544658532e99f106fee79b6b5910f31624a3963</citedby><cites>FETCH-LOGICAL-c2939-2d075a2208d50be69a160b49f4544658532e99f106fee79b6b5910f31624a3963</cites><orcidid>0000-0002-6069-0088</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fcpe.6029$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fcpe.6029$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>314,776,780,1411,27901,27902,45550,45551</link.rule.ids></links><search><creatorcontrib>M, Varalakshmi</creatorcontrib><creatorcontrib>Kesarkar, Amit P.</creatorcontrib><creatorcontrib>Lopez, Daphne</creatorcontrib><title>High‐performance implementation of a two‐bit geohash coding technique for nearest neighbor search</title><title>Concurrency and computation</title><description>Summary Insights from geohash coding algorithms introduce significant opportunities for various spatial applications. However, these algorithms require massive storage, complex bit manipulation, and extensive code modification when scaled to higher dimensions. In this article, we have developed a two‐bit geohash coding algorithm that divides the search space into four equal partitions where each partition is assigned a two‐bit label as 00, 01, 10, and 11, which helps to uniquely identify a chosen data point and the two neighbors on its either side, taken along a particular dimension. This salient feature of the algorithm simplifies the generation of geohash code for the neighboring grid cells. In addition, it achieves efficient memory utilization by storing the geohash values of the training points as integers. 
Demonstrated by experiments for climate data assimilation, model‐to‐observation space mapping with a geohash code length of 24 bits for Lat‐Lon extent of India has shown favorable results with an accuracy of 85%. Performance and scalability evaluation of the proposed algorithm, optimized for multicore and many‐core processors has shown significant speedups outperforming a tree‐based approach. This algorithm provides a foundation for new spatial statistical methods that can be used for pattern discovery and detection in spatial big data.</description><subject>Algorithms</subject><subject>climate data assimilation</subject><subject>Coding</subject><subject>Data points</subject><subject>many‐core processors</subject><subject>Massive data points</subject><subject>Microprocessors</subject><subject>nearest neighbor</subject><subject>parallel geohash</subject><subject>Partitions</subject><subject>Performance evaluation</subject><subject>spatial applications</subject><subject>Spatial data</subject><subject>Statistical methods</subject><subject>Storage</subject><subject>two‐bit geohash coding</subject><issn>1532-0626</issn><issn>1532-0634</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNp1kEFOwzAQRS0EEqUgcQRLbNikjJ3EiZeoKhSpEixgbTnpuHHVxMFOVXXHETgjJ8GliB2rGX29-TPzCblmMGEA_K7ucSKAyxMyYnnKExBpdvrXc3FOLkJYAzAGKRsRnNtV8_Xx2aM3zre6q5Hatt9gi92gB-s66gzVdNi5SFV2oCt0jQ4Nrd3Sdis6YN109n2LNM7TDrXHMMQabasohCjUzSU5M3oT8Oq3jsnbw-x1Ok8Wz49P0_tFUnOZyoQvocg151Auc6hQSM0EVJk0WZ5lIi_jDyilYSAMYiErUeWSgUmZ4JlOpUjH5Obo23sXTwqDWrut7-JKxbOyLEQh2IG6PVK1dyF4NKr3ttV-rxioQ4gqhqgOIUY0OaI7u8H9v5yavsx--G9uX3RS</recordid><startdate>20210310</startdate><enddate>20210310</enddate><creator>M, Varalakshmi</creator><creator>Kesarkar, Amit P.</creator><creator>Lopez, Daphne</creator><general>Wiley Subscription Services, 
Inc</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-6069-0088</orcidid></search><sort><creationdate>20210310</creationdate><title>High‐performance implementation of a two‐bit geohash coding technique for nearest neighbor search</title><author>M, Varalakshmi ; Kesarkar, Amit P. ; Lopez, Daphne</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2939-2d075a2208d50be69a160b49f4544658532e99f106fee79b6b5910f31624a3963</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Algorithms</topic><topic>climate data assimilation</topic><topic>Coding</topic><topic>Data points</topic><topic>many‐core processors</topic><topic>Massive data points</topic><topic>Microprocessors</topic><topic>nearest neighbor</topic><topic>parallel geohash</topic><topic>Partitions</topic><topic>Performance evaluation</topic><topic>spatial applications</topic><topic>Spatial data</topic><topic>Statistical methods</topic><topic>Storage</topic><topic>two‐bit geohash coding</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>M, Varalakshmi</creatorcontrib><creatorcontrib>Kesarkar, Amit P.</creatorcontrib><creatorcontrib>Lopez, Daphne</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Concurrency and computation</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>M, Varalakshmi</au><au>Kesarkar, Amit P.</au><au>Lopez, Daphne</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>High‐performance implementation of a two‐bit geohash coding technique for nearest neighbor search</atitle><jtitle>Concurrency and computation</jtitle><date>2021-03-10</date><risdate>2021</risdate><volume>33</volume><issue>5</issue><epage>n/a</epage><issn>1532-0626</issn><eissn>1532-0634</eissn><abstract>Summary Insights from geohash coding algorithms introduce significant opportunities for various spatial applications. However, these algorithms require massive storage, complex bit manipulation, and extensive code modification when scaled to higher dimensions. In this article, we have developed a two‐bit geohash coding algorithm that divides the search space into four equal partitions where each partition is assigned a two‐bit label as 00, 01, 10, and 11, which helps to uniquely identify a chosen data point and the two neighbors on its either side, taken along a particular dimension. This salient feature of the algorithm simplifies the generation of geohash code for the neighboring grid cells. In addition, it achieves efficient memory utilization by storing the geohash values of the training points as integers. Demonstrated by experiments for climate data assimilation, model‐to‐observation space mapping with a geohash code length of 24 bits for Lat‐Lon extent of India has shown favorable results with an accuracy of 85%. Performance and scalability evaluation of the proposed algorithm, optimized for multicore and many‐core processors has shown significant speedups outperforming a tree‐based approach. 
This algorithm provides a foundation for new spatial statistical methods that can be used for pattern discovery and detection in spatial big data.</abstract><cop>Hoboken</cop><pub>Wiley Subscription Services, Inc</pub><doi>10.1002/cpe.6029</doi><tpages>23</tpages><orcidid>https://orcid.org/0000-0002-6069-0088</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 1532-0626
ispartof	Concurrency and computation, 2021-03, Vol.33 (5), p.n/a
issn	1532-0626 1532-0634
language	eng
recordid	cdi_proquest_journals_2488767616
source	Wiley Online Library Journals Frontfile Complete
subjects	Algorithms climate data assimilation Coding Data points many‐core processors Massive data points Microprocessors nearest neighbor parallel geohash Partitions Performance evaluation spatial applications Spatial data Statistical methods Storage two‐bit geohash coding
title	High‐performance implementation of a two‐bit geohash coding technique for nearest neighbor search
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T20%3A59%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=High%E2%80%90performance%20implementation%20of%20a%20two%E2%80%90bit%20geohash%20coding%20technique%20for%20nearest%20neighbor%20search&rft.jtitle=Concurrency%20and%20computation&rft.au=M,%20Varalakshmi&rft.date=2021-03-10&rft.volume=33&rft.issue=5&rft.epage=n/a&rft.issn=1532-0626&rft.eissn=1532-0634&rft_id=info:doi/10.1002/cpe.6029&rft_dat=%3Cproquest_cross%3E2488767616%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2488767616&rft_id=info:pmid/&rfr_iscdi=true
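
The description field in this record outlines a two-bit geohash: each dimension is split into four equal partitions labelled 00, 01, 10, and 11, the codes are stored as integers, and a 24-bit code is used for a Lat-Lon extent covering India. The following is a minimal Python sketch of that idea, not the authors' implementation. The bounding box, the order in which latitude and longitude labels are interleaved, and the function names `axis_index`, `interleave`, `encode`, and `neighbors_along_lat` are all assumptions made for illustration; the record does not specify them.

```python
LEVELS = 6  # 6 levels x 2 bits x 2 dimensions = 24 bits in total

# Hypothetical bounding box roughly covering India (assumption, not from the record).
LAT_RANGE = (6.0, 38.0)
LON_RANGE = (68.0, 98.0)


def axis_index(value, lo, hi, levels=LEVELS):
    """Quarter [lo, hi) `levels` times and return the cell index as an integer.

    Each level contributes one two-bit label (00, 01, 10, or 11).
    """
    index = 0
    for _ in range(levels):
        width = (hi - lo) / 4.0
        # Clamp so that values on or outside the edges fall into a valid partition.
        label = max(0, min(int((value - lo) / width), 3))
        index = (index << 2) | label
        lo = lo + label * width
        hi = lo + width
    return index


def interleave(lat_idx, lon_idx, levels=LEVELS):
    """Combine per-axis indices into one integer code.

    Assumed layout: per level, the latitude label followed by the longitude label.
    """
    code = 0
    for level in reversed(range(levels)):
        lat_label = (lat_idx >> (2 * level)) & 0b11
        lon_label = (lon_idx >> (2 * level)) & 0b11
        code = (code << 4) | (lat_label << 2) | lon_label
    return code


def encode(lat, lon):
    """Return the 24-bit integer code of the grid cell containing (lat, lon)."""
    return interleave(axis_index(lat, *LAT_RANGE), axis_index(lon, *LON_RANGE))


def neighbors_along_lat(lat, lon):
    """Codes of the cell and its existing neighbors on either side along latitude."""
    lat_idx = axis_index(lat, *LAT_RANGE)
    lon_idx = axis_index(lon, *LON_RANGE)
    last = 4 ** LEVELS - 1
    candidates = (lat_idx - 1, lat_idx, lat_idx + 1)
    return [interleave(i, lon_idx) for i in candidates if 0 <= i <= last]
```

Because six rounds of quartering are equivalent to a uniform grid of 4^6 cells per axis, the neighbor along one dimension can be found in this sketch by adding or subtracting one from that axis index before re-interleaving. Whether the published algorithm derives neighbor codes this way is not stated in the record.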