Faster Population Counts Using AVX2 Instructions
Abstract: Counting the number of ones in a binary stream is a common operation in database, information-retrieval, cryptographic and machine-learning applications. Most processors have dedicated instructions to count the number of ones in a word (e.g. popcnt on x64 processors). Maybe surprisingly, we...
Published in: | Computer journal 2018-01, Vol.61 (1), p.111-120 |
---|---|
Main authors: | Muła, Wojciech; Kurz, Nathan; Lemire, Daniel |
Format: | Article |
Language: | eng |
Online access: | Full text |
container_end_page | 120 |
---|---|
container_issue | 1 |
container_start_page | 111 |
container_title | Computer journal |
container_volume | 61 |
creator | Muła, Wojciech; Kurz, Nathan; Lemire, Daniel |
description | Abstract: Counting the number of ones in a binary stream is a common operation in database, information-retrieval, cryptographic and machine-learning applications. Most processors have dedicated instructions to count the number of ones in a word (e.g. popcnt on x64 processors). Maybe surprisingly, we show that a vectorized approach using SIMD instructions can be twice as fast as using the dedicated instructions on recent Intel processors. The benefits can be even greater for applications such as similarity measures (e.g. the Jaccard index) that require additional Boolean operations. Our approach has been adopted by LLVM: it is used by its popular C compiler (Clang). |
doi_str_mv | 10.1093/comjnl/bxx046 |
format | Article |
publisher | Oxford University Press |
rights | The British Computer Society 2017. All rights reserved. For Permissions, please email: journals.permissions@oup.com |
toplevel | peer_reviewed, online_resources |
tpages | 10 |
fulltext | fulltext |
identifier | ISSN: 0010-4620 |
ispartof | Computer journal, 2018-01, Vol.61 (1), p.111-120 |
issn | ISSN: 0010-4620; EISSN: 1460-2067 |
language | eng |
recordid | cdi_crossref_primary_10_1093_comjnl_bxx046 |
source | Oxford University Press Journals All Titles (1996-Current) |
title | Faster Population Counts Using AVX2 Instructions |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-04T06%3A17%3A34IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-oup_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Faster%20Population%20Counts%20Using%20AVX2%20Instructions&rft.jtitle=Computer%20journal&rft.au=Mu%C5%82a,%20Wojciech&rft.date=2018-01-01&rft.volume=61&rft.issue=1&rft.spage=111&rft.epage=120&rft.pages=111-120&rft.issn=0010-4620&rft.eissn=1460-2067&rft_id=info:doi/10.1093/comjnl/bxx046&rft_dat=%3Coup_cross%3E10.1093/comjnl/bxx046%3C/oup_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_oup_id=10.1093/comjnl/bxx046&rfr_iscdi=true |
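
The abstract above describes two operations: counting the ones in a stream of machine words, and similarity measures such as the Jaccard index that combine Boolean operations with that count. The following is a minimal sketch of the scalar baseline only, not the authors' AVX2 method. The function names (`popcount_word`, `popcount_stream`, `jaccard_index`) and the choice to model the stream as a sequence of 64-bit integers are assumptions made for illustration; `bin(...).count("1")` stands in for a dedicated instruction such as popcnt.

```python
from typing import Iterable, Sequence

# Assumption: every input is treated as an unsigned 64-bit word.
WORD_MASK = (1 << 64) - 1


def popcount_word(word: int) -> int:
    """Count the set bits of one 64-bit word (portable stand-in for popcnt)."""
    return bin(word & WORD_MASK).count("1")


def popcount_stream(words: Iterable[int]) -> int:
    """Sum the per-word bit counts over a whole stream of words."""
    return sum(popcount_word(w) for w in words)


def jaccard_index(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard index of two equal-length bitsets: |A AND B| / |A OR B|."""
    if len(a) != len(b):
        raise ValueError("bitsets must have the same number of words")
    intersection = popcount_stream(x & y for x, y in zip(a, b))
    union = popcount_stream(x | y for x, y in zip(a, b))
    # Convention chosen here: two empty sets count as identical.
    return 1.0 if union == 0 else intersection / union
```

The Jaccard case shows why the abstract singles out similarity measures: each pair of words needs an AND, an OR and two counts, so the Boolean work and the counting can be fused in a vectorized implementation rather than performed word by word as in this sketch.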