Compacting points-to sets through object clustering

Inclusion-based set constraint solving is the most popular technique for whole-program points-to analysis whereby an analysis is typically formulated as repeatedly resolving constraints between points-to sets of program variables. The set union operation is central to this process. The number of poi...

Detailed description

Bibliographic details
Published in:	Proceedings of ACM on programming languages 2021-10, Vol.5 (OOPSLA), p.1-27
Main authors:	Barbar, Mohamad; Sui, Yulei
Format:	Article
Language:	eng
Online access:	Full text

container_end_page	27
container_issue	OOPSLA
container_start_page	1
container_title	Proceedings of ACM on programming languages
container_volume	5
creator	Barbar, Mohamad; Sui, Yulei
description	Inclusion-based set constraint solving is the most popular technique for whole-program points-to analysis whereby an analysis is typically formulated as repeatedly resolving constraints between points-to sets of program variables. The set union operation is central to this process. The number of points-to sets can grow as analyses become more precise and input programs become larger, resulting in more time spent performing unions and more space used storing these points-to sets. Most existing approaches focus on improving scalability of precise points-to analyses from an algorithmic perspective and there has been less research into improving the data structures behind the analyses. Bit-vectors, one of the more popular data structures, have been used in several mainstream analysis frameworks to represent points-to sets. To store memory objects in bit-vectors, objects need to be mapped to integral identifiers. We observe that this object-to-identifier mapping is critical for a compact points-to set representation and the set union operation. If objects in the same points-to sets (co-pointees) are not given numerically close identifiers, points-to resolution can cost significantly more space and time. Without data on the unpredictable points-to relations which would be discovered by the analysis, producing an ideal mapping is extremely challenging. In this paper, we present a new approach to inclusion-based analysis by compacting points-to sets through object clustering. Inspired by recent staged analysis where an auxiliary analysis produces results approximating a more precise main analysis, we formulate points-to set compaction as an optimisation problem solved by integer programming using constraints generated from the auxiliary analysis’s results in order to produce an effective mapping. We then develop a more approximate mapping, produced much more efficiently, using hierarchical clustering to compact bit-vectors. We also develop an improved representation of bit-vectors (called core bit-vectors) to fully take advantage of the newly produced mapping. Our approach requires no algorithmic change to the points-to analysis. We evaluate our object clustering on flow-sensitive points-to analysis using 8 open-source programs (>3.1 million lines of LLVM instructions) and our results show that our approach can successfully improve the analysis with up to a 1.83× speedup and up to a 4.05× reduction in memory usage.
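
The abstract's point about co-pointees and numerically close identifiers can be illustrated with a small sketch. This is not the paper's integer-programming or hierarchical-clustering method; it only counts how many machine words a word-based sparse bit-vector would need under two different object-to-identifier mappings. The 64-bit word size, the object names, the points-to sets, and the helper functions `words_used` and `total_words` are all assumptions made for the illustration.

```python
WORD_BITS = 64  # assumed word size of a sparse bit-vector


def words_used(points_to_set, mapping):
    """Count the distinct words touched when one points-to set is stored."""
    return len({mapping[obj] // WORD_BITS for obj in points_to_set})


def total_words(points_to_sets, mapping):
    """Sum the words needed over all points-to sets."""
    return sum(words_used(s, mapping) for s in points_to_sets)


# Hypothetical input: 256 objects and four disjoint points-to sets.
# Set k holds every object whose index is congruent to k modulo 4.
points_to_sets = [{f"o{i}" for i in range(k, 256, 4)} for k in range(4)]

# Mapping 1: identifiers follow creation order, so co-pointees are scattered.
scattered = {f"o{i}": i for i in range(256)}

# Mapping 2: co-pointees receive consecutive identifiers.
clustered = {}
for pts in points_to_sets:
    for obj in sorted(pts, key=lambda name: int(name[1:])):
        clustered[obj] = len(clustered)

print(total_words(points_to_sets, scattered))  # 16: each set spans 4 words
print(total_words(points_to_sets, clustered))  # 4: each set fits in 1 word
```

In this toy input the sets are disjoint, so a perfect grouping exists. Real points-to sets overlap, which is presumably why the abstract frames the choice of mapping as an optimisation problem rather than a simple sort.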
doi_str_mv	10.1145/3485547
format	Article
date	2021-10-01
eissn	2475-1421
oa	free_for_read
orcid	https://orcid.org/0000-0002-9510-6574
toplevel	peer_reviewed; online_resources
fulltext	fulltext
identifier	ISSN: 2475-1421
ispartof	Proceedings of ACM on programming languages, 2021-10, Vol.5 (OOPSLA), p.1-27
issn	2475-1421 2475-1421
language	eng
recordid	cdi_crossref_primary_10_1145_3485547
source	ACM Digital Library Complete; EZB-FREE-00999 freely available EZB journals
title	Compacting points-to sets through object clustering
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T13%3A23%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Compacting%20points-to%20sets%20through%20object%20clustering&rft.jtitle=Proceedings%20of%20ACM%20on%20programming%20languages&rft.au=Barbar,%20Mohamad&rft.date=2021-10-01&rft.volume=5&rft.issue=OOPSLA&rft.spage=1&rft.epage=27&rft.pages=1-27&rft.issn=2475-1421&rft.eissn=2475-1421&rft_id=info:doi/10.1145/3485547&rft_dat=%3Ccrossref%3E10_1145_3485547%3C/crossref%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true