An effective graph summarization and compression technique for a large-scaled graph

Graphs are widely used in various applications, and their size is becoming larger over the passage of time. It is necessary to reduce their size to minimize main memory needs and to save the storage space on disk. For these purposes, graph summarization and compression approaches have been studied i...

Detailed description

Bibliographic details
Published in: The Journal of supercomputing 2020-10, Vol.76 (10), p.7906-7920
Main authors: Seo, Hojin, Park, Kisung, Han, Yongkoo, Kim, Hyunwook, Umair, Muhammad, Khan, Kifayat Ullah, Lee, Young-Koo
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 7920
container_issue 10
container_start_page 7906
container_title The Journal of supercomputing
container_volume 76
creator Seo, Hojin
Park, Kisung
Han, Yongkoo
Kim, Hyunwook
Umair, Muhammad
Khan, Kifayat Ullah
Lee, Young-Koo
description Graphs are widely used in various applications, and their size keeps growing over time. Reducing their size is necessary to minimize main memory needs and to save storage space on disk. For these purposes, various existing studies have investigated graph summarization and compression approaches to reduce the size of a large graph. Graph summarization aggregates nodes having similar structural properties to represent a graph with reduced main memory requirements, whereas graph compression applies various encoding techniques so that the resultant graph needs less storage space on disk. Considering the usefulness of both paradigms, we propose to obtain the best of both worlds by combining summarization and compression approaches. Hence, we present a greedy algorithm that greatly reduces the size of a large graph by applying both compression and summarization. We also propose a novel cost model for calculating the compression ratio that considers both the compression and summarization strategies. The algorithm uses the proposed cost model to determine whether to perform one or both of them in every iteration. Through comprehensive experiments on real-world datasets, we show that our proposed algorithm achieves a compression ratio up to 16% better than applying summarization approaches alone.
doi_str_mv 10.1007/s11227-018-2245-5
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2442611857</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2442611857</sourcerecordid><originalsourceid>FETCH-LOGICAL-c316t-76526b713b9c86c6c403dcae168573963f657f04aade1aac69fd93e26dae46d33</originalsourceid><addsrcrecordid>eNp1kEtLAzEUhYMoWKs_wF3AdTTvzCxL8QUFF-o6pMlNO6WdGZOpoL_eDCO4cnW5cM53zz0IXTN6yyg1d5kxzg2hrCKcS0XUCZoxZQShspKnaEZrTkmlJD9HFznvKKVSGDFDr4sWQ4zgh-YT8Ca5fovz8XBwqfl2Q9O12LUB--7QJ8h53Afw27b5OAKOXcIO713aAMne7SFMgEt0Ft0-w9XvnKP3h_u35RNZvTw-Lxcr4gXTAzFacb02TKxrX2mvvaQieAdMVyV4rUXUykQqnQvAnPO6jqEWwHVwIHUQYo5uJm6fupInD3bXHVNbTlouJdeMjaA5YpPKpy7nBNH2qSn_fVlG7didnbqzpTs7dmdV8fDJk4u23UD6I_9v-gHEcHHU</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2442611857</pqid></control><display><type>article</type><title>An effective graph summarization and compression technique for a large-scaled graph</title><source>Springer Nature - Complete Springer Journals</source><creator>Seo, Hojin ; Park, Kisung ; Han, Yongkoo ; Kim, Hyunwook ; Umair, Muhammad ; Khan, Kifayat Ullah ; Lee, Young-Koo</creator><creatorcontrib>Seo, Hojin ; Park, Kisung ; Han, Yongkoo ; Kim, Hyunwook ; Umair, Muhammad ; Khan, Kifayat Ullah ; Lee, Young-Koo</creatorcontrib><description>Graphs are widely used in various applications, and their size is becoming larger over the passage of time. It is necessary to reduce their size to minimize main memory needs and to save the storage space on disk. For these purposes, graph summarization and compression approaches have been studied in various existing studies to reduce the size of a large graph. Graph summarization aggregates nodes having similar structural properties to represent a graph with reduced main memory requirements. Whereas graph compression applies various encoding techniques so that the resultant graph needs lesser storage space on disk. 
Considering usefulness of both the paradigms, we propose to obtain best of the both worlds by combining summarization and compression approaches. Hence, we present a greedy-based algorithm that greatly reduces the size of a large graph by applying both the compression and summarization. We also propose a novel cost model for calculating the compression ratio considering both the compression and summarization strategies. The algorithm uses the proposed cost model to determine whether to perform one or both of them in every iteration. Through comprehensive experiments on real-world datasets, we show that our proposed algorithm achieves a better compression ratio than only applying summarization approaches by up to 16%.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-018-2245-5</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Compilers ; Compression ratio ; Computer Science ; Greedy algorithms ; Interpreters ; Iterative methods ; Processor Architectures ; Programming Languages</subject><ispartof>The Journal of supercomputing, 2020-10, Vol.76 (10), p.7906-7920</ispartof><rights>Springer Science+Business Media, LLC, part of Springer Nature 2018</rights><rights>Springer Science+Business Media, LLC, part of Springer Nature 
2018.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c316t-76526b713b9c86c6c403dcae168573963f657f04aade1aac69fd93e26dae46d33</citedby><cites>FETCH-LOGICAL-c316t-76526b713b9c86c6c403dcae168573963f657f04aade1aac69fd93e26dae46d33</cites><orcidid>0000-0003-2314-5395</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-018-2245-5$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-018-2245-5$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27901,27902,41464,42533,51294</link.rule.ids></links><search><creatorcontrib>Seo, Hojin</creatorcontrib><creatorcontrib>Park, Kisung</creatorcontrib><creatorcontrib>Han, Yongkoo</creatorcontrib><creatorcontrib>Kim, Hyunwook</creatorcontrib><creatorcontrib>Umair, Muhammad</creatorcontrib><creatorcontrib>Khan, Kifayat Ullah</creatorcontrib><creatorcontrib>Lee, Young-Koo</creatorcontrib><title>An effective graph summarization and compression technique for a large-scaled graph</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>Graphs are widely used in various applications, and their size is becoming larger over the passage of time. It is necessary to reduce their size to minimize main memory needs and to save the storage space on disk. For these purposes, graph summarization and compression approaches have been studied in various existing studies to reduce the size of a large graph. Graph summarization aggregates nodes having similar structural properties to represent a graph with reduced main memory requirements. Whereas graph compression applies various encoding techniques so that the resultant graph needs lesser storage space on disk. 
Considering usefulness of both the paradigms, we propose to obtain best of the both worlds by combining summarization and compression approaches. Hence, we present a greedy-based algorithm that greatly reduces the size of a large graph by applying both the compression and summarization. We also propose a novel cost model for calculating the compression ratio considering both the compression and summarization strategies. The algorithm uses the proposed cost model to determine whether to perform one or both of them in every iteration. Through comprehensive experiments on real-world datasets, we show that our proposed algorithm achieves a better compression ratio than only applying summarization approaches by up to 16%.</description><subject>Algorithms</subject><subject>Compilers</subject><subject>Compression ratio</subject><subject>Computer Science</subject><subject>Greedy algorithms</subject><subject>Interpreters</subject><subject>Iterative methods</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNp1kEtLAzEUhYMoWKs_wF3AdTTvzCxL8QUFF-o6pMlNO6WdGZOpoL_eDCO4cnW5cM53zz0IXTN6yyg1d5kxzg2hrCKcS0XUCZoxZQShspKnaEZrTkmlJD9HFznvKKVSGDFDr4sWQ4zgh-YT8Ca5fovz8XBwqfl2Q9O12LUB--7QJ8h53Afw27b5OAKOXcIO713aAMne7SFMgEt0Ft0-w9XvnKP3h_u35RNZvTw-Lxcr4gXTAzFacb02TKxrX2mvvaQieAdMVyV4rUXUykQqnQvAnPO6jqEWwHVwIHUQYo5uJm6fupInD3bXHVNbTlouJdeMjaA5YpPKpy7nBNH2qSn_fVlG7didnbqzpTs7dmdV8fDJk4u23UD6I_9v-gHEcHHU</recordid><startdate>20201001</startdate><enddate>20201001</enddate><creator>Seo, Hojin</creator><creator>Park, Kisung</creator><creator>Han, Yongkoo</creator><creator>Kim, Hyunwook</creator><creator>Umair, Muhammad</creator><creator>Khan, Kifayat Ullah</creator><creator>Lee, Young-Koo</creator><general>Springer US</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0003-2314-5395</orcidid></search><sort><creationdate>20201001</creationdate><title>An effective graph summarization and compression technique for a large-scaled graph</title><author>Seo, Hojin ; Park, Kisung ; Han, Yongkoo ; Kim, Hyunwook ; Umair, Muhammad ; Khan, Kifayat Ullah ; Lee, Young-Koo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c316t-76526b713b9c86c6c403dcae168573963f657f04aade1aac69fd93e26dae46d33</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Compilers</topic><topic>Compression ratio</topic><topic>Computer Science</topic><topic>Greedy algorithms</topic><topic>Interpreters</topic><topic>Iterative methods</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Seo, Hojin</creatorcontrib><creatorcontrib>Park, Kisung</creatorcontrib><creatorcontrib>Han, Yongkoo</creatorcontrib><creatorcontrib>Kim, Hyunwook</creatorcontrib><creatorcontrib>Umair, Muhammad</creatorcontrib><creatorcontrib>Khan, Kifayat Ullah</creatorcontrib><creatorcontrib>Lee, Young-Koo</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Seo, Hojin</au><au>Park, Kisung</au><au>Han, Yongkoo</au><au>Kim, Hyunwook</au><au>Umair, Muhammad</au><au>Khan, Kifayat Ullah</au><au>Lee, Young-Koo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An effective graph summarization and compression technique for a large-scaled graph</atitle><jtitle>The Journal of supercomputing</jtitle><stitle>J 
Supercomput</stitle><date>2020-10-01</date><risdate>2020</risdate><volume>76</volume><issue>10</issue><spage>7906</spage><epage>7920</epage><pages>7906-7920</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>Graphs are widely used in various applications, and their size is becoming larger over the passage of time. It is necessary to reduce their size to minimize main memory needs and to save the storage space on disk. For these purposes, graph summarization and compression approaches have been studied in various existing studies to reduce the size of a large graph. Graph summarization aggregates nodes having similar structural properties to represent a graph with reduced main memory requirements. Whereas graph compression applies various encoding techniques so that the resultant graph needs lesser storage space on disk. Considering usefulness of both the paradigms, we propose to obtain best of the both worlds by combining summarization and compression approaches. Hence, we present a greedy-based algorithm that greatly reduces the size of a large graph by applying both the compression and summarization. We also propose a novel cost model for calculating the compression ratio considering both the compression and summarization strategies. The algorithm uses the proposed cost model to determine whether to perform one or both of them in every iteration. Through comprehensive experiments on real-world datasets, we show that our proposed algorithm achieves a better compression ratio than only applying summarization approaches by up to 16%.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-018-2245-5</doi><tpages>15</tpages><orcidid>https://orcid.org/0000-0003-2314-5395</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0920-8542
ispartof The Journal of supercomputing, 2020-10, Vol.76 (10), p.7906-7920
issn 0920-8542
1573-0484
language eng
recordid cdi_proquest_journals_2442611857
source Springer Nature - Complete Springer Journals
subjects Algorithms
Compilers
Compression ratio
Computer Science
Greedy algorithms
Interpreters
Iterative methods
Processor Architectures
Programming Languages
title An effective graph summarization and compression technique for a large-scaled graph
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T22%3A03%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20effective%20graph%20summarization%20and%20compression%20technique%20for%20a%20large-scaled%20graph&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Seo,%20Hojin&rft.date=2020-10-01&rft.volume=76&rft.issue=10&rft.spage=7906&rft.epage=7920&rft.pages=7906-7920&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-018-2245-5&rft_dat=%3Cproquest_cross%3E2442611857%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2442611857&rft_id=info:pmid/&rfr_iscdi=true