Hillview: A trillion-cell spreadsheet for big data

Hillview is a distributed spreadsheet for browsing very large datasets that cannot be handled by a single machine. As a spreadsheet, Hillview provides a high degree of interactivity that permits data analysts to explore information quickly along many dimensions while switching visualizations on a wh...

Full description

Bibliographic details
Published in:	arXiv.org 2019-07
Main authors:	Budiu, Mihai ; Gopalan, Parikshit ; Suresh, Lalith ; Wieder, Udi ; Han Kruiger ; Aguilera, Marcos K
Format:	Article
Language:	eng
Subjects:	Browsing ; Computer graphics ; Computer Science - Distributed, Parallel, and Cluster Computing ; Datasets ; Parallel processing ; Sketches ; Spreadsheets
Online access:	Full text

container_end_page
container_issue
container_start_page
container_title	arXiv.org
container_volume
creator	Budiu, Mihai ; Gopalan, Parikshit ; Suresh, Lalith ; Wieder, Udi ; Han Kruiger ; Aguilera, Marcos K
description	Hillview is a distributed spreadsheet for browsing very large datasets that cannot be handled by a single machine. As a spreadsheet, Hillview provides a high degree of interactivity that permits data analysts to explore information quickly along many dimensions while switching visualizations on a whim. To provide the required responsiveness, Hillview introduces visualization sketches, or vizketches, as a simple idea to produce compact data visualizations. Vizketches combine algorithmic techniques for data summarization with computer graphics principles for efficient rendering. While simple, vizketches are effective at scaling the spreadsheet by parallelizing computation, reducing communication, providing progressive visualizations, and offering precise accuracy guarantees. Using Hillview running on eight servers, we can navigate and visualize datasets of tens of billions of rows and trillions of cells, much beyond the published capabilities of competing systems.
doi_str_mv	10.48550/arxiv.1907.04827
format	Article
fullrecord	<record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_1907_04827</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2255977823</sourcerecordid><originalsourceid>FETCH-LOGICAL-a523-7af4de19f44313dc05a0a600293f66030a37b2a2ddc6250f55f8340c307b546e3</originalsourceid><addsrcrecordid>eNotz1FLwzAQB_AgCI65D-CTAZ9br5ekaX0bQ50w8GXv5dokmlHXmnRTv71x8-k4-PO_-zF2U0AuK6XgnsK3P-ZFDToHWaG-YDMUosgqiXjFFjHuAABLjUqJGcO17_ujt18PfMmnkBY_7LPO9j2PY7Bk4ru1E3dD4K1_44YmumaXjvpoF_9zzrZPj9vVOtu8Pr-slpuMFIpMk5PGFrWTUhTCdKAIqEyXa-HKEgSQ0C0SGtOVqMAp5SohoROgWyVLK-bs9lx7AjVj8B8Ufpo_WHOCpcTdOTGG4fNg49TshkPYp58aTLpa6yrJfwH0g05g</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2255977823</pqid></control><display><type>article</type><title>Hillview: A trillion-cell spreadsheet for big data</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Budiu, Mihai ; Gopalan, Parikshit ; Suresh, Lalith ; Wieder, Udi ; Han Kruiger ; Aguilera, Marcos K</creator><creatorcontrib>Budiu, Mihai ; Gopalan, Parikshit ; Suresh, Lalith ; Wieder, Udi ; Han Kruiger ; Aguilera, Marcos K</creatorcontrib><description>Hillview is a distributed spreadsheet for browsing very large datasets that cannot be handled by a single machine. As a spreadsheet, Hillview provides a high degree of interactivity that permits data analysts to explore information quickly along many dimensions while switching visualizations on a whim. To provide the required responsiveness, Hillview introduces visualization sketches, or vizketches, as a simple idea to produce compact data visualizations. Vizketches combine algorithmic techniques for data summarization with computer graphics principles for efficient rendering. 
While simple, vizketches are effective at scaling the spreadsheet by parallelizing computation, reducing communication, providing progressive visualizations, and offering precise accuracy guarantees. Using Hillview running on eight servers, we can navigate and visualize datasets of tens of billions of rows and trillions of cells, much beyond the published capabilities of competing systems.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.1907.04827</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Browsing ; Computer graphics ; Computer Science - Distributed, Parallel, and Cluster Computing ; Datasets ; Parallel processing ; Sketches ; Spreadsheets</subject><ispartof>arXiv.org, 2019-07</ispartof><rights>2019. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,784,885,27925</link.rule.ids><backlink>$$Uhttps://doi.org/10.48550/arXiv.1907.04827$$DView paper in arXiv$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.14778/3342263.3342279$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink></links><search><creatorcontrib>Budiu, Mihai</creatorcontrib><creatorcontrib>Gopalan, Parikshit</creatorcontrib><creatorcontrib>Suresh, Lalith</creatorcontrib><creatorcontrib>Wieder, Udi</creatorcontrib><creatorcontrib>Han Kruiger</creatorcontrib><creatorcontrib>Aguilera, Marcos K</creatorcontrib><title>Hillview: A trillion-cell 
spreadsheet for big data</title><title>arXiv.org</title><description>Hillview is a distributed spreadsheet for browsing very large datasets that cannot be handled by a single machine. As a spreadsheet, Hillview provides a high degree of interactivity that permits data analysts to explore information quickly along many dimensions while switching visualizations on a whim. To provide the required responsiveness, Hillview introduces visualization sketches, or vizketches, as a simple idea to produce compact data visualizations. Vizketches combine algorithmic techniques for data summarization with computer graphics principles for efficient rendering. While simple, vizketches are effective at scaling the spreadsheet by parallelizing computation, reducing communication, providing progressive visualizations, and offering precise accuracy guarantees. Using Hillview running on eight servers, we can navigate and visualize datasets of tens of billions of rows and trillions of cells, much beyond the published capabilities of competing systems.</description><subject>Browsing</subject><subject>Computer graphics</subject><subject>Computer Science - Distributed, Parallel, and Cluster Computing</subject><subject>Datasets</subject><subject>Parallel 
processing</subject><subject>Sketches</subject><subject>Spreadsheets</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GOX</sourceid><recordid>eNotz1FLwzAQB_AgCI65D-CTAZ9br5ekaX0bQ50w8GXv5dokmlHXmnRTv71x8-k4-PO_-zF2U0AuK6XgnsK3P-ZFDToHWaG-YDMUosgqiXjFFjHuAABLjUqJGcO17_ujt18PfMmnkBY_7LPO9j2PY7Bk4ru1E3dD4K1_44YmumaXjvpoF_9zzrZPj9vVOtu8Pr-slpuMFIpMk5PGFrWTUhTCdKAIqEyXa-HKEgSQ0C0SGtOVqMAp5SohoROgWyVLK-bs9lx7AjVj8B8Ufpo_WHOCpcTdOTGG4fNg49TshkPYp58aTLpa6yrJfwH0g05g</recordid><startdate>20190710</startdate><enddate>20190710</enddate><creator>Budiu, Mihai</creator><creator>Gopalan, Parikshit</creator><creator>Suresh, Lalith</creator><creator>Wieder, Udi</creator><creator>Han Kruiger</creator><creator>Aguilera, Marcos K</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20190710</creationdate><title>Hillview: A trillion-cell spreadsheet for big data</title><author>Budiu, Mihai ; Gopalan, Parikshit ; Suresh, Lalith ; Wieder, Udi ; Han Kruiger ; Aguilera, Marcos K</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a523-7af4de19f44313dc05a0a600293f66030a37b2a2ddc6250f55f8340c307b546e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Browsing</topic><topic>Computer 
graphics</topic><topic>Computer Science - Distributed, Parallel, and Cluster Computing</topic><topic>Datasets</topic><topic>Parallel processing</topic><topic>Sketches</topic><topic>Spreadsheets</topic><toplevel>online_resources</toplevel><creatorcontrib>Budiu, Mihai</creatorcontrib><creatorcontrib>Gopalan, Parikshit</creatorcontrib><creatorcontrib>Suresh, Lalith</creatorcontrib><creatorcontrib>Wieder, Udi</creatorcontrib><creatorcontrib>Han Kruiger</creatorcontrib><creatorcontrib>Aguilera, Marcos K</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Budiu, Mihai</au><au>Gopalan, Parikshit</au><au>Suresh, Lalith</au><au>Wieder, Udi</au><au>Han Kruiger</au><au>Aguilera, Marcos K</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Hillview: A trillion-cell 
spreadsheet for big data</atitle><jtitle>arXiv.org</jtitle><date>2019-07-10</date><risdate>2019</risdate><eissn>2331-8422</eissn><abstract>Hillview is a distributed spreadsheet for browsing very large datasets that cannot be handled by a single machine. As a spreadsheet, Hillview provides a high degree of interactivity that permits data analysts to explore information quickly along many dimensions while switching visualizations on a whim. To provide the required responsiveness, Hillview introduces visualization sketches, or vizketches, as a simple idea to produce compact data visualizations. Vizketches combine algorithmic techniques for data summarization with computer graphics principles for efficient rendering. While simple, vizketches are effective at scaling the spreadsheet by parallelizing computation, reducing communication, providing progressive visualizations, and offering precise accuracy guarantees. Using Hillview running on eight servers, we can navigate and visualize datasets of tens of billions of rows and trillions of cells, much beyond the published capabilities of competing systems.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.1907.04827</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	EISSN: 2331-8422
ispartof	arXiv.org, 2019-07
issn	2331-8422
language	eng
recordid	cdi_arxiv_primary_1907_04827
source	arXiv.org; Free E- Journals
subjects	Browsing ; Computer graphics ; Computer Science - Distributed, Parallel, and Cluster Computing ; Datasets ; Parallel processing ; Sketches ; Spreadsheets
title	Hillview: A trillion-cell spreadsheet for big data
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T03%3A43%3A50IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Hillview:%20A%20trillion-cell%20spreadsheet%20for%20big%20data&rft.jtitle=arXiv.org&rft.au=Budiu,%20Mihai&rft.date=2019-07-10&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.1907.04827&rft_dat=%3Cproquest_arxiv%3E2255977823%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2255977823&rft_id=info:pmid/&rfr_iscdi=true