Toward the Routine Analysis of Diverse Data Types

This article describes a variety of data analysis problems. The types of data across these problems included free text, parallel text, an image collection, remote sensing imagery, and network packets. A strategy for approaching the analysis of these diverse types of data is described. A key part of...

Detailed description

Bibliographic details
Published in: Journal of computational and graphical statistics 2003-12, Vol.12 (4), p.915-926
Main authors: Whitney, Paul; Cox, Dennis; Daly, Don; Sloughter, J. McLean
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 926
container_issue 4
container_start_page 915
container_title Journal of computational and graphical statistics
container_volume 12
creator Whitney, Paul
Cox, Dennis
Daly, Don
Sloughter, J. McLean
description This article describes a variety of data analysis problems. The types of data across these problems included free text, parallel text, an image collection, remote sensing imagery, and network packets. A strategy for approaching the analysis of these diverse types of data is described. A key part of the challenge is mapping the analytic results back into the original domain and data setting. Additionally, a common computational bottleneck encountered in each of these problems is diagnosed as analysis tools and algorithms with unbounded memory characteristics. This experience and the analysis suggest a research and development path that could greatly extend the scale of problems that can be addressed with routine data analysis tools. In particular, there are opportunities associated with developing theory and functioning algorithms with favorable memory-usage characteristics, and there are opportunities associated with developing methods and theory for describing the outcomes of analyses for the various types of data.
doi_str_mv 10.1198/1061860032535_a
format Article
fullrecord <record><control><sourceid>jstor_cross</sourceid><recordid>TN_cdi_jstor_primary_1390984</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>1390984</jstor_id><sourcerecordid>1390984</sourcerecordid><originalsourceid>FETCH-LOGICAL-c363t-8797eff18cffa24927c22ae6d354f12226ebc9a231acf516ce2f191f09207e5e3</originalsourceid><addsrcrecordid>eNp1kM1LAzEUxIMoWKtnLx4CntfmJc3uxltp_YKCIPUcYvpCU7abkkTL_vduWUEQPDzmwW9mDkPINbA7AFVPgJVQl4wJLoXU5oSMQIqq4BXI0_7vaXHE5-QipS1jDEpVjQiswsHENc0bpG_hM_sW6aw1TZd8osHRhf_CmJAuTDZ01e0xXZIzZ5qEVz86Ju-PD6v5c7F8fXqZz5aFFaXIRV2pCp2D2jpn-FTxynJusFwLOXXAOS_xwyrDBRjrJJQWuQMFjinOKpQoxuR26A0pe52sz2g3NrQt2qxBMmD99a7J4LIxpBTR6X30OxM7DUwfd9F_dukTN0Nim3KIv3ahmKqnPb4fsG9diDtzCLFZ62y6JkQXTWt90uK_7m_2Pm_9</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Toward the Routine Analysis of Diverse Data Types</title><source>Jstor Complete Legacy</source><source>JSTOR Mathematics &amp; Statistics</source><creator>Whitney, Paul ; Cox, Dennis ; Daly, Don ; Sloughter, J. McLean</creator><creatorcontrib>Whitney, Paul ; Cox, Dennis ; Daly, Don ; Sloughter, J. McLean ; Pacific Northwest National Lab. (PNNL), Richland, WA (United States)</creatorcontrib><description>This article describes a variety of data analysis problems. The types of data across these problems included free text, parallel text, an image collection, remote sensing imagery, and network packets. A strategy for approaching the analysis of these diverse types of data is described. A key part of the challenge is mapping the analytic results back into the original domain and data setting. Additionally, a common computational bottleneck encountered in each of these problems is diagnosed as analysis tools and algorithms with unbounded memory characteristics. 
This experience and the analysis suggest a research and development path that could greatly extend the scale of problems that can be addressed with routine data analysis tools. In particular, there are opportunities associated with developing theory and functioning algorithms with favorable memory-usage characteristics, and there are opportunities associated with developing methods and theory for describing the outcomes of analyses for the various types of data.</description><identifier>ISSN: 1061-8600</identifier><identifier>EISSN: 1537-2715</identifier><identifier>DOI: 10.1198/1061860032535_a</identifier><language>eng</language><publisher>United States: Taylor &amp; Francis</publisher><subject>Algorithms ; Analytics ; Computer memory ; Data analysis ; Data types ; Datasets ; GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE ; Image analysis ; Mathematical vectors ; Multimedia ; Random access memory ; Scaling ; Signatures ; statistics</subject><ispartof>Journal of computational and graphical statistics, 2003-12, Vol.12 (4), p.915-926</ispartof><rights>American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America 2003</rights><rights>Copyright 2003 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North 
America</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c363t-8797eff18cffa24927c22ae6d354f12226ebc9a231acf516ce2f191f09207e5e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/1390984$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/1390984$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,776,780,799,828,881,27903,27904,57995,57999,58228,58232</link.rule.ids><backlink>$$Uhttps://www.osti.gov/biblio/15010501$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><creatorcontrib>Whitney, Paul</creatorcontrib><creatorcontrib>Cox, Dennis</creatorcontrib><creatorcontrib>Daly, Don</creatorcontrib><creatorcontrib>Sloughter, J. McLean</creatorcontrib><creatorcontrib>Pacific Northwest National Lab. (PNNL), Richland, WA (United States)</creatorcontrib><title>Toward the Routine Analysis of Diverse Data Types</title><title>Journal of computational and graphical statistics</title><description>This article describes a variety of data analysis problems. The types of data across these problems included free text, parallel text, an image collection, remote sensing imagery, and network packets. A strategy for approaching the analysis of these diverse types of data is described. A key part of the challenge is mapping the analytic results back into the original domain and data setting. Additionally, a common computational bottleneck encountered in each of these problems is diagnosed as analysis tools and algorithms with unbounded memory characteristics. This experience and the analysis suggest a research and development path that could greatly extend the scale of problems that can be addressed with routine data analysis tools. 
In particular, there are opportunities associated with developing theory and functioning algorithms with favorable memory-usage characteristics, and there are opportunities associated with developing methods and theory for describing the outcomes of analyses for the various types of data.</description><subject>Algorithms</subject><subject>Analytics</subject><subject>Computer memory</subject><subject>Data analysis</subject><subject>Data types</subject><subject>Datasets</subject><subject>GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE</subject><subject>Image analysis</subject><subject>Mathematical vectors</subject><subject>Multimedia</subject><subject>Random access memory</subject><subject>Scaling</subject><subject>Signatures</subject><subject>statistics</subject><issn>1061-8600</issn><issn>1537-2715</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2003</creationdate><recordtype>article</recordtype><recordid>eNp1kM1LAzEUxIMoWKtnLx4CntfmJc3uxltp_YKCIPUcYvpCU7abkkTL_vduWUEQPDzmwW9mDkPINbA7AFVPgJVQl4wJLoXU5oSMQIqq4BXI0_7vaXHE5-QipS1jDEpVjQiswsHENc0bpG_hM_sW6aw1TZd8osHRhf_CmJAuTDZ01e0xXZIzZ5qEVz86Ju-PD6v5c7F8fXqZz5aFFaXIRV2pCp2D2jpn-FTxynJusFwLOXXAOS_xwyrDBRjrJJQWuQMFjinOKpQoxuR26A0pe52sz2g3NrQt2qxBMmD99a7J4LIxpBTR6X30OxM7DUwfd9F_dukTN0Nim3KIv3ahmKqnPb4fsG9diDtzCLFZ62y6JkQXTWt90uK_7m_2Pm_9</recordid><startdate>20031201</startdate><enddate>20031201</enddate><creator>Whitney, Paul</creator><creator>Cox, Dennis</creator><creator>Daly, Don</creator><creator>Sloughter, J. McLean</creator><general>Taylor &amp; Francis</general><general>American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America</general><scope>AAYXX</scope><scope>CITATION</scope><scope>OTOTI</scope></search><sort><creationdate>20031201</creationdate><title>Toward the Routine Analysis of Diverse Data Types</title><author>Whitney, Paul ; Cox, Dennis ; Daly, Don ; Sloughter, J. 
McLean</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c363t-8797eff18cffa24927c22ae6d354f12226ebc9a231acf516ce2f191f09207e5e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Algorithms</topic><topic>Analytics</topic><topic>Computer memory</topic><topic>Data analysis</topic><topic>Data types</topic><topic>Datasets</topic><topic>GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE</topic><topic>Image analysis</topic><topic>Mathematical vectors</topic><topic>Multimedia</topic><topic>Random access memory</topic><topic>Scaling</topic><topic>Signatures</topic><topic>statistics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Whitney, Paul</creatorcontrib><creatorcontrib>Cox, Dennis</creatorcontrib><creatorcontrib>Daly, Don</creatorcontrib><creatorcontrib>Sloughter, J. McLean</creatorcontrib><creatorcontrib>Pacific Northwest National Lab. (PNNL), Richland, WA (United States)</creatorcontrib><collection>CrossRef</collection><collection>OSTI.GOV</collection><jtitle>Journal of computational and graphical statistics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Whitney, Paul</au><au>Cox, Dennis</au><au>Daly, Don</au><au>Sloughter, J. McLean</au><aucorp>Pacific Northwest National Lab. (PNNL), Richland, WA (United States)</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Toward the Routine Analysis of Diverse Data Types</atitle><jtitle>Journal of computational and graphical statistics</jtitle><date>2003-12-01</date><risdate>2003</risdate><volume>12</volume><issue>4</issue><spage>915</spage><epage>926</epage><pages>915-926</pages><issn>1061-8600</issn><eissn>1537-2715</eissn><abstract>This article describes a variety of data analysis problems. 
The types of data across these problems included free text, parallel text, an image collection, remote sensing imagery, and network packets. A strategy for approaching the analysis of these diverse types of data is described. A key part of the challenge is mapping the analytic results back into the original domain and data setting. Additionally, a common computational bottleneck encountered in each of these problems is diagnosed as analysis tools and algorithms with unbounded memory characteristics. This experience and the analysis suggest a research and development path that could greatly extend the scale of problems that can be addressed with routine data analysis tools. In particular, there are opportunities associated with developing theory and functioning algorithms with favorable memory-usage characteristics, and there are opportunities associated with developing methods and theory for describing the outcomes of analyses for the various types of data.</abstract><cop>United States</cop><pub>Taylor &amp; Francis</pub><doi>10.1198/1061860032535_a</doi><tpages>12</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1061-8600
ispartof Journal of computational and graphical statistics, 2003-12, Vol.12 (4), p.915-926
issn 1061-8600
1537-2715
language eng
recordid cdi_jstor_primary_1390984
source Jstor Complete Legacy; JSTOR Mathematics & Statistics
subjects Algorithms
Analytics
Computer memory
Data analysis
Data types
Datasets
GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE
Image analysis
Mathematical vectors
Multimedia
Random access memory
Scaling
Signatures
statistics
title Toward the Routine Analysis of Diverse Data Types
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T05%3A07%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Toward%20the%20Routine%20Analysis%20of%20Diverse%20Data%20Types&rft.jtitle=Journal%20of%20computational%20and%20graphical%20statistics&rft.au=Whitney,%20Paul&rft.aucorp=Pacific%20Northwest%20National%20Lab.%20(PNNL),%20Richland,%20WA%20(United%20States)&rft.date=2003-12-01&rft.volume=12&rft.issue=4&rft.spage=915&rft.epage=926&rft.pages=915-926&rft.issn=1061-8600&rft.eissn=1537-2715&rft_id=info:doi/10.1198/1061860032535_a&rft_dat=%3Cjstor_cross%3E1390984%3C/jstor_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_jstor_id=1390984&rfr_iscdi=true