Toward the Routine Analysis of Diverse Data Types
This article describes a variety of data analysis problems. The types of data across these problems included free text, parallel text, an image collection, remote sensing imagery, and network packets. A strategy for approaching the analysis of these diverse types of data is described. A key part of...
Published in: | Journal of computational and graphical statistics 2003-12, Vol.12 (4), p.915-926 |
---|---|
Main authors: | Whitney, Paul ; Cox, Dennis ; Daly, Don ; Sloughter, J. McLean |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 926 |
---|---|
container_issue | 4 |
container_start_page | 915 |
container_title | Journal of computational and graphical statistics |
container_volume | 12 |
creator | Whitney, Paul ; Cox, Dennis ; Daly, Don ; Sloughter, J. McLean |
description | This article describes a variety of data analysis problems. The types of data across these problems included free text, parallel text, an image collection, remote sensing imagery, and network packets. A strategy for approaching the analysis of these diverse types of data is described. A key part of the challenge is mapping the analytic results back into the original domain and data setting. Additionally, a common computational bottleneck encountered in each of these problems is diagnosed as analysis tools and algorithms with unbounded memory characteristics. This experience and the analysis suggest a research and development path that could greatly extend the scale of problems that can be addressed with routine data analysis tools. In particular, there are opportunities associated with developing theory and functioning algorithms with favorable memory-usage characteristics, and there are opportunities associated with developing methods and theory for describing the outcomes of analyses for the various types of data. |
doi_str_mv | 10.1198/1061860032535_a |
format | Article |
creatorcontrib | Whitney, Paul ; Cox, Dennis ; Daly, Don ; Sloughter, J. McLean ; Pacific Northwest National Lab. (PNNL), Richland, WA (United States) |
date | 2003-12-01 |
eissn | 1537-2715 |
publisher | United States: Taylor & Francis |
rights | Copyright 2003 American Statistical Association, Institute of Mathematical Statistics, and Interface Foundation of North America |
sourcetype | Open Access Repository |
collection | CrossRef ; OSTI.GOV |
toplevel | peer_reviewed ; online_resources |
jstor_id | 1390984 |
tpages | 12 |
linktohtml | https://www.jstor.org/stable/1390984 |
linktopdf | https://www.jstor.org/stable/pdf/1390984 |
backlink | https://www.osti.gov/biblio/15010501 (View this record in Osti.gov) |
fulltext | fulltext |
identifier | ISSN: 1061-8600 |
ispartof | Journal of computational and graphical statistics, 2003-12, Vol.12 (4), p.915-926 |
issn | 1061-8600 1537-2715 |
language | eng |
recordid | cdi_jstor_primary_1390984 |
source | Jstor Complete Legacy; JSTOR Mathematics & Statistics |
subjects | Algorithms ; Analytics ; Computer memory ; Data analysis ; Data types ; Datasets ; GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE ; Image analysis ; Mathematical vectors ; Multimedia ; Random access memory ; Scaling ; Signatures ; statistics |
title | Toward the Routine Analysis of Diverse Data Types |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T05%3A07%3A02IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Toward%20the%20Routine%20Analysis%20of%20Diverse%20Data%20Types&rft.jtitle=Journal%20of%20computational%20and%20graphical%20statistics&rft.au=Whitney,%20Paul&rft.aucorp=Pacific%20Northwest%20National%20Lab.%20(PNNL),%20Richland,%20WA%20(United%20States)&rft.date=2003-12-01&rft.volume=12&rft.issue=4&rft.spage=915&rft.epage=926&rft.pages=915-926&rft.issn=1061-8600&rft.eissn=1537-2715&rft_id=info:doi/10.1198/1061860032535_a&rft_dat=%3Cjstor_cross%3E1390984%3C/jstor_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_jstor_id=1390984&rfr_iscdi=true |