Working with and Visualizing Big Data Efficiently with Python for the DARPA XDATA Program

Research performed under the XDATA program focused on computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g. tabular, relational, categorical, metadata) and unstructured (e.g. text, documents, message traffic). Several open source projects which h...

Full description


Bibliographic details
Main authors:	Oliphant, Travis; Wang, Peter; Seibert, Stan; Rocklin, Matthew; Van de Ven, Bryan; Sparra, Hunt
Format:	Report
Language:	eng
Subjects:	algorithms; Big Data; CLUSTERING; computer program documentation; computer programming; Computer Programming and Software; DATA VISUALIZATION; high performance computing; information systems; InteractivE; machine learning; PYTHON PROGRAMMING LANGUAGE; SOFTWARE TOOLS; web applications; workload
Online access:	Order full text

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	Oliphant, Travis; Wang, Peter; Seibert, Stan; Rocklin, Matthew; Van de Ven, Bryan; Sparra, Hunt
description	Research performed under the XDATA program focused on computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g. tabular, relational, categorical, metadata) and unstructured (e.g. text, documents, message traffic). Several open source projects that have seen community and industry adoption grew out of this effort. - Blaze: A collection of packages for describing, accessing, and manipulating disparate data sources and types. - Numba: A just-in-time function compiler for Python, based on the LLVM compiler project, that lets researchers run their Python code at near-native speeds on CPUs and GPUs. - Dask: Parallelizes generic Python and extends NumPy, Pandas, and Scikit-learn with parallel variants. - Bokeh: Creates interactive web applications from Python without requiring knowledge of JavaScript, CSS, or HTML.
format	Report
fullrecord	<record><control><sourceid>dtic_1RU</sourceid><recordid>TN_cdi_dtic_stinet_AD1038470</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>AD1038470</sourcerecordid><originalsourceid>FETCH-dtic_stinet_AD10384703</originalsourceid><addsrcrecordid>eNrjZIgMzy_KzsxLVyjPLMlQSMxLUQjLLC5NzMmsAgk6ZaYruCSWJCq4pqVlJmem5pXkVEJUBlSWZOTnKaTlFymUZKQquDgGBTgqRLg4hjgqBBTlpxcl5vIwsKYl5hSn8kJpbgYZN9cQZw_dlJLM5Pjiksy81JJ4RxdDA2MLE3MDYwLSANW1NcM</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>report</recordtype></control><display><type>report</type><title>Working with and Visualizing Big Data Efficiently with Python for the DARPA XDATA Program</title><source>DTIC Technical Reports</source><creator>Oliphant,Travis ; Wang,Peter ; Seibert,Stan ; Rocklin,Matthew ; Van de Ven,Bryan ; Sparra,Hunt</creator><creatorcontrib>Oliphant,Travis ; Wang,Peter ; Seibert,Stan ; Rocklin,Matthew ; Van de Ven,Bryan ; Sparra,Hunt ; Continuum Analytics, Inc. Austin United States</creatorcontrib><description>Research performed under the XDATA program focused on computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g. tabular, relational, categorical, meta-data) and unstructured (e.g. text, documents, message traffic). Several open source project which have seen community and industry adoption grew out of this effort. - Blaze: A collection packages for describing and accessing, and manipulating disparate data sources and types - Numba: A just-in-time function compiler for Python, based on LLVM compiler project allowing researchers to run their Python code near native speeds on CPUs and GPUs. - Dask: Parallelizes generic Python and extends NumPy, Pandas, and Scikit-learn with parallel variants. 
-Bokeh: Create interactive web applications from Python without having to know Javascript, CSS, or HTML.</description><language>eng</language><subject>algorithms ; Big Data ; CLUSTERING ; computer program documentation ; computer programming ; Computer Programming and Software ; DATA VISUALIZATION ; high performance computing ; information systems ; InteractivE ; machine learning ; PYTHON PROGRAMMING LANGUAGE ; SOFTWARE TOOLS ; web applications ; workload</subject><creationdate>2017</creationdate><rights>Approved For Public Release</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,776,881,27544,27545</link.rule.ids><linktorsrc>$$Uhttps://apps.dtic.mil/sti/citations/AD1038470$$EView_record_in_DTIC$$FView_record_in_$$GDTIC$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Oliphant,Travis</creatorcontrib><creatorcontrib>Wang,Peter</creatorcontrib><creatorcontrib>Seibert,Stan</creatorcontrib><creatorcontrib>Rocklin,Matthew</creatorcontrib><creatorcontrib>Van de Ven,Bryan</creatorcontrib><creatorcontrib>Sparra,Hunt</creatorcontrib><creatorcontrib>Continuum Analytics, Inc. Austin United States</creatorcontrib><title>Working with and Visualizing Big Data Efficiently with Python for the DARPA XDATA Program</title><description>Research performed under the XDATA program focused on computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g. tabular, relational, categorical, meta-data) and unstructured (e.g. text, documents, message traffic). Several open source project which have seen community and industry adoption grew out of this effort. 
- Blaze: A collection packages for describing and accessing, and manipulating disparate data sources and types - Numba: A just-in-time function compiler for Python, based on LLVM compiler project allowing researchers to run their Python code near native speeds on CPUs and GPUs. - Dask: Parallelizes generic Python and extends NumPy, Pandas, and Scikit-learn with parallel variants. -Bokeh: Create interactive web applications from Python without having to know Javascript, CSS, or HTML.</description><subject>algorithms</subject><subject>Big Data</subject><subject>CLUSTERING</subject><subject>computer program documentation</subject><subject>computer programming</subject><subject>Computer Programming and Software</subject><subject>DATA VISUALIZATION</subject><subject>high performance computing</subject><subject>information systems</subject><subject>InteractivE</subject><subject>machine learning</subject><subject>PYTHON PROGRAMMING LANGUAGE</subject><subject>SOFTWARE TOOLS</subject><subject>web applications</subject><subject>workload</subject><fulltext>true</fulltext><rsrctype>report</rsrctype><creationdate>2017</creationdate><recordtype>report</recordtype><sourceid>1RU</sourceid><recordid>eNrjZIgMzy_KzsxLVyjPLMlQSMxLUQjLLC5NzMmsAgk6ZaYruCSWJCq4pqVlJmem5pXkVEJUBlSWZOTnKaTlFymUZKQquDgGBTgqRLg4hjgqBBTlpxcl5vIwsKYl5hSn8kJpbgYZN9cQZw_dlJLM5Pjiksy81JJ4RxdDA2MLE3MDYwLSANW1NcM</recordid><startdate>20170801</startdate><enddate>20170801</enddate><creator>Oliphant,Travis</creator><creator>Wang,Peter</creator><creator>Seibert,Stan</creator><creator>Rocklin,Matthew</creator><creator>Van de Ven,Bryan</creator><creator>Sparra,Hunt</creator><scope>1RU</scope><scope>BHM</scope></search><sort><creationdate>20170801</creationdate><title>Working with and Visualizing Big Data Efficiently with Python for the DARPA XDATA Program</title><author>Oliphant,Travis ; Wang,Peter ; Seibert,Stan ; Rocklin,Matthew ; Van de Ven,Bryan ; 
Sparra,Hunt</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-dtic_stinet_AD10384703</frbrgroupid><rsrctype>reports</rsrctype><prefilter>reports</prefilter><language>eng</language><creationdate>2017</creationdate><topic>algorithms</topic><topic>Big Data</topic><topic>CLUSTERING</topic><topic>computer program documentation</topic><topic>computer programming</topic><topic>Computer Programming and Software</topic><topic>DATA VISUALIZATION</topic><topic>high performance computing</topic><topic>information systems</topic><topic>InteractivE</topic><topic>machine learning</topic><topic>PYTHON PROGRAMMING LANGUAGE</topic><topic>SOFTWARE TOOLS</topic><topic>web applications</topic><topic>workload</topic><toplevel>online_resources</toplevel><creatorcontrib>Oliphant,Travis</creatorcontrib><creatorcontrib>Wang,Peter</creatorcontrib><creatorcontrib>Seibert,Stan</creatorcontrib><creatorcontrib>Rocklin,Matthew</creatorcontrib><creatorcontrib>Van de Ven,Bryan</creatorcontrib><creatorcontrib>Sparra,Hunt</creatorcontrib><creatorcontrib>Continuum Analytics, Inc. Austin United States</creatorcontrib><collection>DTIC Technical Reports</collection><collection>DTIC STINET</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Oliphant,Travis</au><au>Wang,Peter</au><au>Seibert,Stan</au><au>Rocklin,Matthew</au><au>Van de Ven,Bryan</au><au>Sparra,Hunt</au><aucorp>Continuum Analytics, Inc. Austin United States</aucorp><format>book</format><genre>unknown</genre><ristype>RPRT</ristype><btitle>Working with and Visualizing Big Data Efficiently with Python for the DARPA XDATA Program</btitle><date>2017-08-01</date><risdate>2017</risdate><abstract>Research performed under the XDATA program focused on computational techniques and software tools for analyzing large volumes of data, both semi-structured (e.g. tabular, relational, categorical, meta-data) and unstructured (e.g. 
text, documents, message traffic). Several open source project which have seen community and industry adoption grew out of this effort. - Blaze: A collection packages for describing and accessing, and manipulating disparate data sources and types - Numba: A just-in-time function compiler for Python, based on LLVM compiler project allowing researchers to run their Python code near native speeds on CPUs and GPUs. - Dask: Parallelizes generic Python and extends NumPy, Pandas, and Scikit-learn with parallel variants. -Bokeh: Create interactive web applications from Python without having to know Javascript, CSS, or HTML.</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier
ispartof
issn
language	eng
recordid	cdi_dtic_stinet_AD1038470
source	DTIC Technical Reports
subjects	algorithms; Big Data; CLUSTERING; computer program documentation; computer programming; Computer Programming and Software; DATA VISUALIZATION; high performance computing; information systems; InteractivE; machine learning; PYTHON PROGRAMMING LANGUAGE; SOFTWARE TOOLS; web applications; workload
title	Working with and Visualizing Big Data Efficiently with Python for the DARPA XDATA Program
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T19%3A57%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-dtic_1RU&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=unknown&rft.btitle=Working%20with%20and%20Visualizing%20Big%20Data%20Efficiently%20with%20Python%20for%20the%20DARPA%20XDATA%20Program&rft.au=Oliphant,Travis&rft.aucorp=Continuum%20Analytics,%20Inc.%20Austin%20United%20States&rft.date=2017-08-01&rft_id=info:doi/&rft_dat=%3Cdtic_1RU%3EAD1038470%3C/dtic_1RU%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true