Abstract 3207: GenePattern Notebook: An integrative analytical environment for cancer research

Bibliographic details
Published in: Cancer research (Chicago, Ill.), 2020-08, Vol.80 (16_Supplement), p.3207-3207
Main authors: Reich, Michael M., Tabor, Thorin, Liefeld, Ted, Juarez, Edwin, Hill, Barbara, Thorvaldsdottir, Helga, Tamayo, Pablo, Mesirov, Jill P.
Format: Article
Language: English
Online access: Full text
Description
Summary: As the availability of genetic and genomic data and analysis tools from large-scale cancer initiatives continues to increase, with single-cell studies adding new dimensions to the potential scientific insights, the need has become more urgent for a software environment that supports the rapid pace of cancer data science. The electronic analysis notebook has recently emerged as an effective and versatile tool for this purpose, allowing scientists to combine the scientific exposition (text, images, and multimedia) with the actual code that runs the analysis, creating a single "research narrative" document. The Jupyter Notebook system has become the de facto standard notebook environment in data science and genomic analysis. However, the Jupyter environment requires familiarity with a programming language to run analyses, and even text must be formatted using a programming-style language. To extend notebook capabilities to the needs of researchers at all levels of programming expertise, we developed the GenePattern Notebook environment, which integrates Jupyter's capabilities with the hundreds of genomic tools available through the GenePattern platform. This tool allows scientists to develop, share, collaborate on, and publish their notebooks, requiring only a web browser. In this environment, investigators can design their in-silico experiments, perform and refine analyses, launch compute-intensive analyses on cloud-based and high-performance compute resources, and publish their results as electronic notebooks that other scientists can adopt to reproduce the original analyses and modify for their own work. GenePattern Notebook provides: (1) Access to a wide range of genomic analyses within a notebook. Hundreds of analyses are available, from machine learning techniques such as clustering, classification, and dimension reduction, to omic-specific methods for gene expression analysis, proteomics, flow cytometry, sequence variation analysis, pathway analysis, and others. (2) A library of featured genomic analysis notebooks. These include templates for common analysis tasks as well as cancer-specific research scenarios and compute-intensive methods. Scientists can easily copy these notebooks, use them as is, or adapt them for their research purposes. (3) Notebook enhancements. A rich text editor allows scientists to enter and format text as they would in a word processor. A user interface-building tool allows notebook developers to wrap their code
ISSN: 0008-5472, 1538-7445
DOI: 10.1158/1538-7445.AM2020-3207