AnVIL Project

The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-Space (AnVIL) is a project to build a data commons that allows researchers to efficiently analyze and visualize genomics data in the cloud. This project is a collaboration with Oregon Health and Science University, University of California Santa Cruz, University of Chicago, The Broad Institute, Washington University in St. Louis, Vanderbilt University Medical Center, Johns Hopkins University, and Pennsylvania State University.

Galaxy Project

Galaxy is a scientific analysis workbench used by thousands of scientists worldwide to analyze genomic, proteomic, imaging, and other large biomedical datasets. Galaxy’s user-friendly, web-based interface makes it possible for anyone, regardless of their informatics expertise, to create, run, and share large-scale, robust, and reproducible analyses. Galaxy accelerates biomedical research by bringing together tool developers and end users such as bench scientists and physician-researchers. There are more than 5,000 analysis tools available in Galaxy’s ToolShed, and users run more than 200,000 analyses each month on Galaxy’s main public server. OHSU’s precision cancer medicine programs use Galaxy to run clinical and research genomics analyses as well as machine learning workflows. Galaxy is funded by both NIH and NSF.

Find additional information on this project 

This is a joint project with the Nekrutenko Lab and Taylor Lab.

GDAN Project

The Genomic Data Analysis Network (GDAN) is a network of individuals and institutions that produces the bulk of the analysis required to interpret the data generated by the Genomic Characterization Centers (GCCs) and housed within the GDC.

What is the SMMART program?

SMMART stands for Serial Measurements of Molecular and Architectural Responses to Therapy. It is the flagship project of the Knight Cancer Institute’s new Precision Oncology program.

The goal of the SMMART program is to develop new treatments for cancer that last longer (are more durable) and allow better quality of life (are more tolerable) for patients with advanced disease. 

In particular, the goal is to understand why chemotherapies often stop working, and to develop new treatments that will stop cancers from becoming resistant to cancer drugs.


For more information please email

G2P is an open, aggregate public clinical cancer knowledge base for storing and searching connections between genomic biomarkers (“genotypes”) and patient diagnosis, prognosis, and response to treatment (“phenotypes”). Key uses of G2P include (a) searching by somatic variant to find drugs known to lead to response or resistance in tumors with the variant; (b) searching by drug to identify the mutations in which it can lead to response; and (c) searching clinical trials to find those associated with particular biomarkers or drugs. G2P combines biomarker-phenotype associations from nine trusted, curated knowledge bases, including CIViC, OncoKB, PMKB, JAX CKB, and the Cancer Genome Interpreter. Clinical trials data from several sources is also included. Users can perform full-text search on G2P and filter results using a web portal with intuitive visualizations.

We are developing data analysis methods and data management software to store, analyze, and integrate clinical, imaging, and molecular data for (1) treating cancer using precision therapies adapted over time; and (2) discovering and understanding mechanisms of resistance in cancer. This initiative brings together and advances many areas, including (a) development of computational analysis workflows to identify key biomarkers such as somatic mutations, gene expression, pathway activity, and tumor composition; (b) using public datasets in genomics, transcriptomics, and biological pathways together with patient data to correlate biomarkers with prognosis and predict therapeutic response; and (c) producing patient reports and interactive visualizations that provide precision therapy recommendations based on consensus amongst methods and enable differential analysis across timepoints. Key software used in this work includes LabKey for data management and visualization, G2P for finding key biological and clinically actionable biomarkers, and Galaxy for analysis workflow creation and execution. 


G-OnRamp is a collaboration between two successful and long-running projects: the Genomics Education Partnership (GEP) and the Galaxy Project. G-OnRamp provides biologists with an integrated, web-based, scalable environment for interactive annotation of eukaryotic genomes using large genomic datasets. It also provides educators with a platform to help undergraduates develop “big data” science skills through eukaryotic genome annotation. GEP is a consortium of over 100 colleges and universities that provides Classroom Undergraduate Research Experiences (CUREs) in bioinformatics/genomics for students at all levels. G-OnRamp extends Galaxy with tools and workflows that create UCSC Assembly Hubs and Apollo/JBrowse genome browsers with evidence tracks for sequence similarity, ab initio gene predictions, RNA-Seq, and repeats. Educators can use this system to design CUREs based on their favorite eukaryotic species (e.g., parasitoid wasps). G-OnRamp provides a VirtualBox virtual appliance and an Amazon Machine Image (AMI) for local and cloud (Amazon EC2) deployments. G-OnRamp is supported by the NIH.

Our lab develops frameworks and applications for doing interactive visual analysis on the Web. Visual analysis combines visualization with analysis tools & pipelines so that visual inspection can be used to guide tool & pipeline usage. One aspect of this work is enabling visualization of very large genomic datasets on the Web, and another aspect is integrating visualizations, tools, and pipelines in a meaningful way.