Introduction to the Terra/AnVIL Cloud-based Genomics Platform
Sehyun Oh, Levi Waldron
The City University of New York
Abstract
The rapidly growing size of genomic datasets introduces challenges in data transfer, storage, access, sharing, and computing. In this workshop, we introduce Terra, a cloud-based genomics platform, as a potential solution. Terra hosts large-scale genomic and genomic-related datasets and provides secure remote access to them and to individually uploaded data resources. Terra also provides on-demand computational capacity through Google Cloud Platform (GCP) and interactive analysis interfaces such as Jupyter notebooks and RStudio. Users can run best-practice tools and pipelines that are already implemented, or upload their own data and analysis methods to workspaces. The Bioconductor AnVIL package and the RunTerraWorkflow R package allow users to use GCP and Terra resources from a local R session. Workshop participants will learn how to use Terra through use-case examples, an instructor-led live demonstration, and R/Bioconductor packages.
Keywords: Terra, AnVIL, Workflow, Cloud-based genomics platform, Cloud computing, Google Cloud Platform, Docker, API, RunTerraWorkflow