Package demo: Integrative genetic epidemiology with OpenGWAS, OpenCRAVAT, and Bioconductor

Vincent James Carey, Channing Division of Network Medicine, Harvard Medical School

Abstract

The interpretation of genetic variants is fundamental to all aspects of clinical genetics and genetic epidemiology. MRC OpenGWAS (publication https://doi.org/10.1101/2020.08.10.244293) is a data repository and API suite providing interactive access to statistics and metadata for hundreds of billions of human genetic variants. OpenCRAVAT (web site opencravat.org, publication DOI 10.1200/CCI.19.00132) is a system that amalgamates over 100 variant annotation resources and simplifies the development of rich characterizations of structural and functional contexts of genetic variants. Bioconductor (bioconductor.org) is an ecosystem of data structures and software packages that can be used in many contexts in genome biology and computational biomedicine. The gwaslake pkgdown site (vjcitn.github.io/gwaslake) shows how to adapt Bioconductor programming patterns and flexible containerization to simplify exploration, annotation, and interpretation of OpenGWAS variants assembled from a large collection of cohorts and phenotypes.

In this workshop, we will guide you through the assembly of variants from diverse sources, in diverse formats, for flexible annotation using OpenCRAVAT within RStudio. Bioconductor data structures and app designs are used to provide high-level conveniences for representation and analysis of cohorts arising in genetic epidemiology and cancer genomics.

Through live demonstrations and interactive small-group exercises, you will learn how to:

Import variation data from OpenGWAS and other human cohorts with Bioconductor, and discover public data on variants already available in the cloud

Use Bioconductor tools and RStudio apps to configure and execute annotation tasks defined by OpenCRAVAT

Retrieve annotation reports generated by OpenCRAVAT, moving the results forward in interactive workflows addressing epidemiologic and clinical interpretation

The activities in this workshop require some familiarity with RStudio, but no programming per se is needed to take advantage of the workshop material.

Keywords: Bioconductor, genetics, GWAS, PheWAS