Poster: BRAIN-UMAP: The genetic intersection between neuroscience, neurology, psychiatry, and oncology

Sonali Arora, Fred Hutch Cancer Research Center, Seattle, WA, USA

Abstract

Whole transcriptome sequencing (RNA-seq) is an important tool for understanding the genetic mechanisms underlying complex human diseases. Several ground-breaking projects have uniformly processed RNA-seq data from publicly available studies to enable cross-comparison. One noteworthy example is the recount2 pipeline, which in 2017 reprocessed ~70,000 samples from the Sequence Read Archive (SRA), The Cancer Genome Atlas (TCGA), and Genotype-Tissue Expression (GTEx). This vast dataset also includes gene expression data for GTEx-defined brain regions, neurological and psychiatric disorders (such as Parkinson’s, Alzheimer’s, and Huntington’s), and gliomas (from datasets such as TCGA and the Chinese Glioma Genome Atlas (CGGA)). We apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to bulk gene expression data from brain-related diseases to build a BRAIN-UMAP, in which the gene expression profiles of neurological and psychiatric diseases are more similar to those of GTEx-defined brain regions than to those of gliomas and exhibit tissue-specific patterns. Incorporating gliomas from various publicly available datasets also reveals distinct clustering, which can increase our understanding of the disease. We also present a resource where researchers interested in mechanisms can easily compare and contrast the expression of a given gene and/or pathway of interest across various diseases, gliomas, and normal brain. This study shows the wealth of knowledge that can be obtained by combining uniformly processed, publicly available datasets to study a single organ and by using 3D visualization to explore similarities and differences between samples.
Our current study, focusing on brain-related diseases, offers insight into what may be possible for the broader neuroscientific community if we continually reprocess newly available brain-related RNA-seq samples using recount2. Additionally, if we build similar uniform processing pipelines for other kinds of next-generation sequencing data, we would be able to use multi-omic sequencing data to find novel associations between biological entities and increase our mechanistic knowledge of the disease.
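
The per-gene comparison across diseases, gliomas, and normal brain described above could be sketched roughly as follows. This is a minimal illustration, not the resource's actual implementation: the function name `median_expression_by_group`, the sample layout, the group labels, the gene names `GENE_A` and `GENE_B`, and all expression values are hypothetical and chosen only for the example.

```python
from collections import defaultdict
from statistics import median


def median_expression_by_group(samples, gene):
    """Return the median expression of `gene` within each sample group."""
    values = defaultdict(list)
    for sample in samples:
        # Skip samples in which the gene was not quantified.
        if gene in sample["expression"]:
            values[sample["group"]].append(sample["expression"][gene])
    return {group: median(vals) for group, vals in values.items()}


# Toy, made-up values for illustration only (not real data).
samples = [
    {"group": "normal brain", "expression": {"GENE_A": 5.0, "GENE_B": 1.0}},
    {"group": "normal brain", "expression": {"GENE_A": 7.0, "GENE_B": 2.0}},
    {"group": "glioma", "expression": {"GENE_A": 1.0, "GENE_B": 9.0}},
    {"group": "glioma", "expression": {"GENE_A": 3.0, "GENE_B": 8.0}},
    {"group": "neurological disease", "expression": {"GENE_A": 6.0}},
]

summary = median_expression_by_group(samples, "GENE_A")
for group, value in summary.items():
    print(f"{group}: {value}")
```

A per-group median is used here only because it is robust to outlying samples; any other summary, or the full distribution per group, could be substituted.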

Keywords: NA