Posters

Poster: The R for Mass Spectrometry project

The R for Mass Spectrometry project Laurent Gatto, Sebastian Gibb, Johannes Rainer UCLouvain Abstract The aim of the R for Mass Spectrometry initiative is to provide efficient, thoroughly documented, tested, flexible and interoperable R software for the analysis and interpretation of high throughput mass spectrometry assays, including proteomics and metabolomics experiments. The project formalises the longtime collaborative efforts of the core members under the R for Mass Spectrometry organisation to facilitate dissemination and accessibility of their work and gather contributions from the wider community.

Poster: The Influence of Intra-scanner Variability on the Prediction of Human Papillomavirus (HPV) Association of Oropharyngeal Cancer (OPC) using CT derived Radiomic Features

The Influence of Intra-scanner Variability on the Prediction of Human Papillomavirus (HPV) Association of Oropharyngeal Cancer (OPC) using CT derived Radiomic Features Reza Reiazi, Collin Arrowsmith, Mattea Welch, Farnoosh Abbas Aghababazadeh, Christofer Eeles, Tony Tadic, Andrew J Hope, Scott V Bratman, Benjamin Haibe-Kains Princess Margaret Cancer Research Center, University Health Network Abstract Studies have shown that radiomic features are sensitive to the variability of imaging parameters (e.g., scanner model), and one of the major challenges in these studies lies in improving the robustness of quantitative features against the variations in imaging datasets from multi-center studies.

Poster: Statistical analysis of karyotypic variation from flow cytometry data

Statistical analysis of karyotypic variation from flow cytometry data Margot Henry, Aleeza Gerstein University of Manitoba Abstract Background: Karyotypic variation in ploidy and aneuploidy is observed in fungal microbial populations isolated from ecological, clinical, and industrial environments and is a hallmark of many types of cancer. In order to characterize and understand the dynamics of karyotype subpopulations, we require an unbiased computational method to identify different subpopulations and quantify the number of cells within them.

Poster: spatialHeatmap: Visualizing Spatial Assays in Anatomical Images

spatialHeatmap: Visualizing Spatial Assays in Anatomical Images Jianhai Zhang, Jordan Hayes, Le Zhang, Bing Yang, Wolf B Frommer, Julia Bailey-Serres, Thomas Girke University of California, Riverside Abstract Here we present spatialHeatmap, a new R/Bioconductor package that provides functionalities for visualizing cell-, tissue- and organ-specific data of biological assays by coloring the corresponding spatial features defined in anatomical images according to a numeric color key. The color scheme used to represent the assay values can be customized by the user.

Poster: Spatial Neighbor Models Applied to Genetic Regulatory Network Inference

Spatial Neighbor Models Applied to Genetic Regulatory Network Inference David S Burton, Matthew Nicholson McCall, Tanzy Love University of Rochester Abstract Though a wealth of gene expression data is now available, statistical methods which are able to model that data to infer network structure from gene product interactions are still lacking. We propose the application of spatial statistics models to infer network structure and functionality from gene expression datasets. Spatial models have several features which can be used to represent common network mechanisms such as cyclical behavior, decay of signal, mediated effects through parent nodes, as well as directionality of effect and parent-child relationships.

Poster: SeekerBio

SeekerBio Erick Cuevas Fernandez UNAM Abstract SeekerBio is a package under development that takes the rsIDs of SNPs obtained from GWAS and retrieves relevant information for each SNP, such as its location, gene, pathway, consequences, and allele frequency in all populations of the 1000 Genomes Project. All the information is returned as a data frame. The package includes a function to format the data so that it can be used directly to build tidy models or with other machine learning algorithms.

Poster: scClassifR: Framework to accurately classify cell types in single-cell RNA-sequencing data

scClassifR: Framework to accurately classify cell types in single-cell RNA-sequencing data Thi-Tuong-Vy NGUYEN Medical University of Vienna Abstract Single-cell RNA-sequencing has become a key tool for biomedical research. One of the crucial steps in analyzing single-cell RNA-sequencing data is identifying the observed cell types. This allows one to reveal the heterogeneity within tissues at an unprecedented level of detail. Many computational methods were developed to automate this task. The majority relies on annotated reference datasets.

Poster: sangeranalyseR: simple and interactive processing of Sanger sequencing data in R

sangeranalyseR: simple and interactive processing of Sanger sequencing data in R Kuan-Hao Chao, Kirston Barton, Sarah Palmer, Robert Lanfear Australian National University Abstract sangeranalyseR is a feature-rich, free, and open-source R package for processing Sanger sequencing data. It allows users to go from loading reads to saving aligned contigs in a few lines of R code by using sensible defaults for most actions. It also provides complete flexibility for determining how individual reads and contigs are processed, both at the command line in R and via interactive Shiny applications.

Poster: SAlinR: a versatile R/Bioconductor package for sequence alignment and feature analysis

SAlinR: a versatile R/Bioconductor package for sequence alignment and feature analysis Haowei Zhang, Weijie Liao Tsinghua University, College of Life Science Abstract Sequence alignment is an essential technique in bioinformatics, and many standalone tools have been built for it, but few are available in R. Here, we developed an R package, SAlignR, to perform sequence analysis such as multiple sequence alignment, motif discovery, sequence logos and visualization. We integrated most multiple sequence alignment algorithms, such as ClustalW, MUSCLE and T-Coffee, for both nucleotide and amino acid sequences.

Poster: Robust Concordance Index: a Metric for Association Testing for Preclinical Biomarker Discovery

Robust Concordance Index: a Metric for Association Testing for Preclinical Biomarker Discovery Ian Smith, Petr Smirnov, Benjamin Haibe-Kains University Health Network, University of Toronto Abstract Datasets in biological research often suffer from the dual problems of considerable systematic and statistical noise and limited sample size. This presents challenges in identifying associations between biological features and response variables of interest, as in the context of identifying biomarkers for sensitivity to therapy. We introduce the Robust Concordance Index (rCI), a modification to the standard Concordance Index (or Kendall’s Tau) to address these limitations.
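
For readers unfamiliar with the baseline metric, the standard concordance index mentioned in this abstract can be sketched as follows. This is a minimal illustration of the unmodified index only, not the authors' rCI; the function name, the decision to skip tied pairs, and the toy data are assumptions made for this example. With no ties, Kendall's tau equals 2 * CI - 1.

```python
from itertools import combinations


def concordance_index(x, y):
    """Fraction of comparable pairs that are ordered the same way in x and y.

    Hypothetical helper for illustration: pairs tied in either variable
    are skipped, which is one common convention among several.
    """
    concordant = 0
    comparable = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0 or dy == 0:
            continue  # tied pair: not counted in this sketch
        comparable += 1
        if dx * dy > 0:
            concordant += 1  # both variables order the pair the same way
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up): a candidate biomarker and a response variable.
feature = [0.1, 0.4, 0.35, 0.8]
response = [1.0, 2.0, 3.0, 4.0]
ci = concordance_index(feature, response)  # 5 of the 6 pairs are concordant
tau = 2 * ci - 1  # Kendall's tau when there are no ties
```

Because every pair counts equally, pairs whose values differ by less than the noise level can sway the index, which is the kind of limitation the abstract says the rCI is meant to address.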

Poster: Prolfqua - R package for visualization and modelling of proteomics label-free quantification data

Prolfqua - R package for visualization and modelling of proteomics label-free quantification data Witold Eryk Wolski, Christian Panse FGCZ Abstract The R package for proteomics label-free quantification, prolfqua (read: prolewka), evolved from functions and code snippets used to visualize and analyze label-free quantification data. To compute protein fold changes among treatment conditions, we first used t-test, linear models, or functions implemented in the package limma. We evaluated MSstats (doi:10.18129/B9.bioc.MSstats), ROPECA (doi:10.1038/s41598-017-05949-y) or MSqRob (https://github.com/statOmics/MSqRob), all implemented in R, with the idea to integrate the various approaches in our analysis pipeline.

Poster: Predictive modeling using genome-wide DNA methylation data

Predictive modeling using genome-wide DNA methylation data Lily Wang, Lanyu Zhang, Gabriel Odom, Lizhong Liu, Tiago Chedraoui Silva University of Miami Abstract In the search for predictive methylation signatures for complex diseases, previous studies have highlighted valuable DNA methylation-based biomarkers. However, almost all of these studies have built prediction models based on single CpGs, without considering the methylation status of neighboring CpG sites. Compared with single CpGs, differentially methylated regions (DMRs) give higher confidence and likelihood of biological importance.

Poster: PhosR enables processing and functional analysis of phosphoproteomic data

PhosR enables processing and functional analysis of phosphoproteomic data Taiyun Kim, Hani Jieun Kim The University of Sydney Abstract Mass spectrometry (MS)-based phosphoproteomics has revolutionized our ability to profile phosphorylation-based signaling in cells and tissues on a global scale. To infer the action of kinases and signaling pathways in phosphoproteomic experiments, we present PhosR, a set of tools and methodologies implemented in a suite of R packages facilitating comprehensive analysis of phosphoproteomic data.

Poster: MSImpute: Imputation of label-free mass spectrometry peptides by low-rank approximation

MSImpute: Imputation of label-free mass spectrometry peptides by low-rank approximation Soroor Hediyeh-zadeh, Andrew Webb, Melissa Davis Department of Medical Biology, The University of Melbourne; Colonial Foundation Healthy Ageing Centre, WEHI. Abstract Recent developments in mass spectrometry (MS) instruments and data acquisition modes have aided multiplexed, fast, reproducible and quantitative analysis of proteome profiles, yet missing values remain a formidable challenge for proteomics data analysis. The stochastic nature of sampling in Data Dependent Acquisition (DDA), suboptimal preprocessing of Data Independent Acquisition (DIA) runs and the dynamic range limitation of MS instruments impede the reproducibility and accuracy of peptide quantification and can introduce systematic patterns of missingness that impact downstream analyses.

Poster: MolEvolvR: Web-app and R-package for characterizing proteins using molecular evolution and phylogeny

MolEvolvR: Web-app and R-package for characterizing proteins using molecular evolution and phylogeny Samuel Zorn Chen, Lauren Marie Sosinski, John Bradley Johnson, Janani Ravi Michigan State University Abstract Molecular evolution and phylogeny can provide key insights into pathogenic protein families. Studying how these proteins evolve across bacterial lineages can help identify lineage-specific and pathogen-specific signatures and variants, and consequently, their functions. We have developed a streamlined computational approach for characterizing the molecular evolution and phylogeny of target proteins, widely applicable across proteins and species of interest.

Poster: MethReg: estimating the regulatory potential of DNA methylation in gene transcription

MethReg: estimating the regulatory potential of DNA methylation in gene transcription Tiago Chedraoui Silva, Juan I. Young, Eden R. Martin, Xi Chen, Lily Wang University of Miami Abstract Epigenome-wide association studies (EWAS) often detect a large number of differentially methylated sites or regions, many of which are located in distal regulatory regions. To further prioritize these significant sites, there is a critical need to better understand the functional impact of CpG methylation. Recent studies demonstrated that CpG methylation-dependent transcriptional regulation is a widespread phenomenon.

Poster: Maximum rank reproducibility

Maximum rank reproducibility Tusharkanti Ghosh University of Colorado, Anschutz Medical Campus Abstract marr (Maximum Rank Reproducibility) is a nonparametric approach that detects reproducible signals using a maximal rank statistic for high-dimensional biological data. In this R package, we implement functions that measure the reproducibility of features per sample pair and sample pairs per feature in high-dimensional biological replicate experiments. The user-friendly plot functions in this package also plot histograms of the reproducibility of features per sample pair and sample pairs per feature.

Poster: MatrixQCvis - shiny-based interactive data quality exploration for omics data

MatrixQCvis - shiny-based interactive data quality exploration for omics data Thomas Naake European Molecular Biology Laboratory Heidelberg Abstract Exploratory data analysis and data quality assessment are integral parts of any end-to-end data analysis workflow. We present the MatrixQCvis package, which provides shiny-based interactive visualization of data quality metrics at the per-sample and per-feature level. It is broadly applicable to quantitative omics data types that come in matrix-like format (features x samples).
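
As a rough illustration of what per-sample and per-feature quality metrics on a features x samples matrix can look like, the sketch below counts missing values along each axis. The choice of metric, the function name and the toy values are assumptions made for this example; the abstract does not say which metrics MatrixQCvis computes, and this is not the package's API.

```python
import math


def missing_counts(matrix):
    """Count missing values per feature (row) and per sample (column).

    `matrix` is a list of rows, one per feature; NaN marks a missing value.
    Hypothetical helper for illustration only.
    """
    n_samples = len(matrix[0]) if matrix else 0
    per_feature = [sum(1 for v in row if math.isnan(v)) for row in matrix]
    per_sample = [
        sum(1 for row in matrix if math.isnan(row[j])) for j in range(n_samples)
    ]
    return per_feature, per_sample


nan = float("nan")
# Toy intensity matrix (made up): 3 features x 4 samples.
intensities = [
    [10.2, nan, 9.8, 10.0],
    [nan, nan, 7.1, 7.3],
    [5.5, 6.0, 5.9, nan],
]
per_feature, per_sample = missing_counts(intensities)
```

A sample with many more missing values than its peers, like the second column here, is the kind of outlier that a per-sample quality view is meant to expose.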

Poster: Make Interactive Complex Heatmaps

Make Interactive Complex Heatmaps Zuguang Gu German Cancer Research Center Abstract The heatmap is a powerful visualization method for two-dimensional data that reveals patterns shared by subsets of rows and columns. In R, there are many packages that make heatmaps. Among them, ComplexHeatmap provides rich tools for constructing highly customizable heatmaps. It can easily establish connections between information from multiple sources by automatically concatenating and adjusting multiple heatmaps and complex annotations, which makes it widely applied in data analysis in various fields, especially in bioinformatics.

Poster: Integrating long read RNA-seq data into Bioconductor workflows with NanoporeRNASeq

Integrating long read RNA-seq data into Bioconductor workflows with NanoporeRNASeq Yuk Kei Wan, Ying Chen, Jonathan Goeke Genome Institute of Singapore Abstract The NanoporeRNASeq package is the first long-read RNA-seq ExperimentData package available on Bioconductor. The data was generated from Oxford Nanopore Sequencing and consists of six samples from two human cell lines, namely K562 and MCF7, each with three replicates (one from direct RNA sequencing and two from direct cDNA sequencing). The six samples in this package are aligned to chromosome 22 (GRCh38) in BAM file format and are a subset of samples from the Singapore Nanopore Expression Consortium (SG-NEx), which provides comprehensively benchmarked datasets from three Nanopore long-read RNA-seq protocols (direct RNA, direct cDNA, and PCR cDNA) run on five cancer cell lines (A549, HCT116, HepG2, K562, and MCF7).

Poster: Instant In-Memory Snapshot and Restore for Faster Single-Cell RNA-seq Data Analysis

Instant In-Memory Snapshot and Restore for Faster Single-Cell RNA-seq Data Analysis Yue Li MemVerge Inc. Abstract A typical single-cell RNA-seq data analysis pipeline consists of many data processing stages. These stages could be very long-running and are also resource-intensive. This is because many of the stages involve computations using very large matrices based on a large number of input RNA samples. Furthermore, the stages are I/O intensive as each stage typically involves saving and restoring intermediate results for tuning, compliance, and experimental reproducibility.

Poster: Identification of novel cellular states and therapeutic targets in PDAC with machine learning

Identification of novel cellular states and therapeutic targets in PDAC with machine learning Chengxin Yu Lunenfeld-Tanenbaum Research Institute; University of Toronto Abstract The 8% 5-year survival rate of pancreatic ductal adenocarcinoma (PDAC) leads to an urgent need for novel therapies. An attractive target is tumour associated stromal cells (TAS) that are involved in PDAC progression and immunosuppression. However, an exact atlas of TAS states in PDAC and their interactions with tumour cells that would identify novel drug targets is yet to be uncovered.

Poster: HPAStainR: a Bioconductor and Shiny app to query protein expression patterns in the Human Protein Atlas

HPAStainR: a Bioconductor and Shiny app to query protein expression patterns in the Human Protein Atlas Tim Oldfield Nieuwenhuis, Marc K Halushka Johns Hopkins School of Medicine Abstract The Human Protein Atlas is a website of protein expression in human tissues. It is an excellent resource of tissue and cell type protein localization, but only allows the query of a single protein at a time. We introduce HPAStainR as a new Shiny app and Bioconductor/R package used to query the scored staining patterns in the Human Protein Atlas with multiple proteins/genes of interest.

Poster: Genome-wide discovery of natural variation in pre-mRNA splicing and prioritising causal alternative splicing to salt stress response in rice

Genome-wide discovery of natural variation in pre-mRNA splicing and prioritising causal alternative splicing to salt stress response in rice Huihui Yu, Qian Du, Malachy Campbell, Bin Yu, Harkamal Walia, Chi Zhang School of Biological Sciences, University of Nebraska, Lincoln, NE, 68588, USA. Abstract Pre-mRNA splicing is an essential step for the regulation of gene expression. In order to specifically capture splicing variants in plants for genome-wide association studies (GWAS), we developed a software tool to quantify and visualise Variations of Splicing in Population (VaSP).

Poster: Generating genomic null ranges via covariate matched sampling and its application to understand enhancer-promoter connectivity

Generating genomic null ranges via covariate matched sampling and its application to understand enhancer-promoter connectivity Eric Scott Davis, Wancen Mu, Mikhail Dozmorov, Stuart Lee, Michael I Love, Douglas Phanstiel University of North Carolina at Chapel Hill Abstract Introduction: Statistical evaluation of genomic features often assumes they are randomly distributed across the genome. However, many genomic features are correlated and break this assumption. Therefore, it is important to construct an appropriate null model accounting for these covariates.
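
One simple way to picture covariate-matched sampling is greedy nearest-neighbour matching on a single covariate: for each feature of interest, pick the not-yet-used background feature whose covariate value is closest. The sketch below shows that idea only. It is an assumption made for illustration, not the authors' method, and the function name and toy values are hypothetical.

```python
def match_by_covariate(focal, pool):
    """Return, for each focal covariate value, the index of the closest
    unused pool value (greedy nearest neighbour, without replacement).

    Hypothetical helper for illustration; ties are broken by lowest index.
    """
    available = set(range(len(pool)))
    matched = []
    for value in focal:
        if not available:
            raise ValueError("pool is smaller than the focal set")
        best = min(available, key=lambda k: (abs(pool[k] - value), k))
        available.remove(best)  # each background feature is used at most once
        matched.append(best)
    return matched


# Toy covariate values (made up), e.g. a per-feature score between 0 and 1.
focal = [0.9, 0.5]
pool = [0.1, 0.45, 0.95, 0.52, 0.3]
matched = match_by_covariate(focal, pool)
null_set = [pool[k] for k in matched]
```

The resulting null set has a covariate distribution close to that of the focal set, so a difference observed between the two sets is less likely to be explained by the covariate alone.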

Poster: CytoTree: an R/Bioconductor package for analysis and visualization of flow and mass cytometry data

CytoTree: an R/Bioconductor package for analysis and visualization of flow and mass cytometry data Yuting Dai Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine at Shanghai, Ruijin Hospital Affiliated to Shanghai Jiao Tong University School of Medicine and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 197 Ruijin Er Road, Shanghai 200025, China. Abstract Background: The rapidly increasing dimensionality and throughput of flow and mass cytometry data necessitate new bioinformatics tools for analysis and interpretation, and the recently emerging single-cell-based algorithms provide a powerful strategy to meet this challenge.

Poster: covid19census: U.S. and Italy COVID-19 epidemiological data and demographic and health related metrics

covid19census: U.S. and Italy COVID-19 epidemiological data and demographic and health related metrics Claudio Zanettini, Mohamed Omar, Wikum Dinalankara, Eddie Luidy Imada, Elizabeth Colantuoni, Giovanni Parmigiani, Luigi Marchionni Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY Abstract Since the beginning of the COVID-19 pandemic in 2020, there has been a tremendous accumulation of data capturing different statistics including the number of tests, confirmed cases and deaths. This data wealth offers a great opportunity for researchers to model the effect of certain variables on COVID-19 morbidity and mortality and to get a better understanding of the disease at the epidemiological level.

Poster: Comprehensive generation, visualization, and reporting of quality control metrics for single-cell RNA sequencing data

Comprehensive generation, visualization, and reporting of quality control metrics for single-cell RNA sequencing data Rui Hong, Yusuke Koga, Shruthi Bandyadka, Anastasia Leshchyk, Yichen Wang, Vidya Akavoor, Xinyun Cao, Irzam Sarfraz, Zhe Wang, Salam Alabdullatif, Frederick Jansen, Masanao Yajima, William Evan Johnson, Joshua D. Campbell Boston University School of Medicine Abstract Single-cell RNA sequencing (scRNA-seq) can be used to gain insights into cellular heterogeneity within complex tissues. However, a variety of technical artifacts can be present in scRNA-seq data and need to be assessed before downstream analyses can be performed.

Poster: CNVMetrics package: quantifying similarity between copy number profiles

CNVMetrics package: quantifying similarity between copy number profiles Pascal Belleau, Astrid Deschênes, Semir Beyaz, David A Tuveson, Alexander Krasnitz Cold Spring Harbor Laboratory Abstract Genome-wide DNA copy number profiles are an informative type of molecular data that are exploited in numerous areas of genomic analysis and can be derived from a variety of platforms, including microarrays and next-generation DNA sequencing. Copy-number variants (CNVs) have been shown to be associated with a wide spectrum of pathological conditions and complex traits, such as developmental neuropsychiatric disorders.

Poster: CNVgears: Integrating and Analysing CNV Calling Results from Multiple Methods in a Uniformed and Standardized Framework

CNVgears: Integrating and Analysing CNV Calling Results from Multiple Methods in a Uniformed and Standardized Framework Simone Montalbano, Enrico Domenici, Michele Filosi Laboratory of Neurogenomic Biomarkers, CIBIO, University of Trento Abstract Copy Number Variations (CNVs) are a form of genomic variation known to be very relevant in several psychiatric and neurodevelopmental diseases, such as schizophrenia and autism spectrum disorder (ASD). Several methods exist to perform CNV calling on both SNP arrays and NGS data, as well as pipelines integrating more than one algorithm.

Poster: cfDNAPro: an R/Bioconductor package to characterise and visualise cell-free DNA biological features in liquid biopsy

cfDNAPro: an R/Bioconductor package to characterise and visualise cell-free DNA biological features in liquid biopsy Haichao Wang, Nitzan Rosenfeld, Hui Zhao, Christopher G. Smith Cancer Research UK Cambridge Institute, University of Cambridge Abstract Cell-free (cf) DNA enters human blood circulation by various biological processes. There is growing evidence that cfDNA biological features could be exploited to support sensitive cancer detection, treatment selection and minimal residual disease detection. However, there are currently no R packages designed for automated analysis of cfDNA biological features such as fragment size, periodicity, nucleotide frequency, nucleosome occupancy, etc.

Poster: BRAIN-UMAP: The genetic intersection between neuroscience, neurology, psychiatry, and oncology

BRAIN-UMAP: The genetic intersection between neuroscience, neurology, psychiatry, and oncology Sonali Arora Fred Hutch Cancer Research Center, Seattle, WA, USA Abstract Whole transcriptome sequencing (RNA-seq) is an important tool for understanding genetic mechanisms underlying human diseases and gaining a better insight into complex human diseases. Several ground-breaking projects have uniformly processed RNA-seq data from publicly available studies to enable cross-comparison. One noteworthy study is the recount2 pipeline, which, in 2017, reprocessed ~70,000 samples from the Sequence Read Archive (SRA), The Cancer Genome Atlas (TCGA), and Genotype-Tissue Expression (GTEx).

Poster: bio_embeddings: Making Protein Language Models Accessible to the Wider Research Community

bio_embeddings: Making Protein Language Models Accessible to the Wider Research Community Konstantin Schütze, Christian Dallago, Michael Heinzinger, Burkhard Rost TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany Abstract Recently, Language Models (LMs) have been adapted from use in natural language to work with protein sequences instead. Protein LMs show enormous potential in generating descriptive vector representations (embeddings) for proteins from just their sequences at a fraction of the time compared to previous approaches.

Poster: Assessment of statistical methods from single cell, bulk RNA-seq, and metagenomics applied to microbiome data

Assessment of statistical methods from single cell, bulk RNA-seq, and metagenomics applied to microbiome data Matteo Calgaro, Chiara Romualdi, Levi Waldron, Davide Risso, Nicola Vitulo University of Verona Abstract Background: The correct identification of differentially abundant microbial taxa between experimental conditions is a methodological and computational challenge. Recent work has produced methods to deal with the high sparsity and compositionality characteristic of microbiome data, but independent benchmarks comparing these to alternatives developed for RNA-seq data analysis are lacking.

Poster: Analysis of structural mass spectrometry in R

Analysis of structural mass spectrometry in R Oliver Crook University of Oxford Abstract Hydrogen deuterium exchange mass spectrometry is a key structural mass spectrometry technique used in academia and industry to probe differential protein structures. These experiments are undergoing a vast increase in throughput due to the use of liquid handling robots and more sensitive instrumentation. However, this key method in antibody and small molecule research is complicated by several issues.

Poster: An alternative to manual literature review: a package for exploring biomedical concept graphs

An alternative to manual literature review: a package for exploring biomedical concept graphs Leslie Myint Macalester College Abstract Within the biomedical sciences, both contextual sensemaking of results and hypothesis generation require strong knowledge of how many different concepts and physical entities are connected. Traditionally, these endeavors have required investigators to manually mine the vast amounts of information contained in research papers. The sheer volume of accumulated knowledge poses problems for both the breadth and depth of literature review.

Poster: A Bayesian mixture modelling approach for simultaneous batch correction seropositivity estimation in ELISA data

A Bayesian mixture modelling approach for simultaneous batch correction seropositivity estimation in ELISA data Stephen Coleman, Xaquin Castro Dopico, Gunilla Karlsson Hedestam, Paul D. W. Kirk, Chris Wallace MRC Biostatistics Unit, University of Cambridge; Karolinska Institute Abstract A prevalent issue in biological data is the structural differences between batches of samples that may relate to differences in laboratory, sampling or storage conditions over time or locations. The problems these batch effects present have been well studied in high-dimensional data such as RNA-seq, but the problem is present in any setting where multiple batches are run.
