Establishing individual disease propensity from exome data
Every individual is genetically predisposed to a number of disorders that could significantly affect the length or quality of their life. We are working to elucidate the genetically encoded molecular mechanisms of pathogenicity, with the aim of improving diagnostic practice and facilitating prevention and treatment. The specific objective of this project is to develop a computational method for annotating disease predisposition from genome variation data (AVA,Dx: Analysis of Variation for Association with Disease). Our approach of searching for non-linear disease signature patterns in genome variation data is particularly meaningful for complex disorders that lack established genetic markers. As proof of concept, we are currently building instances of AVA,Dx using exome data of individuals affected by Crohn’s Disease (CD; in collaboration with Dr. Andre Franke, Kiel University), chronic obstructive pulmonary disease (COPD; in collaboration with Dr. Toru Nyunoya, University of New Mexico), and Tourette Disorder (in collaboration with Dr. Jay Tischfield, Rutgers University). Our method will highlight pathogenesis genes and pathways for further experimental follow-up and may be able to determine individual predisposition to a range of disorders, improving the speed and accuracy of clinical diagnosis. The first, much simplified, AVA,Dx prototype outperformed all other groups in identifying CD exomes in the 2011 Critical Assessment of Genome Interpretation (CAGI) experiment. The current version is undergoing further testing and is showing promising results. We expect that AVA,Dx will contribute easily, cheaply, and accurately to both further research and immediate medical care.
We are looking for experimental labs working on diabetes and multiple sclerosis that could provide more data for analysis. Please contact Yana Bromberg for further information.
This project is funded in part by the PhRMA foundation.
Relevant Publications include:
Mapping the evolution of biological electron-transfer
Biological redox reactions are a crucial component of biogeochemical cycles of our planet. We are analyzing the structural motifs of oxidoreductases, the key protein catalysts of these reactions, to produce an evolutionary blueprint for the electronic circuit of life. The ultimate goal of this research is to develop an annotated map of the interlinked, redox coupled metabolic pathways on Earth. Such a map would not only alter our understanding of how biologically catalyzed electron transfer reactions evolved, but also facilitate the design of bio-inspired catalysts and synthetic biological systems. The specific goal of this project is to derive the structural motifs responsible for redox reactions using function informed analysis of structural alignments.
This project is funded by the Gordon and Betty Moore Foundation and is being done in collaboration with Dr. Paul Falkowski, Dr. Vikas Nanda, Dr. Nathan Yee, Dr. Debashish Bhattacharya, Dr. David Case, and Dr. Max Haggblom.
Relevant Publications include:
Evolutionary history of redox metal-binding domains across the tree of life. Harel, A., Bromberg, Y., Falkowski, P.G., and Bhattacharya, D. Proc Natl Acad Sci U S A 111, 7042-7047 PMID: 24778258
Function-based assessment of structural similarity measurements using metal co-factor orientation. Senn, S., Nanda, V., Falkowski, P., and Bromberg, Y. Proteins 82, 648-656 PMID: 24127252
TrAnsFuSE refines the search for protein function: oxidoreductases. Harel, A., Falkowski, P., and Bromberg, Y. Integr Biol (Camb). 2012 Apr 5. [Epub ahead of print] PMID: 22481248
Tracing evolution of secretion systems and prokaryotic interactions
Interactions between organisms occupying the same environmental niche are facilitated by secreted proteins and peptides. Six secretion systems have been identified in pathogenic and endosymbiotic Gram-negative bacteria. The type III (T3) secretion system comprises a hollow needle-like structure, localized on the surface of bacterial cells, that injects specific bacterial proteins (effectors) directly into the cytoplasm of a host cell. During infection, effectors act in concert to convert host resources to the invader’s advantage and to promote pathogenicity. Advances in sequencing technologies produce an ever-growing number of bacterial genome sequences. As a result, the identification of bacterial type III effectors has shifted away from experimental discovery of individual proteins to whole-genome computational screening for effector-coding genes. We built pEffect, a method that predicts T3 effector proteins from features of the entire amino acid sequence. It combines homology-based inference with machine learning-based (support vector machine, SVM) de novo predictions and reaches 87% precision at 95% recall on a non-redundant test set, outperforming all other available methods. We are currently studying the evolution of T3 secretion across life and would like to trace the (likely) re-use of this machinery in non-secreting organisms.
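The combination of homology-based inference with a de novo model, and the precision and recall figures quoted above, can be illustrated with a minimal Python sketch. This is not the pEffect implementation: the rule used here (trust a homology call when one exists, otherwise fall back to the model score) is an assumption made for illustration, and the names predict_effector, precision_recall, known_calls, and toy_score are hypothetical.

```python
from typing import Callable, Optional


def predict_effector(
    seq: str,
    homology_lookup: Callable[[str], Optional[bool]],
    de_novo_score: Callable[[str], float],
    threshold: float = 0.0,
) -> bool:
    """Return the homology-based call if there is one, else threshold the model score.

    The fallback order is an assumption for this sketch, not a description of pEffect.
    """
    by_homology = homology_lookup(seq)
    if by_homology is not None:
        return by_homology
    return de_novo_score(seq) > threshold


def precision_recall(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical stand-ins: a lookup table plays the role of a homology search,
# and a serine-fraction score plays the role of a trained SVM decision value.
known_calls = {"MKTAYIAK": True, "MSSQLLNE": False}


def toy_score(seq: str) -> float:
    return seq.count("S") / len(seq) - 0.2


sequences = ["MKTAYIAK", "MSSQLLNE", "MSSSTPLK", "MKLVAAGG"]
calls = [predict_effector(s, known_calls.get, toy_score) for s in sequences]
```

Precision measures how many predicted effectors are real, and recall measures how many real effectors are found, so both are needed to judge a whole-genome screen.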
Function-based classification for a better understanding of the microbial world
We ask the question: how subjective is the current prokaryotic taxonomy? How does it correlate with the molecular functionality of the organisms it classifies? And, finally, can we do better? These are important questions to answer as we enter the world of microbiome/metagenome analysis for clinical/industrial uses, still wielding our 16S rRNA databases (and little else) as the primary source of annotation. The current taxonomic structure is deeply influenced by the hierarchical nature of “modification with descent” phylogenies. As horizontal gene transfer is widespread in prokaryotes, defining classes of organisms on the basis of standard evolutionary relationship metrics is a severely near-sighted approach. Thousands of prokaryotic genomes have been sequenced, providing the data necessary to classify microorganisms using whole genome content.
We built fusion (functional similarity-based organism network) and clustered organisms in this network. These organism clusters are “naturally” defined groups, with more functional similarity between organisms within a group than outside it. fusion uses phenetic comparisons of genome-encoded molecular functionalities and thus provides a view complementary to taxonomic clade assignment, as well as a consistent and quantitative metric for organism classification. Using fusion, we quantified the functional diversity that exists at all taxonomic levels. By examining the discrepancies between the taxonomic annotations of organisms and their fusion assignments, we were able to identify the environmental factors driving these differences.
We are currently aiming to identify sets of signature functions that represent individual fusion modules. We are also interested in using our definition of functional clusters to (1) identify the likely participants of molecular pathways, (2) trace the evolution of such pathways, and (3) establish environment-specific pathways.
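A functional similarity network of this kind can be sketched in a few lines of Python. The sketch is illustrative only: the similarity measure (Jaccard overlap of function sets), the edge threshold, and the clustering step (connected components) are assumptions standing in for whatever fusion actually uses, and the organism and function names are made up.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared functions divided by all functions found in either organism."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_edges(
    functions: dict[str, set[str]], min_similarity: float
) -> list[tuple[str, str, float]]:
    """Connect every pair of organisms whose similarity reaches the threshold."""
    edges = []
    for x, y in combinations(sorted(functions), 2):
        score = jaccard(functions[x], functions[y])
        if score >= min_similarity:
            edges.append((x, y, score))
    return edges


def clusters(nodes: list[str], edges: list[tuple[str, str, float]]) -> list[set[str]]:
    """Group organisms into connected components of the network."""
    neighbors: dict[str, set[str]] = {n: set() for n in nodes}
    for x, y, _ in edges:
        neighbors[x].add(y)
        neighbors[y].add(x)
    seen: set[str] = set()
    groups = []
    for start in sorted(nodes):
        if start in seen:
            continue
        group: set[str] = set()
        stack = [start]
        while stack:
            current = stack.pop()
            if current in group:
                continue
            group.add(current)
            stack.extend(neighbors[current] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical organisms described by the molecular functions their genomes encode.
organism_functions = {
    "org_A": {"f1", "f2", "f3"},
    "org_B": {"f2", "f3", "f4"},
    "org_C": {"f7", "f8"},
    "org_D": {"f7", "f8", "f9"},
}
edges = similarity_edges(organism_functions, min_similarity=0.5)
groups = clusters(list(organism_functions), edges)
```

In this toy data, org_A and org_B share two of four functions and org_C and org_D share two of three, so the organisms fall into two groups regardless of any taxonomic label attached to them.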
The work on this project is supported by Rutgers start-up funds, the Gordon and Betty Moore Foundation, the USDA-NIFA, the Technische Universität München Institute for Advanced Study Hans Fischer Fellowship, the German Excellence Initiative, and the European Union Seventh Framework Programme. It is being done in collaboration with Tom O. Delmont and Timothy M. Vogel.
Relevant Publications include:
Functional Basis of Microorganism Classification. Zhu C, Delmont TO, Vogel TM, Bromberg Y (2015) PLOS Computational Biology 11(8): e1004472.
Evaluating microbiome interactions and emergent functionality
How does the environment drive microbiome consolidation into a functional unit? Understanding the molecular functions encoded in the metagenomes of microbiomes is vital for the analysis of their behavior and, potentially, synthetic function optimization. The recent emergence of high-throughput genomic sequencing, coupled with growing analytical capacity, has unlocked new horizons in our understanding of the microbial world. There are currently over 80,000 sequenced metagenomic samples in the public domain. However, making sense of this deluge of data requires efficient and accurate computational techniques. The identification of microbial clades resident in a particular environmental niche gives only an estimate of the microbiome’s functional potential. Our function-based approach can be applied to microbiome analysis to assess functional diversity directly, rather than relying on somewhat arbitrary organism clade counts.
We developed mi-faser, a sequence alignment-based means of identifying the molecular functions encoded by microbiome metagenomes by mapping raw genetic reads to the functions of their corresponding “parent” genes. We will further estimate microbiome diversity by mapping the functions identified from metagenome reads onto fusion. We expect that our approach will recapitulate the genomes that can be assembled, as well as an additional large set of organisms that are present in the microbiome but not identifiable with current techniques.
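The idea of mapping reads to the functions of their “parent” genes can be illustrated with a minimal Python sketch. It does not reproduce the alignment or scoring used by mi-faser: the alignment step is assumed to have already produced one best-hit gene per read, and the names function_profile, relative_abundance, read_hits, and gene_function, along with the identifiers in the toy data, are hypothetical.

```python
from collections import Counter


def function_profile(read_hits: dict[str, str], gene_function: dict[str, str]) -> Counter:
    """Count reads per molecular function via the gene each read was aligned to.

    Reads whose best-hit gene has no annotated function are skipped.
    """
    profile: Counter = Counter()
    for gene_id in read_hits.values():
        function = gene_function.get(gene_id)
        if function is not None:
            profile[function] += 1
    return profile


def relative_abundance(profile: Counter) -> dict[str, float]:
    """Convert read counts into fractions of all annotated reads."""
    total = sum(profile.values())
    return {f: count / total for f, count in profile.items()} if total else {}


# Hypothetical alignment output (read -> best-hit gene) and gene annotations.
read_hits = {"read1": "geneA", "read2": "geneA", "read3": "geneB", "read4": "geneX"}
gene_function = {"geneA": "function_1", "geneB": "function_2"}

profile = function_profile(read_hits, gene_function)
abundance = relative_abundance(profile)
```

A profile like this describes a sample by what its reads can do rather than by which clades they come from, which is the kind of input a function-based diversity estimate would start from.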