fusionDB

Access Online

Fusion is a method for classifying microorganisms based on their functional similarities. It was developed by Chengsheng Zhu at the Bromberg Lab at Rutgers University, New Jersey, and made publicly available by Yannick Malich.

Correctly identifying the nearest “neighbors” of a given microorganism is important in industrial and clinical applications, where close relationships imply similar treatment. Today, prokaryotic taxonomy relies heavily on phylogenetics. However, evolutionary relatedness inferred from phylogenetic markers guarantees neither functional identity between members of the same taxon nor a lack of similarity between different taxa. By comparing all the molecular functions encoded in the genomes of different microbes, we built Fusion, a novel microorganism classification network in which two organisms (nodes) are connected by an edge whose weight is their functional similarity. Fusion uses phenetic comparisons, providing a consistent and quantitative metric for classification. It is independent of the arbitrary pairwise organism similarity cutoffs traditionally applied to establish taxonomic identity, and it is more robust to data availability biases. The size of Fusion-defined organism clusters can be adjusted via resolution controls to meet specific research purposes, and this dynamic feature lets Fusion accommodate newly sequenced organisms. In addition, Fusion highlights the environmental factors behind observed microorganism diversification, together with the corresponding key functions. We believe Fusion will be a more practical choice for biomedical, industrial, and ecological applications, as many of these rely on understanding the functional capabilities of the microbes in their environment.
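To make the network structure concrete, here is a minimal Python sketch of how a weighted organism-similarity edge list could be assembled from per-organism function sets. It is an illustration only: the Jaccard overlap used as the edge weight is an assumption standing in for the similarity measure defined in the manuscript, and the function names (`functional_similarity`, `build_similarity_edges`) and the toy organisms are hypothetical and are not taken from the Fusion scripts.

```python
from itertools import combinations


def functional_similarity(functions_a, functions_b):
    """Jaccard overlap of two function sets (an assumed stand-in metric)."""
    a, b = set(functions_a), set(functions_b)
    union = a | b
    if not union:
        return 0.0
    return float(len(a & b)) / len(union)


def build_similarity_edges(organism_functions):
    """Return (organism, organism, weight) edges for pairs sharing any function."""
    edges = []
    for org_a, org_b in combinations(sorted(organism_functions), 2):
        weight = functional_similarity(
            organism_functions[org_a], organism_functions[org_b]
        )
        if weight > 0:
            edges.append((org_a, org_b, weight))
    return edges


# Hypothetical toy input: organism name -> set of function cluster IDs.
toy_organisms = {
    "organism_A": {"F1", "F2", "F3"},
    "organism_B": {"F2", "F3", "F4"},
    "organism_C": {"F5"},
}

edges = build_similarity_edges(toy_organisms)
for source, target, weight in edges:
    print(source, target, weight)
```

An edge list of this shape (source, target, weight) is the kind of input a network tool such as Gephi can load for layout and cluster detection. The sketch deliberately stops before clustering, since the resolution-controlled clustering step is described in the manuscript rather than here.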

fusionDB is a novel database that uses our functional data to represent 1,374 taxonomically distinct bacteria annotated with available metadata: habitat/niche, preferred temperature, and oxygen use. Users can search fusionDB via combinations of organism names and metadata. Moreover, the web interface allows mapping new microbial genomes to the functional spectrum of reference bacteria, rendering interactive similarity networks that highlight shared functionality.

Fusion online data1 (organism similarity)

Fusion online data2 (module assignments)

The data and scripts below are provided for review purposes and are referenced in the submitted manuscript. These are sufficient to reproduce the Fusion network as described in the manuscript. See ReadMe.txt below for more detailed information. Contact czhu@bromberglab.org with any questions.

Protein sequences from 1,374 bacterial genomes

Gephi input to produce Fusion

Mapping from the function clusters to protein GI IDs

Python2 scripts for data processing

ReadMe.txt

1374bacteria_list

1374bacteria_header

fusionDB example (Synechococcus bacterium, freshwater)

SOM

fusionDB ISMB2017 Supporting Online Material

fusionDB ISMB2017 Supporting Online Material - Table 3

fusionDB ISMB2017 Supporting Online Material - Table 4

fusionDB ISMB2017 Supporting Online Material - Table 5