CLubbeR is an automated cluster load balancing system designed specifically to facilitate and accelerate common computational biology experimental workflows. It is used in conjunction with existing methods or scripts to efficiently process large-scale datasets. The method was developed by Maximilian Miller (BrombergLab @ Rutgers University and Rostlab @ Technical University of Munich).

If you find CLubbeR useful please cite:

Miller, Maximilian, Chengsheng Zhu, and Yana Bromberg. "clubber: removing the bioinformatics bottleneck in big data analyses." Journal of Integrative Bioinformatics 14.2 (2017).

GIT repository containing all sources

Docker container for clubber in the Docker Store

The fastest and easiest way to use clubber is to run the bromberglab/clubber:latest Docker image, which will be automatically retrieved from the Docker cloud if it is not available locally:

docker run -d -p 80:80 bromberglab/clubber:latest

CLubbeR's plug-and-play framework encourages re-use of existing solutions for bioinformatics problems. CLubbeR's goals are to reduce computation times and to facilitate the use of cluster computing. The first goal is achieved by automating the balance of parallel submissions across available high performance computing (HPC) resources. Notably, such resources can be added on demand, and they may include cloud-based resources and heterogeneous environments. The second goal, making HPC resources user-friendly, is supported by an interactive web interface and a RESTful API that allow for job monitoring and result retrieval. We used CLubbeR to speed up our pipeline for annotating the molecular functionality of metagenomes. We analyzed the Deepwater Horizon oil-spill study data to show quantitatively that the beach sands have not yet entirely recovered. Further, our analysis of the CAMI-challenge data revealed that microbiome taxonomic shifts do not necessarily correlate with functional shifts. These examples (21 metagenomes processed in 172 minutes) illustrate the importance of clubber in the everyday computational biology environment.
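
To make the idea of balancing parallel submissions across several HPC resources more concrete, here is a minimal Python sketch. It is not CLubbeR's actual scheduling code: the function balance_jobs, the cluster names, and the slot counts are hypothetical, and the greedy "least loaded per slot" rule is only one simple way such balancing could be done.

```python
import heapq


def balance_jobs(jobs, clusters):
    """Greedily assign each job to the cluster with the lowest load per slot.

    jobs:     list of job identifiers (hypothetical names)
    clusters: dict mapping cluster name -> number of usable slots (> 0)
    Returns a dict mapping cluster name -> list of assigned jobs.
    """
    if not clusters:
        raise ValueError("at least one cluster is required")
    if any(slots <= 0 for slots in clusters.values()):
        raise ValueError("every cluster needs a positive slot count")

    # Heap entries are (assigned jobs / slots, cluster name); ties are
    # broken alphabetically by cluster name.
    heap = [(0.0, name) for name in clusters]
    heapq.heapify(heap)

    assignment = {name: [] for name in clusters}
    for job in jobs:
        _, name = heapq.heappop(heap)  # currently least-loaded cluster
        assignment[name].append(job)
        load = len(assignment[name]) / clusters[name]
        heapq.heappush(heap, (load, name))
    return assignment


# Hypothetical setup: 21 samples spread over three resources of different size.
plan = balance_jobs(
    [f"sample_{i}" for i in range(21)],
    {"local_hpc": 4, "cloud": 2, "campus_grid": 1},
)
for name, assigned in plan.items():
    print(name, len(assigned))
```

In this sketch, larger resources receive proportionally more jobs, and adding a resource on demand would amount to adding another entry to the clusters dictionary before submitting the next batch. A real system would also need to account for job run times, queue states, and failures, which this example leaves out.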