Biological molecular function: methods and benchmarks for finding function in biological dark matter

Abstract

The accurate determination of biological molecular function remains one of the most significant challenges in computational biology, with vast areas of biological “dark matter” persisting in microbiomes, viruses, and unexplored sequence space. To meet this challenge, we developed a PSB session that addresses the limitations of traditional sequence similarity-based functional annotation methods and explores how recent advances in AI/ML and high-throughput data generation are transforming the field. We highlight four innovative contributions presented in this session: a geometric framework using signed distance functions for modeling protein surfaces; a reinforcement learning-based approach for steering protein generative models to design functional sequences; an ensemble framework combining sequence, structural, and network features for subcellular localization prediction; and a scalable factorization method integrating gene-gene interaction data for analyzing high-dimensional genetic perturbation profiles. Together, these methodologies showcase the potential of computational and AI-driven tools to address the complex, multiscale nature of molecular function prediction, paving the way for new discoveries in understanding and engineering biological systems.

Yana Bromberg
Principal Investigator - Professor of Bioinformatics

My research focuses on deciphering the DNA blueprints of life’s molecular machinery.
