<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>uncertainty | BrombergLab</title>
    <link>https://bromberglab.org/tag/uncertainty/</link>
      <atom:link href="https://bromberglab.org/tag/uncertainty/index.xml" rel="self" type="application/rss+xml" />
    <description>uncertainty</description>
    <generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><copyright>© BrombergLab 2026</copyright><lastBuildDate>Mon, 08 Jun 2026 00:00:00 +0000</lastBuildDate>
    <image>
      <url>https://bromberglab.org/media/logo.png</url>
      <title>uncertainty</title>
      <link>https://bromberglab.org/tag/uncertainty/</link>
    </image>
    
    <item>
      <title>ProtTale</title>
      <link>https://bromberglab.org/project/prottale/</link>
      <pubDate>Mon, 08 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://bromberglab.org/project/prottale/</guid>
      <description>&lt;h2 id=&#34;reliability-aware-generative-annotation-of-protein-function&#34;&gt;Reliability-aware Generative Annotation of Protein Function&lt;/h2&gt;
&lt;p&gt;Genome sequencing and the corresponding gene/protein discovery vastly outpace functional
characterization, leaving much of protein space functionally dark. Generative protein-to-text models
annotate sequences with free text but offer no reliability signal, and surface metrics cannot tell whether
two descriptions refer to the same molecular function. Here we present ProtTale, which couples
sequence-to-text generation with a built-in reliability head and an LLM-as-judge protocol that scores
functional equivalence at the semantic level. On 1,031 unseen Swiss-Prot proteins held out at 40% identity,
ProtTale and four baselines reach similar accuracy but cover orthogonal slices, with ProtTale uniquely
recovering 60 proteins missed by every other method. The reliability head raises ProtTale’s confident match
rate from 26.5% to 44.4% under a discrete filter and to 90% under a continuous score. By providing a
per-prediction reliability score, ProtTale enables users to selectively retain only trustworthy annotations,
making generative function annotation practically useful even when accuracy saturates.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Random Neighbor Score</title>
      <link>https://bromberglab.org/project/rns/</link>
      <pubDate>Mon, 08 Jun 2026 00:00:00 +0000</pubDate>
      <guid>https://bromberglab.org/project/rns/</guid>
      <description>&lt;h2 id=&#34;quantifying-uncertainty-in-protein-representations-across-models-and-tasks&#34;&gt;Quantifying uncertainty in protein representations across models and tasks&lt;/h2&gt;
&lt;p&gt;Biomolecular embeddings serve as efficient representations of sequence and structure, enabling tasks such as similarity searches, structure and function prediction, and estimation of biophysical properties. However, relying on embeddings without assessing their ability to accurately represent biomolecules is a critical flaw, akin to using a scalpel in surgery without verifying its sharpness. Here we propose a means to evaluate the capacity of protein language models to encode biologically meaningful information. For each protein, representation uncertainty is scored as the fraction of non-biological ‘synthetic’ sequences among its nearest neighbors in latent space. Our analysis reveals that low-quality embeddings often fail to capture meaningful biology, displaying vector properties indistinguishable from those of randomly generated sequences. Our model-agnostic scoring framework is, to our knowledge, the first to quantify protein sequence embedding reliability. It enables embedding screening prior to downstream applications and inferences, significantly improving their reliability. We propose that embedding evaluation should be undertaken for other uses of language models in science as well.&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>
