Unified sampling framework and experimental benchmarking of sequence- and structure-based protein models

Aviv Spinner, Pascal Notin, Sam Berry, Dana Cortade, Zach Sisson, Svetlana Ikonomova, David Ross, Debora Marks

May 2026

Abstract

Generative models are increasingly used for protein design, but the lack of standardized evaluation frameworks limits comparison across model classes and hinders translation to experimental success. We developed a unified framework for sequence generation and benchmarking across multiple model types and tested it on Tobacco etch virus (TEV) protease. Experimental testing revealed substantial variation in performance, with machine learning-designed libraries achieving higher hit rates than conventional methods. Structure-based models performed better overall. Commonly used selection metrics did not correlate strongly with experimental activity, underscoring the importance of experimental validation in protein model development.

Type

Preprint

Publication

In bioRxiv

Sam Berry

Postdoctoral Fellow