A generalizable deep learning framework for structure-based protein–ligand affinity ranking

Benjamin P. Brown

doi:10.1073/pnas.2508998122

What is it about?

This work addresses the problem of predicting how strongly a candidate small-molecule drug interacts with a target protein. Ideally, such interactions could be scored and ranked rapidly enough to identify the best compounds among billions. That scale is too large for high-quality physics-based approaches, and less expensive empirical methods still face challenges. Machine learning (ML) methods have considerable potential to improve early-stage drug discovery, but they often perform poorly when applied prospectively. Here, we demonstrate an approach to improve ML model performance on new tasks in drug discovery.


Why is it important?

In this work, we prioritized generalizability over raw accuracy. High scores on standard benchmarks can be misleading, because they often reflect how well a model recognizes patterns it has already seen rather than how deeply it understands the underlying chemistry. To me, overall accuracy was less important than consistency and predictability, that is, how well the model behaves when confronted with new protein families or chemistries. This perspective shaped both the model design and the main evaluation strategy. Our results demonstrate that a model constrained to learn only from a representation of chemical interactions maintains stable performance across unseen targets. This result provides a roadmap for the development of more accurate models that generalize effectively.
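To make the idea of learning only from a representation of interactions more concrete, here is a minimal Python sketch. It is not the architecture from the paper: the function name `interaction_features`, the distance cutoff, the bin width, and the toy coordinates are all assumptions chosen for illustration. The sketch describes a protein–ligand complex purely by counts of close protein–ligand atom pairs, so anything that does not take part in a contact never reaches a downstream model.

```python
import math
from collections import Counter

def interaction_features(protein_atoms, ligand_atoms, cutoff=4.0, bin_width=1.0):
    """Count protein-ligand atom pairs by (protein element, ligand element, distance bin).

    Hypothetical featurization for illustration only. Pairs farther apart than
    `cutoff` (in angstroms) are ignored, so atoms that make no contact leave no
    trace in the features.
    """
    counts = Counter()
    for p_elem, p_xyz in protein_atoms:
        for l_elem, l_xyz in ligand_atoms:
            d = math.dist(p_xyz, l_xyz)
            if d < cutoff:
                counts[(p_elem, l_elem, int(d // bin_width))] += 1
    return counts

# Toy complex (made-up coordinates): two pocket atoms and one distant atom.
protein = [("O", (0.0, 0.0, 0.0)), ("N", (3.0, 0.0, 0.0)), ("C", (20.0, 0.0, 0.0))]
ligand = [("N", (0.0, 2.9, 0.0)), ("C", (3.0, 3.5, 0.0))]

features = interaction_features(protein, ligand)

# Dropping the distant protein atom does not change the features.
features_pocket_only = interaction_features(protein[:2], ligand)

# Moving the whole complex in space does not change them either.
def shift(atoms, dx, dy, dz):
    return [(e, (x + dx, y + dy, z + dz)) for e, (x, y, z) in atoms]

features_shifted = interaction_features(
    shift(protein, 10.0, -5.0, 2.0), shift(ligand, 10.0, -5.0, 2.0)
)
```

In this toy setup, a model trained on such features cannot memorize properties of the ligand or the protein alone, because neither is represented on its own; only the contacts between them are.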

Perspectives

This manuscript is an exploration of learning spaces. A model's architecture defines the manifold on which learning occurs. We often have an idea of what we want the model to learn, and it is easy to assume that the network will tend to learn the problem the way we think about it. The challenge is that the model needs a massive amount of data to guide it toward learning the problem the way we intend. The approach here was to use a task-specific architecture. Instead of guiding the model to focus on interactions, we restrict its learning space to them. I hope this approach informs the design of increasingly accurate models for drug discovery that maintain generalizability.
Benjamin Brown
Vanderbilt University

This page is a summary of: A generalizable deep learning framework for structure-based protein–ligand affinity ranking, Proceedings of the National Academy of Sciences, October 2025.
DOI: 10.1073/pnas.2508998122

Contributors

The following have contributed to this page

Benjamin Brown
Vanderbilt University

A deep learning approach to drug discovery built on our intuition of chemical interactions
