Dogfooding rfab-harness on 10 Challenging Cancer Targets

De novo VHH nanobody design against MPNST, DIPG, neuroblastoma, glioblastoma, and pancreatic cancer

February 2026 — Preprint v3.0

What this is

We built rfab-harness to make the RFAntibody pipeline accessible via a single YAML config and CLI command. To test whether it works in practice, we pointed it at 10 of the hardest cancer targets we could find: targets in indications where no effective antibody therapy exists and patients face dismal prognoses.

This page summarizes the results: 4,085 designs scored across 10 campaigns, 135 candidates that passed stringent quality filters, and a 60-fold variation in design difficulty across targets.

By the Numbers

Cancer targets: 10
Indications: 5
Designs scored: 4,085
Passed filters: 135
Global pass rate: 3.3%
Wall-clock time (10 A100 GPUs): ~75 min
Range in per-target pass rates: 60×

What We Ran

Each campaign used the three-stage RFAntibody pipeline, orchestrated by rfab-harness:

  1. RFdiffusion: an SE(3)-equivariant diffusion model generates backbone scaffolds conditioned on the target epitope (817 backbones total, 53–118 per campaign).
  2. ProteinMPNN: a graph neural network designs five amino-acid sequences per backbone at temperature 0.2 (4,085 sequences total).
  3. RoseTTAFold2: structure prediction scores each design in complex with the target (10 recycling iterations per design).

All 10 campaigns ran in parallel on NVIDIA A100-80GB GPUs via Modal cloud infrastructure. One YAML per target, one CLI command to launch.

Design funnel showing attrition from 817 backbones to 4,085 sequences to 135 passing designs
Figure 1. Design funnel. 817 backbone scaffolds expanded to 4,085 sequence designs, of which 135 (3.3%) passed dual-threshold filtering (pAE < 10, CDR RMSD < 2.0 Å).
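
The dual-threshold filter in the Figure 1 caption can be written as a short predicate. This is a minimal sketch, not the harness's own code: the `DesignScore` record, its field names, and the example scores are hypothetical and used only for illustration. Only the two thresholds (pAE < 10, CDR RMSD < 2.0 Å) come from the caption.

```python
from dataclasses import dataclass

# Thresholds quoted in the Figure 1 caption.
PAE_MAX = 10.0       # interface pAE must be strictly below this
CDR_RMSD_MAX = 2.0   # CDR RMSD (in Å) must be strictly below this


@dataclass(frozen=True)
class DesignScore:
    """Hypothetical record; field names are illustrative, not the harness's CSV schema."""
    design_id: str
    pae: float
    cdr_rmsd: float


def passes_filters(score: DesignScore) -> bool:
    """A design passes only if it clears both thresholds."""
    return score.pae < PAE_MAX and score.cdr_rmsd < CDR_RMSD_MAX


def funnel(scores: list[DesignScore]) -> tuple[int, int, float]:
    """Return (designs scored, designs passing, pass rate as a fraction)."""
    passed = sum(passes_filters(s) for s in scores)
    rate = passed / len(scores) if scores else 0.0
    return len(scores), passed, rate


# Made-up scores for illustration only.
examples = [
    DesignScore("demo_a", pae=2.6, cdr_rmsd=0.7),    # clears both filters
    DesignScore("demo_b", pae=3.1, cdr_rmsd=12.4),   # low pAE, CDR RMSD too high
    DesignScore("demo_c", pae=14.8, cdr_rmsd=1.5),   # low CDR RMSD, pAE too high
]
print(funnel(examples))
```

Requiring both conditions at once is what makes the filter stringent: a design with a confident interface but displaced CDR loops is rejected just like one with the opposite problem.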

The 10 Targets

We screened 43 target–indication pairs across five cancers with devastating unmet need, then selected 10 structurally actionable targets for de novo VHH nanobody design:

#   Target          Indication(s)         PDB   Backbones  Designs  Passed  Rate
1   B7-H3           GBM, MPNST, DIPG, NB  9LME  82         410      11      2.7%
2   CD47            GBM                   5IWL  96         480      19      4.0%
3   CEACAM5         PDAC                  8BW0  82         410      81      19.8%
4   EGFR            GBM, MPNST, PDAC      1YY9  64         320      5       1.6%
5   EGFRvIII        GBM                   8UKV  73         365      1       0.3%
6   EphA2           GBM                   3SKJ  70         350      3       0.9%
7   GPC2            Neuroblastoma         6WJL  53         265      2       0.8%
8   HER2 Dom. IV    GBM, MPNST            1N8Z  72         360      1       0.3%
9   MSLN (N-term)   PDAC                  4F3F  107        535      6       1.1%
10  MSLN (C-term)   PDAC                  7U8C  118        590      6       1.0%
    Total                                       817        4,085    135     3.3%
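
The totals and rates in the table can be re-derived from its per-campaign counts. The sketch below transcribes the backbone, design, and pass counts from the table and recomputes the totals, the per-target pass rates, and the ratio between the best and worst rate. The variable names are illustrative; nothing here is read from the published score CSVs.

```python
# (target, backbones, designs, passed), transcribed from the table above.
CAMPAIGNS = [
    ("B7-H3", 82, 410, 11),
    ("CD47", 96, 480, 19),
    ("CEACAM5", 82, 410, 81),
    ("EGFR", 64, 320, 5),
    ("EGFRvIII", 73, 365, 1),
    ("EphA2", 70, 350, 3),
    ("GPC2", 53, 265, 2),
    ("HER2 Dom. IV", 72, 360, 1),
    ("MSLN (N-term)", 107, 535, 6),
    ("MSLN (C-term)", 118, 590, 6),
]
SEQS_PER_BACKBONE = 5  # ProteinMPNN sequences per backbone, as stated in "What We Ran"

# Every campaign should contain exactly five designs per backbone.
assert all(d == b * SEQS_PER_BACKBONE for _, b, d, _ in CAMPAIGNS)

total_backbones = sum(b for _, b, _, _ in CAMPAIGNS)
total_designs = sum(d for _, _, d, _ in CAMPAIGNS)
total_passed = sum(p for _, _, _, p in CAMPAIGNS)
global_rate = 100.0 * total_passed / total_designs

# Per-target pass rate in percent, computed from unrounded counts.
rates = {name: 100.0 * p / d for name, _, d, p in CAMPAIGNS}
fold_range = max(rates.values()) / min(rates.values())

print(total_backbones, total_designs, total_passed, f"{global_rate:.1f}%")
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14} {rate:5.1f}%")
print(f"best/worst pass-rate ratio: {fold_range:.0f}x")
```

Computed from the unrounded counts, the best-to-worst ratio comes out somewhat higher than the ratio of the rounded percentages (19.8% over 0.3%). Both sit above the 60-fold figure quoted on this page.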

Per-Target Pass Rates

Filter pass rates varied more than 60-fold across the 10 campaigns, revealing substantial differences in computational design difficulty among cancer targets:

Bar chart of filter pass rates by target
Figure 2. Filter pass rates by target. CEACAM5 dominates with 81 passing designs (19.8%), while EGFRvIII and HER2 Domain IV each produced only a single passing candidate.

The targets clustered into three difficulty tiers:

Score Distributions

The joint distribution of pAE (interface confidence) and CDR RMSD (backbone accuracy) across all 4,085 designs shows where the passing candidates live:

Scatter plot of pAE vs CDR RMSD for all designs
Figure 3. pAE vs. CDR RMSD for all 4,085 designs, colored by campaign. The shaded lower-left quadrant marks the 135 designs passing both filters. The dense cluster at low pAE / high RMSD represents designs predicted to sit confidently at the interface but with CDR loops that deviate from the designed pose.

Key finding: CDR RMSD is the bottleneck

CDR RMSD, not pAE, is the dominant determinant of design success. Per-campaign median CDR RMSDs of 8–21 Å, set against framework-aligned H3 RMSD medians of only 1.0–1.6 Å, indicate that most designs produce CDR loops that are structurally reasonable in isolation but incorrectly positioned relative to the target antigen. This points to the RFdiffusion backbone generation step as the rate-limiting stage.
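
A toy calculation shows how a loop can have the right shape and still score a large RMSD. The coordinates below are invented, and the sketch removes only a translational offset by centering both point sets. A real comparison would also fit a rotation, and the exact alignment frames used by the pipeline are not reproduced here.

```python
import math

Point = tuple[float, float, float]


def rmsd(a: list[Point], b: list[Point]) -> float:
    """Root-mean-square deviation between two equally sized point sets, no fitting."""
    if len(a) != len(b) or not a:
        raise ValueError("point sets must be non-empty and the same length")
    sq = sum((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
             for (x1, y1, z1), (x2, y2, z2) in zip(a, b))
    return math.sqrt(sq / len(a))


def center(points: list[Point]) -> list[Point]:
    """Translate a point set so its centroid sits at the origin."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in points]


# Invented 4-atom "loop" (Å) and a copy with identical shape shifted 10 Å along x.
designed = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 1.0, 0.0), (4.5, 1.0, 1.0)]
predicted = [(x + 10.0, y, z) for x, y, z in designed]

positional = rmsd(designed, predicted)                  # placement error counts
shape_only = rmsd(center(designed), center(predicted))  # offset removed first

print(f"positional RMSD: {positional:.2f} Å, shape-only RMSD: {shape_only:.2f} Å")
```

The same loop gives a large RMSD when its placement counts and a near-zero RMSD once the offset is removed. That is the pattern the medians above describe: loops that are sound in shape but misplaced on the target.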

Structural Quality Across Campaigns

Heatmap of key structural metrics across all 10 campaigns
Figure 4. Key structural metrics across all 10 campaigns. pLDDT (global fold confidence) is uniformly high (0.88–0.91), confirming the VHH framework is robustly recapitulated regardless of target. The variation is concentrated in interface metrics (pAE) and CDR positioning (RMSD).

Violin plots of CDR-H3 loop RMSD
Figure 5. Framework-aligned CDR-H3 loop RMSD. The consistently low H3 RMSDs (medians 1.0–1.6 Å) show that CDR-H3 loop geometry is well-preserved by ProteinMPNN—the design challenge lies in positioning these loops correctly on the target, not in their intrinsic shape.

Top Candidates

Table of best-scoring candidate per target
Figure 6. Best-scoring candidate per target, ranked by composite score. The strongest candidates (CEACAM5, B7-H3, MSLN-N) achieve pAE below 3.0 Å and CDR RMSD below 1.5 Å.

The three highest-confidence designs—immediate priorities for experimental validation:

CEACAM5: pAE 2.59, RMSD 0.68 Å
MSLN-N: pAE 2.19, RMSD 1.33 Å
B7-H3: pAE 2.20, RMSD 1.10 Å

What We Learned

Design difficulty is wildly target-dependent

The 60-fold range in pass rates (0.3% to 19.8%) substantially exceeds what sampling noise could explain. CEACAM5's concave epitope surface and extended H3 loop lengths (10–18 residues) drove its success. EGFRvIII's deletion junction neoepitope—a geometrically novel surface with limited structural precedent—drove its near-complete failure.
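
One rough way to check the sampling-noise claim is to ask how likely the extreme campaigns would be if every design passed with the same global probability (135 of 4,085). The sketch below computes exact binomial tail probabilities with the standard library. It assumes designs are independent, which is a simplification: designs sharing a backbone are correlated, so the true tails are wider than these numbers suggest.

```python
import math


def binom_log_pmf(k: int, n: int, p: float) -> float:
    """Log of the binomial probability mass P(X = k) for X ~ Binomial(n, p)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log1p(-p)


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k): chance of at least k passes among n designs."""
    return sum(math.exp(binom_log_pmf(i, n, p)) for i in range(k, n + 1))


def binom_lower_tail(k: int, n: int, p: float) -> float:
    """P(X <= k): chance of at most k passes among n designs."""
    return sum(math.exp(binom_log_pmf(i, n, p)) for i in range(0, k + 1))


GLOBAL_RATE = 135 / 4085  # pooled pass rate across all 10 campaigns

# CEACAM5: 81 passes out of 410 designs. EGFRvIII: 1 pass out of 365 designs.
p_ceacam5 = binom_upper_tail(81, 410, GLOBAL_RATE)
p_egfrviii = binom_lower_tail(1, 365, GLOBAL_RATE)

print(f"P(>= 81 of 410 | global rate) = {p_ceacam5:.3g}")
print(f"P(<= 1 of 365 | global rate)  = {p_egfrviii:.3g}")
```

Both tails are very small under this null model, in line with the statement that the spread in pass rates is not a sampling artifact. The independence caveat still applies when reading the exact values.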

Model bias vs. intrinsic difficulty

Several lines of evidence favor the design-difficulty interpretation over model bias. Global fold quality (pLDDT) was uniform across all targets (medians 0.88–0.91). Template resolution did not predict success: CEACAM5 (19.8% pass) used a 3.1 Å template while EGFRvIII (0.3%) used a 2.9 Å template. The best-performing targets share concave epitope surfaces; the worst present convex or novel geometries.

Backbone topology is king

Passing designs concentrated on a small fraction of backbones. For B7-H3, 11 passing designs came from only 3 of 82 backbones (3.7%). For MSLN-C, 4 of 6 hits came from a single backbone. This suggests scaling backbone count—not sequences per backbone—is the most effective strategy for improving pass rates.
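
The B7-H3 numbers make the concentration concrete. The sketch below uses only figures quoted on this page (82 backbones, 3 hit backbones, 11 passing designs, five sequences per backbone). The function name and dictionary keys are illustrative.

```python
SEQS_PER_BACKBONE = 5  # sequences designed per backbone in these campaigns


def backbone_concentration(n_backbones: int, n_hit_backbones: int, n_passed: int) -> dict[str, float]:
    """Compare the overall pass rate with the pass rate on backbones that produced a hit."""
    designs_total = n_backbones * SEQS_PER_BACKBONE
    designs_on_hits = n_hit_backbones * SEQS_PER_BACKBONE
    return {
        "hit_backbone_fraction": n_hit_backbones / n_backbones,
        "pass_rate_overall": n_passed / designs_total,
        "pass_rate_on_hit_backbones": n_passed / designs_on_hits,
    }


# B7-H3: 11 passing designs from 3 of 82 backbones.
b7h3 = backbone_concentration(n_backbones=82, n_hit_backbones=3, n_passed=11)
for key, value in b7h3.items():
    print(f"{key}: {value:.1%}")
```

On the three productive backbones, 11 of the 15 possible designs passed, while none of the designs on the other 79 backbones did. Under that pattern, extra sequences on an unproductive backbone add little, whereas each new backbone is a fresh chance at a productive topology.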

Cross-indication efficiency

B7-H3 is overexpressed in four of the five indications studied (MPNST, DIPG, neuroblastoma, GBM). A single high-affinity VHH against B7-H3 could serve as a therapeutic or diagnostic agent across all four, a combined population of approximately 15,000 patients annually. The 12 combined mesothelin candidates target two non-overlapping epitopes, which could enable biparatopic (dual-epitope) VHH constructs for enhanced PDAC therapy.

Limitations

The backbone counts (53–118 per campaign) are ~100× smaller than production-scale runs. Computational metrics are predictive but not definitive—experimental validation is essential. The absence of ΔΔG scoring means binding energy is not directly assessed. These are sanity-check-scale campaigns; scale-up runs with 10,000+ backbones are planned for the most promising targets.

Everything is open source

Full v3 Preprint (PDF)  |  GitHub: Code, Configs & Results  |  rfab-harness Homepage

All 10 campaign configurations, analysis scripts, score CSVs, and designed sequences are available for the computational biology and cancer immunotherapy communities.