Dogfooding rfab-harness on 10 Challenging Cancer Targets
De novo VHH nanobody design against MPNST, DIPG, neuroblastoma, glioblastoma, and pancreatic cancer
February 2026 — Preprint v3.0
What this is
We built rfab-harness to make the RFAntibody pipeline accessible via a single YAML config and CLI command. To test whether it actually works in practice, we pointed it at 10 of the hardest cancer targets we could find—targets where no effective antibody therapy exists and patients face dismal prognoses.
This page summarizes the results: 4,085 designs scored across 10 campaigns, 135 candidates that passed stringent quality filters, and a 60-fold variation in design difficulty across targets.
By the Numbers
What We Ran
Each campaign used the three-stage RFAntibody pipeline, orchestrated by rfab-harness:
- RFdiffusion — SE3-equivariant diffusion model generates backbone scaffolds conditioned on the target epitope (817 backbones total, 53–118 per campaign).
- ProteinMPNN — Graph neural network designs 5 amino acid sequences per backbone at temperature 0.2 (4,085 sequences total).
- RoseTTAFold2 — Structure prediction scores each design in complex with the target (10 recycling iterations per design).
All 10 campaigns ran in parallel on NVIDIA A100-80GB GPUs via Modal cloud infrastructure. One YAML per target, one CLI command to launch.
The 10 Targets
We screened 43 target–indication pairs across five cancers with devastating unmet need, then selected 10 structurally actionable targets for de novo VHH nanobody design:
| # | Target | Indication(s) | PDB | Backbones | Designs | Passed | Rate |
|---|---|---|---|---|---|---|---|
| 1 | B7-H3 | GBM, MPNST, DIPG, NB | 9LME | 82 | 410 | 11 | 2.7% |
| 2 | CD47 | GBM | 5IWL | 96 | 480 | 19 | 4.0% |
| 3 | CEACAM5 | PDAC | 8BW0 | 82 | 410 | 81 | 19.8% |
| 4 | EGFR | GBM, MPNST, PDAC | 1YY9 | 64 | 320 | 5 | 1.6% |
| 5 | EGFRvIII | GBM | 8UKV | 73 | 365 | 1 | 0.3% |
| 6 | EphA2 | GBM | 3SKJ | 70 | 350 | 3 | 0.9% |
| 7 | GPC2 | Neuroblastoma | 6WJL | 53 | 265 | 2 | 0.8% |
| 8 | HER2 Dom. IV | GBM, MPNST | 1N8Z | 72 | 360 | 1 | 0.3% |
| 9 | MSLN (N-term) | PDAC | 4F3F | 107 | 535 | 6 | 1.1% |
| 10 | MSLN (C-term) | PDAC | 7U8C | 118 | 590 | 6 | 1.0% |
| Total | | | | 817 | 4,085 | 135 | 3.3% |
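The totals and per-target rates in the table can be re-derived from the raw counts. The sketch below does only that arithmetic. The tuple layout and the names `CAMPAIGNS`, `pass_rate`, and `fold_range` are illustrative and not part of rfab-harness.

```python
# (target, backbones, designs, passed), copied from the table above.
CAMPAIGNS = [
    ("B7-H3", 82, 410, 11),
    ("CD47", 96, 480, 19),
    ("CEACAM5", 82, 410, 81),
    ("EGFR", 64, 320, 5),
    ("EGFRvIII", 73, 365, 1),
    ("EphA2", 70, 350, 3),
    ("GPC2", 53, 265, 2),
    ("HER2 Dom. IV", 72, 360, 1),
    ("MSLN (N-term)", 107, 535, 6),
    ("MSLN (C-term)", 118, 590, 6),
]
SEQS_PER_BACKBONE = 5  # ProteinMPNN sequences per backbone, per the text


def pass_rate(passed: int, designs: int) -> float:
    """Filter pass rate as a percentage."""
    return 100.0 * passed / designs


rates = {}
for name, backbones, designs, passed in CAMPAIGNS:
    # Each backbone should contribute exactly five designs.
    assert designs == backbones * SEQS_PER_BACKBONE, name
    rates[name] = pass_rate(passed, designs)
    print(f"{name:14s} {rates[name]:5.1f}%")

total_backbones = sum(c[1] for c in CAMPAIGNS)
total_designs = sum(c[2] for c in CAMPAIGNS)
total_passed = sum(c[3] for c in CAMPAIGNS)
overall_rate = pass_rate(total_passed, total_designs)

# Ratio of the best to the worst per-target rate (unrounded).
fold_range = max(rates.values()) / min(rates.values())
ceacam5_share = 81 / total_passed

print(f"Total: {total_backbones} backbones, {total_designs} designs, "
      f"{total_passed} passed ({overall_rate:.1f}%)")
print(f"Fold range: {fold_range:.0f}x, CEACAM5 share: {ceacam5_share:.0%}")
```

On unrounded rates the best-to-worst ratio is roughly 72-fold. Using the rounded table values (19.8% and 0.3%) it is 66-fold. Both are consistent with the "more than 60-fold" figure quoted on this page.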
Per-Target Pass Rates
Filter pass rates varied more than 60-fold across the 10 campaigns (0.3% to 19.8%), revealing substantial differences in computational design difficulty among cancer targets.
The targets clustered into three difficulty tiers:
- High tractability: CEACAM5 (19.8%), which alone contributed 60% of all passing designs (81 of 135)
- Moderate: CD47 (4.0%), B7-H3 (2.7%), EGFR (1.6%), MSLN-N (1.1%), MSLN-C (1.0%)
- Low tractability: EphA2 (0.9%), GPC2 (0.8%), EGFRvIII (0.3%), HER2 Dom. IV (0.3%)
Score Distributions
[Figure: joint distribution of pAE (interface confidence) and CDR RMSD (backbone accuracy) across all 4,085 designs, showing where the passing candidates fall.]
Key finding: CDR RMSD is the bottleneck
CDR RMSD—not pAE—is the dominant determinant of design success. Median CDR RMSDs of 8–21 Å across campaigns indicate that most designs produce CDR loops that are structurally reasonable in isolation (framework-aligned H3 RMSD medians of 1.0–1.6 Å) but incorrectly positioned relative to the target antigen. The RFdiffusion backbone generation step is the rate-limiting stage.
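One way to see which metric is the bottleneck is to tally, for each design, which filter it fails. The sketch below is a minimal illustration of that bookkeeping. This page does not state the filter thresholds, so `max_pae=10.0` and `max_cdr_rmsd=2.0` are placeholder assumptions, and the `DesignScore` records are made-up values rather than campaign data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignScore:
    name: str
    pae: float       # interface predicted aligned error (lower is better)
    cdr_rmsd: float  # CDR RMSD in angstroms (lower is better)


def classify(score: DesignScore, max_pae: float, max_cdr_rmsd: float) -> str:
    """Label a design by which of the two filters it satisfies."""
    pae_ok = score.pae < max_pae
    rmsd_ok = score.cdr_rmsd < max_cdr_rmsd
    if pae_ok and rmsd_ok:
        return "pass"
    if pae_ok:
        return "fail_rmsd_only"
    if rmsd_ok:
        return "fail_pae_only"
    return "fail_both"


def tally(scores, max_pae: float, max_cdr_rmsd: float) -> Counter:
    """Count designs per outcome label."""
    return Counter(classify(s, max_pae, max_cdr_rmsd) for s in scores)


# Hypothetical scores for illustration only; not taken from the campaigns.
EXAMPLE_SCORES = [
    DesignScore("design_1", pae=6.5, cdr_rmsd=1.4),
    DesignScore("design_2", pae=7.0, cdr_rmsd=12.3),
    DesignScore("design_3", pae=8.9, cdr_rmsd=19.0),
    DesignScore("design_4", pae=14.2, cdr_rmsd=1.8),
    DesignScore("design_5", pae=16.0, cdr_rmsd=9.5),
]

# Placeholder thresholds (assumptions, not the campaign settings).
counts = tally(EXAMPLE_SCORES, max_pae=10.0, max_cdr_rmsd=2.0)
for label, n in sorted(counts.items()):
    print(f"{label:15s} {n}")
```

If `fail_rmsd_only` dominates `fail_pae_only` on the real score CSVs, that would match the finding above that CDR RMSD, not pAE, limits success.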
Structural Quality Across Campaigns
Top Candidates
The three highest-confidence designs—immediate priorities for experimental validation:
What We Learned
Design difficulty is wildly target-dependent
The 60-fold range in pass rates (0.3% to 19.8%) substantially exceeds what sampling noise could explain. CEACAM5's concave epitope surface and extended H3 loop lengths (10–18 residues) drove its success. EGFRvIII's deletion junction neoepitope—a geometrically novel surface with limited structural precedent—drove its near-complete failure.
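The sampling-noise claim can be checked roughly with a chi-square test of homogeneity on the pass and fail counts. The sketch below is not the preprint's analysis. It treats every design as an independent trial, which overstates significance because designs sharing a backbone are correlated. The critical value 27.877 is the standard chi-square cutoff for 9 degrees of freedom at α = 0.001.

```python
# (designs, passed) per campaign, in table order.
COUNTS = [
    (410, 11), (480, 19), (410, 81), (320, 5), (365, 1),
    (350, 3), (265, 2), (360, 1), (535, 6), (590, 6),
]


def chi_square_homogeneity(counts) -> float:
    """Pearson chi-square statistic for 'all campaigns share one pass rate'."""
    total_designs = sum(n for n, _ in counts)
    total_passed = sum(k for _, k in counts)
    pooled = total_passed / total_designs
    stat = 0.0
    for n, k in counts:
        expected_pass = n * pooled
        expected_fail = n * (1.0 - pooled)
        stat += (k - expected_pass) ** 2 / expected_pass
        stat += ((n - k) - expected_fail) ** 2 / expected_fail
    return stat


CRITICAL_DF9_ALPHA_0_001 = 27.877  # chi-square critical value, df = 10 - 1

chi2_stat = chi_square_homogeneity(COUNTS)
rejects_single_rate = chi2_stat > CRITICAL_DF9_ALPHA_0_001
print(f"chi-square = {chi2_stat:.1f}, reject common rate: {rejects_single_rate}")
```

The statistic is far above the cutoff, driven mostly by CEACAM5, so a single shared pass rate is not a plausible explanation even allowing for the independence caveat.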
Model bias vs. intrinsic difficulty
Several lines of evidence favor the design-difficulty interpretation over model bias. Global fold quality (pLDDT) was uniform across all targets (medians 0.88–0.91). Template resolution did not predict success: CEACAM5 (19.8% pass) used a 3.1 Å template while EGFRvIII (0.3%) used a 2.9 Å template. The best-performing targets share concave epitope surfaces; the worst present convex or novel geometries.
Backbone topology is king
Passing designs concentrated on a small fraction of backbones. For B7-H3, 11 passing designs came from only 3 of 82 backbones (3.7%). For MSLN-C, 4 of 6 hits came from a single backbone. This suggests scaling backbone count—not sequences per backbone—is the most effective strategy for improving pass rates.
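A quick calculation shows how strong this concentration is. If the 11 B7-H3 hits were spread independently across designs at the campaign-wide rate, about 10 distinct backbones would be expected to carry at least one hit, compared with the 3 observed. The sketch below assumes a simple binomial model, and the function name `expected_hit_backbones` is illustrative.

```python
def expected_hit_backbones(n_backbones: int, n_passed: int,
                           seqs_per_backbone: int = 5) -> float:
    """Expected number of backbones with >= 1 passing design if every
    design passed independently at the campaign-wide rate."""
    p_design = n_passed / (n_backbones * seqs_per_backbone)
    p_backbone_hit = 1.0 - (1.0 - p_design) ** seqs_per_backbone
    return n_backbones * p_backbone_hit


# B7-H3: 82 backbones, 11 passing designs, 3 backbones observed with hits.
b7h3_expected = expected_hit_backbones(n_backbones=82, n_passed=11)
b7h3_observed = 3
print(f"B7-H3: expected {b7h3_expected:.1f} backbones with a hit "
      f"under independence, observed {b7h3_observed}")
```

The gap between expected and observed is what motivates sampling more backbones rather than more sequences per backbone: under this reading, whether a design passes depends mainly on the backbone, and extra sequences on a poor backbone rarely help.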
Cross-indication efficiency
B7-H3 is overexpressed across four of the five indications studied (MPNST, DIPG, neuroblastoma, GBM). A single high-affinity VHH against B7-H3 could serve as a therapeutic or diagnostic agent across all four, a combined population of approximately 15,000 patients annually. The 12 combined mesothelin candidates (6 N-terminal, 6 C-terminal) target two non-overlapping epitopes, which could enable construction of a dual-epitope (biparatopic) VHH for enhanced PDAC therapy.
Limitations
The backbone counts (53–118 per campaign) are ~100× smaller than production-scale runs. Computational metrics are predictive but not definitive—experimental validation is essential. The absence of ΔΔG scoring means binding energy is not directly assessed. These are sanity-check-scale campaigns; scale-up runs with 10,000+ backbones are planned for the most promising targets.
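The small sample sizes translate into wide uncertainty on the low pass rates. The sketch below computes a 95% Wilson score interval for a campaign's pass rate and projects it to a 10,000-backbone run. It assumes the pass rate stays constant at scale and that designs are independent, neither of which is established here, so the output is a rough planning range rather than a prediction.

```python
import math


def wilson_interval(passed: int, designs: int, z: float = 1.96):
    """95% Wilson score interval for a binomial proportion (z = 1.96)."""
    p = passed / designs
    denom = 1.0 + z * z / designs
    center = (p + z * z / (2.0 * designs)) / denom
    half = z * math.sqrt(p * (1.0 - p) / designs
                         + z * z / (4.0 * designs * designs)) / denom
    return center - half, center + half


def projected_passes(passed: int, designs: int, n_backbones: int,
                     seqs_per_backbone: int = 5):
    """Low, point, and high estimates of passing designs at a larger scale."""
    lo, hi = wilson_interval(passed, designs)
    n_designs = n_backbones * seqs_per_backbone
    return lo * n_designs, (passed / designs) * n_designs, hi * n_designs


# EGFRvIII: 1 of 365 designs passed; project to 10,000 backbones.
low, point, high = projected_passes(passed=1, designs=365, n_backbones=10_000)
print(f"EGFRvIII at 10,000 backbones: {low:.0f} to {high:.0f} passing designs "
      f"(point estimate {point:.0f})")
```

For a campaign with a single hit, the interval spans more than an order of magnitude, which is one reason to treat the lowest pass rates on this page as approximate.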
Everything is open source
Full v3 Preprint (PDF) | GitHub: Code, Configs & Results | rfab-harness Homepage
All 10 campaign configurations, analysis scripts, score CSVs, and designed sequences are available for the computational biology and cancer immunotherapy communities.