amina-de-novo-protein-binder-design
De Novo Protein Binder Design Pipeline
Overview
This pipeline designs novel protein binders for a target protein structure through the following phases:
- Target Preparation -- Clean and validate the input PDB
- Epitope Selection -- Identify binding sites via PeSTo + DBSCAN clustering
- Binder Length Optimization -- Calculate optimal binder size from epitope geometry
- Backbone Generation -- Generate candidate scaffolds with RFdiffusion
- Sequence Design -- Design sequences via inverse folding (ProteinMPNN / ESM-IF1)
- Validation -- Multi-metric screening with early termination
- Round Advancement -- Iterate or finalize based on results
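The epitope selection step can be illustrated with a small sketch. It assumes each surface residue comes with a C-alpha coordinate and a per-residue interface score (the kind of output PeSTo produces), keeps residues above a cutoff, groups them spatially with DBSCAN, and returns the cluster with the highest summed score. The function names, the score cutoff, `eps`, and `min_samples` are all illustrative assumptions, not the pipeline's actual settings, and the DBSCAN here is a minimal pure-Python version written for readability rather than speed.

```python
from math import dist


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


def select_epitope(residues, score_cutoff=0.5, eps=8.0, min_samples=3):
    """Pick the spatial cluster of high-scoring residues with the largest total score.

    residues: list of (residue_number, (x, y, z), interface_score).
    All default thresholds are hypothetical placeholders.
    """
    candidates = [r for r in residues if r[2] >= score_cutoff]
    labels = dbscan([r[1] for r in candidates], eps, min_samples)
    clusters = {}
    for (resid, _, score), label in zip(candidates, labels):
        if label != -1:
            clusters.setdefault(label, []).append((resid, score))
    if not clusters:
        return []
    best = max(clusters.values(), key=lambda members: sum(s for _, s in members))
    return sorted(resid for resid, _ in best)


# Made-up coordinates (in angstroms) and scores, for illustration only.
example_residues = [
    (10, (0.0, 0.0, 0.0), 0.90),
    (11, (3.8, 0.0, 0.0), 0.80),
    (12, (3.8, 3.8, 0.0), 0.70),
    (45, (40.0, 0.0, 0.0), 0.60),
    (46, (43.8, 0.0, 0.0), 0.60),
    (47, (43.8, 3.8, 0.0), 0.55),
    (80, (100.0, 100.0, 100.0), 0.95),  # high score but isolated
    (90, (1.0, 1.0, 1.0), 0.10),        # below the score cutoff
]
```

Calling `select_epitope(example_residues)` returns the residue numbers of the first patch, since it has the larger summed score; the isolated residue 80 is treated as noise despite its high score. Ranking clusters by summed score is one possible choice; size or solvent exposure could be used instead.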
Interaction Model
Phases 0--1.5 (preparation): Run autonomously -- clean the target, select epitopes, determine binder length. No user approval needed.
Before Phase 2 (backbone generation): Present a design plan to the user that summarizes the target info, the selected epitope and hotspot residues, the proposed binder length, the number of backbones, and any concerns (large target, polar epitope, etc.). Wait for user approval before proceeding.
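The approval gate described above might be sketched as follows. The `DesignPlan` class, its field names, and `approved_to_generate` are hypothetical names introduced for illustration; the fields simply mirror the items the plan is supposed to summarize, and the example values are made up.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DesignPlan:
    """Hypothetical container for the items shown to the user before Phase 2."""
    target_info: str
    epitope: str
    hotspot_residues: List[str]
    binder_length: str
    num_backbones: int
    concerns: List[str] = field(default_factory=list)

    def summary(self) -> str:
        # Render the plan as plain text, one item per line.
        lines = [
            f"Target: {self.target_info}",
            f"Epitope: {self.epitope}",
            f"Hotspot residues: {', '.join(self.hotspot_residues)}",
            f"Proposed binder length: {self.binder_length}",
            f"Number of backbones: {self.num_backbones}",
            f"Concerns: {'; '.join(self.concerns) if self.concerns else 'none'}",
        ]
        return "\n".join(lines)


def approved_to_generate(plan: DesignPlan, ask: Callable[[str], str]) -> bool:
    """Show the plan and return True only on an explicit yes.

    `ask` is any function that takes a prompt and returns the user's reply,
    so the gate can be driven by input() or by a chat interface.
    """
    reply = ask(plan.summary() + "\n\nProceed to backbone generation? [y/N] ")
    return reply.strip().lower() in {"y", "yes"}


# Illustrative values only; none of these come from a real run.
example_plan = DesignPlan(
    target_info="example target, chain A",
    epitope="cluster 1",
    hotspot_residues=["A10", "A11", "A12"],
    binder_length="60-80 residues",
    num_backbones=10,
    concerns=["polar epitope"],
)
```

In an interactive session this could be used as `approved_to_generate(example_plan, input)`. Anything other than an explicit "y" or "yes" is treated as a refusal, so backbone generation never starts by default.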