bio-protein-clustering-pangenome
Bio Protein Clustering Pangenome
Cluster proteins into orthogroups and derive pangenome matrices.
Instructions
- Cluster proteins with MMseqs2 or ProteinOrtho.
- Build a presence/absence matrix of orthogroups across genomes.
- Compute core/accessory/cloud/singleton partitions.
- Identify single-copy orthologs for phylogenetic analysis.
- Discriminate paralogs from orthologs in multi-copy gene families.
- Calculate pangenome statistics (completeness, orthogroup occupancy).
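The matrix, partition, and single-copy steps above can be sketched in plain Python. This is a minimal illustration, not the skill's actual implementation. It assumes cluster assignments arrive as (representative, member) pairs, the two-column layout of the `_cluster.tsv` file that MMseqs2 `easy-cluster` writes, and that protein IDs look like `genome|protein` (a hypothetical naming convention). The partition thresholds are also assumptions: core means present in every genome, singleton means present in exactly one, and cloud means present in fewer than 15% of genomes. Adjust them to whatever definitions the project uses.

```python
from collections import defaultdict


def build_counts(pairs, genome_of=lambda pid: pid.split("|", 1)[0]):
    """Count gene copies per genome for each orthogroup.

    pairs: iterable of (representative, member) protein IDs.
    genome_of: maps a protein ID to its genome (assumed 'genome|protein' IDs).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for rep, member in pairs:
        counts[rep][genome_of(member)] += 1
    return counts


def presence_absence(counts, genomes):
    """Return {orthogroup: [0/1 per genome]} in the order given by genomes."""
    return {og: [1 if g in per else 0 for g in genomes] for og, per in counts.items()}


def partition(counts, genomes, cloud_max=0.15):
    """Label each orthogroup by occupancy (thresholds are assumptions)."""
    n = len(genomes)
    labels = {}
    for og, per in counts.items():
        k = len(per)  # number of genomes carrying this orthogroup
        if k == n:
            labels[og] = "core"
        elif k == 1:
            labels[og] = "singleton"
        elif k / n < cloud_max:
            labels[og] = "cloud"
        else:
            labels[og] = "accessory"
    return labels


def single_copy_orthogroups(counts, genomes):
    """Orthogroups with exactly one copy in every genome."""
    return sorted(
        og for og, per in counts.items()
        if len(per) == len(genomes) and all(c == 1 for c in per.values())
    )


# Toy input: four genomes (g1..g4), four orthogroups.
pairs = [
    ("g1|a", "g1|a"), ("g1|a", "g2|a"), ("g1|a", "g3|a"), ("g1|a", "g4|a"),
    ("g1|b", "g1|b"), ("g1|b", "g1|b2"), ("g1|b", "g2|b"),
    ("g1|b", "g3|b"), ("g1|b", "g4|b"),
    ("g2|c", "g2|c"), ("g2|c", "g3|c"),
    ("g4|d", "g4|d"),
]

counts = build_counts(pairs)
genomes = sorted({g for per in counts.values() for g in per})
matrix = presence_absence(counts, genomes)
labels = partition(counts, genomes)
single_copy = single_copy_orthogroups(counts, genomes)
```

In the toy data, orthogroup `g1|b` is in every genome but has two copies in `g1`, so it is labelled core yet excluded from the single-copy set. Copy counts alone only flag such multi-copy families; separating paralogs from orthologs within them needs additional evidence such as gene trees or synteny.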
Quick Reference
| Task | Action |
|---|---|
| Run workflow | Follow the steps in this skill and capture outputs. |
| Validate inputs | Confirm required inputs and reference data exist. |
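The "Validate inputs" row can be made concrete with a small pre-flight check. The sketch below assumes the inputs are per-genome protein FASTA files; the function name `check_fasta_inputs` is hypothetical and not part of this skill or of any library.

```python
from pathlib import Path


def check_fasta_inputs(paths):
    """Return a list of human-readable problems; an empty list means all inputs look usable."""
    problems = []
    for p in map(Path, paths):
        if not p.is_file():
            problems.append(f"{p}: missing")
            continue
        with p.open() as handle:
            first = handle.readline()
        # An empty file yields "" here and is reported as well.
        if not first.startswith(">"):
            problems.append(f"{p}: does not start with a FASTA header")
    return problems
```

Running this before clustering and stopping when the returned list is non-empty keeps a missing or truncated file from silently producing a smaller pangenome.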