ml-research-engineer-safeguards

ML / Research Engineer, Safeguards

When to Use

  • Define research questions on harm detection, jailbreak resistance, or policy categories
  • Curate or audit safety datasets — labeling guidelines, bias checks, version control
  • Train or fine-tune classifiers, rankers, or small LLM judges for moderation
  • Design benchmarks and eval suites — golden sets, adversarial slices, regression harnesses
  • Run ablations — architecture, threshold, data mix, ensemble vs single model
  • Analyze metrics — precision/recall, calibration, false positive/negative slices
  • Write research memos — methods, results, limitations, production recommendation
  • Specify the promotion bar for a new safeguard model version
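
As a rough illustration of the metric-analysis and promotion-bar items above, the sketch below computes precision, recall, and false positive rate at a fixed threshold, then checks a candidate model against a baseline. Everything in it is an assumption made for illustration: the names `SliceMetrics`, `slice_metrics`, and `meets_promotion_bar`, the convention that label 1 means harmful, and the default bar values (minimum recall 0.90, at most 0.01 absolute regression in false positive rate) do not come from the skill itself.

```python
from dataclasses import dataclass


@dataclass
class SliceMetrics:
    """Hypothetical container for metrics on one evaluation slice."""
    precision: float
    recall: float
    false_positive_rate: float


def slice_metrics(labels, scores, threshold):
    """Compute metrics for binary labels (assumed: 1 = harmful, 0 = benign).

    An example is flagged when its score is at or above the threshold.
    """
    tp = fp = fn = tn = 0
    for label, score in zip(labels, scores):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 1:
            fn += 1
        else:
            tn += 1
    # Guard against empty denominators on very small slices.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return SliceMetrics(precision, recall, fpr)


def meets_promotion_bar(candidate, baseline,
                        min_recall=0.90, max_fpr_regression=0.01):
    """Illustrative bar: enough recall, and no large FPR regression.

    Both default values are placeholders, not recommended settings.
    """
    recall_ok = candidate.recall >= min_recall
    fpr_ok = (candidate.false_positive_rate
              <= baseline.false_positive_rate + max_fpr_regression)
    return recall_ok and fpr_ok


if __name__ == "__main__":
    # Toy data, invented for this example only.
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    baseline_scores = [0.9, 0.8, 0.4, 0.6, 0.4, 0.3, 0.2, 0.1]
    candidate_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.55]

    baseline = slice_metrics(labels, baseline_scores, threshold=0.5)
    candidate = slice_metrics(labels, candidate_scores, threshold=0.5)
    print(baseline, candidate, meets_promotion_bar(candidate, baseline))
```

In this toy data the candidate catches every harmful example but flags one benign example that the baseline did not, so it fails the illustrative bar. In practice the same check would typically be repeated per slice (for example, per policy category or adversarial subset) rather than on a single pooled set.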

When NOT to Use

Installs: 19
GitHub Stars: 2
First Seen: May 20, 2026
ml-research-engineer-safeguards — daemon-blockint-tech/agentic-enteprises-skill