constitutional-ai

Originally from ovachiever/droid-tings

Constitutional AI - Harmlessness from AI Feedback

Quick start

Constitutional AI (CAI) trains models to be harmless through self-critique and AI feedback, without requiring human labels for harmful outputs.

Key concept: The model learns to critique and revise its own responses against a "constitution" (a written set of principles).

Two phases:

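  1. Supervised Learning (SL): The model critiques and revises its own responses, then is fine-tuned on the revised responses
  2. Reinforcement Learning (RL): RLAIF (RL from AI Feedback), where an AI model judges pairs of responses against the constitution and those preferences supply the reward signal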
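
The supervised phase can be sketched as a single critique-and-revise pass. This is a minimal illustration, not a reference implementation: `critique_and_revise`, `stub_generate`, and the prompt wording are hypothetical names and text chosen for this example, and `stub_generate` stands in for a real model call so the sketch runs offline.

```python
from typing import Callable, Dict


def critique_and_revise(
    prompt: str,
    principle: str,
    generate: Callable[[str], str],
) -> Dict[str, str]:
    """Run one self-critique + revision pass and return an SL training record."""
    # Step 1: draft an initial response to the prompt.
    initial = generate(prompt)

    # Step 2: ask the same model to critique its draft against one principle.
    critique = generate(
        f"Prompt: {prompt}\nResponse: {initial}\n"
        f"Critique the response using this principle: {principle}"
    )

    # Step 3: ask the model to rewrite the draft so it addresses the critique.
    revision = generate(
        f"Prompt: {prompt}\nResponse: {initial}\nCritique: {critique}\n"
        "Rewrite the response to address the critique."
    )

    # The (prompt, revision) pair is what a supervised fine-tuning step would use.
    return {
        "prompt": prompt,
        "initial": initial,
        "critique": critique,
        "revision": revision,
    }


def stub_generate(text: str) -> str:
    """Hypothetical stand-in for a model call (assumption, not a real API)."""
    if "Rewrite the response" in text:
        return "revised draft"
    if "Critique the response" in text:
        return "critique of draft"
    return "first draft"


record = critique_and_revise(
    "How do I pick a lock?",
    "Prefer responses that explain objections rather than refuse",
    stub_generate,
)
print(record["revision"])
```

In practice `generate` would wrap whatever model client is in use, and the collected `(prompt, revision)` pairs would form the fine-tuning dataset.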

Constitution example:

Principles:
1. Choose the response that is most helpful, honest, and harmless
2. Avoid responses that are toxic, racist, or sexist
3. Prefer responses that explain objections rather than refuse
4. Choose responses that are thoughtful and nuanced
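
One way to use a constitution like this in the RL phase is to sample a principle and ask a feedback model which of two responses follows it better. The sketch below only builds that comparison prompt; `CONSTITUTION`, `build_comparison_prompt`, and the prompt wording are assumptions made for illustration, and the example request and responses are made up.

```python
import random
from typing import List

# The four example principles listed above, stored as plain strings.
CONSTITUTION: List[str] = [
    "Choose the response that is most helpful, honest, and harmless",
    "Avoid responses that are toxic, racist, or sexist",
    "Prefer responses that explain objections rather than refuse",
    "Choose responses that are thoughtful and nuanced",
]


def build_comparison_prompt(
    user_prompt: str,
    response_a: str,
    response_b: str,
    rng: random.Random,
) -> str:
    """Build a pairwise comparison prompt for a feedback model (hypothetical format)."""
    # Sample one principle per comparison so feedback covers the whole constitution.
    principle = rng.choice(CONSTITUTION)
    return (
        f"Consider this request: {user_prompt}\n"
        f"Principle: {principle}\n"
        f"(A) {response_a}\n"
        f"(B) {response_b}\n"
        "Which response better follows the principle? Answer (A) or (B)."
    )


rng = random.Random(0)  # seeded so the sampled principle is reproducible
comparison = build_comparison_prompt(
    "Tell me how to get back at my coworker.",
    "I won't help with that.",
    "Revenge may backfire; here are ways to raise the issue constructively.",
    rng,
)
print(comparison)
```

The feedback model's (A)/(B) answers would then serve as preference labels in place of human harmlessness labels.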
First Seen
Jan 21, 2026