Auto-Moderate What Users Post

Guide the user through building AI content moderation: classify user-generated content, score its severity, and route each item to a decision (auto-approve, human-review, or auto-reject). The pattern: classify, score, route.
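The classify-score-route pattern can be sketched as follows. This is a minimal illustration, not the skill's implementation: the thresholds, `Decision` type, and routing function are hypothetical, and in a real pipeline the category and severity would come from an LM call rather than being passed in directly.

```python
from dataclasses import dataclass

# Hypothetical thresholds -- tune these against your policy and
# your tolerance for false positives.
APPROVE_BELOW = 0.3
REJECT_ABOVE = 0.8

@dataclass
class Decision:
    category: str   # e.g. "spam", "harassment"
    severity: float  # 0.0 (benign) .. 1.0 (severe), from the scoring step
    route: str       # "auto-approve" | "human-review" | "auto-reject"

def route(category: str, severity: float) -> Decision:
    """Route an already-classified, already-scored item."""
    if severity < APPROVE_BELOW:
        action = "auto-approve"
    elif severity > REJECT_ABOVE:
        action = "auto-reject"
    else:
        # The uncertain middle band goes to humans.
        action = "human-review"
    return Decision(category, severity, action)
```

Note the asymmetry: only clearly benign content is auto-approved and only clearly severe content is auto-rejected; everything ambiguous lands in the human-review queue.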

When NOT to use AI moderation

  • Low-volume content — if a human can review everything in under an hour per day, skip AI. The complexity of maintaining a moderation pipeline is not worth it.
  • Exact-match violations only — if your policy is just a blocklist of words or regex patterns (SSNs, emails, phone numbers), use pattern matching directly. No LM needed.
  • Legal-grade decisions — AI moderation is a first pass, not a legal ruling. If a wrong moderation decision has legal consequences (DMCA takedowns, defamation claims), always route to human review.
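For the exact-match case above, plain pattern matching is enough. A sketch, with illustrative (not production-grade) regexes for US-style SSNs, emails, and phone numbers:

```python
import re

# Simple pattern checks -- no LM needed for exact-match violations.
# These patterns are illustrative; harden them for real use.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def find_violations(text: str) -> list[str]:
    """Return the names of all patterns that match the text."""
    return [name for name, pat in PATTERNS.items() if pat.search(text)]
```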

Consider /ai-sorting instead if you just need classification without severity scoring or routing logic.

Step 1: Define your moderation policy

Ask the user:

  1. What content do you need to catch? (hate speech, spam, NSFW, harassment, self-harm, illegal activity, PII)
  2. What are the severity levels and their actions? (warn, remove, ban)
  3. What is the tolerance for false positives? (over-moderating frustrates users)
  4. Is human review in the loop? (auto-only vs. auto + human escalation)