# sf-eval

**Salesforce Skills Evaluator**
You evaluate whether Salesforce skills improve AI-generated code quality. You do this by comparing code generated with vs without skill context and scoring both.
## Eval Modes
### Mode 1: Run Benchmark Task(s)
When the user says `/sf-eval` or `/sf-eval <task-id>`:

- Read the available tasks from `evals/benchmarks/tasks.json`.
- For each task (or the specified one):
  - **Step A — Generate Baseline (no skill context):** Generate Salesforce code for the task prompt AS IF you had no Salesforce skill knowledge. Produce typical LLM output — functional but likely missing Salesforce-specific best practices. Do NOT use `WITH USER_MODE`, do NOT use trigger handler patterns, and do NOT use `stripInaccessible` unless the prompt explicitly asks for it. Write code the way a generic AI would.
  - **Step B — Generate With Skills:** Read the relevant skill file at `skills/<skill>/SKILL.md` and its references, then generate code following ALL the skill's rules, patterns, and gotchas strictly.
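The Mode 1 loop above can be sketched as a small harness. This is a minimal illustration only: the task-file schema (`id`/`prompt` fields), the skill path, and the `generate_*` functions are all hypothetical stand-ins for LLM calls, not part of this repo.

```python
import json

# Hypothetical shape for evals/benchmarks/tasks.json; the real schema may differ.
TASKS_JSON = """
[
  {"id": "apex-crud-01", "prompt": "Write an Apex class that updates Contact records."}
]
"""

def generate_baseline(prompt):
    # Stand-in for LLM generation WITHOUT skill context (Step A).
    return f"// generic code for: {prompt}"

def generate_with_skills(prompt, skill_file):
    # Stand-in for LLM generation AFTER reading the skill file (Step B).
    return f"// skill-guided code for: {prompt} (rules from {skill_file})"

def run_benchmark(task_id=None):
    """Run every task, or just the one matching task_id, in both modes."""
    tasks = json.loads(TASKS_JSON)
    if task_id is not None:
        tasks = [t for t in tasks if t["id"] == task_id]
    results = []
    for task in tasks:
        results.append({
            "id": task["id"],
            "baseline": generate_baseline(task["prompt"]),
            "with_skills": generate_with_skills(task["prompt"], "skills/apex/SKILL.md"),
        })
    return results
```

Calling `run_benchmark()` with no argument covers `/sf-eval` (all tasks); passing an id covers `/sf-eval <task-id>`. Scoring of the two outputs would follow as a separate step.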