# Apastra Baseline
Establish baselines from evaluation runs. A baseline is a snapshot of a scorecard that represents "known good" — future evaluations compare against it to detect regressions.
## When to Use
Use this skill when you want to:
- Establish the first baseline after running an initial evaluation
- Update the baseline after a prompt improvement has been verified
- Roll back to a prior baseline
## Establishing a Baseline
When asked to establish a baseline (e.g., "set the current results as the baseline for summarize-smoke"):
### Step 1: Locate the Scorecard
Find the most recent run for the target suite in `promptops/runs/`. Look for the latest directory matching `<suite-id>-*` and read its `scorecard.json`.
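The lookup above can be sketched as a small helper. This is an illustration, not part of apastra itself; it assumes run directory names sort chronologically (e.g. a timestamp or incrementing suffix after the suite id):

```python
import json
from pathlib import Path


def latest_scorecard(runs_dir: str, suite_id: str) -> dict:
    """Read the scorecard.json of the most recent run for a suite.

    Illustrative helper (not an apastra API). Assumes directory names
    like <suite-id>-<run-suffix> sort chronologically by name.
    """
    runs = sorted(Path(runs_dir).glob(f"{suite_id}-*"))
    if not runs:
        raise FileNotFoundError(f"no runs found for suite {suite_id!r}")
    scorecard_path = runs[-1] / "scorecard.json"
    return json.loads(scorecard_path.read_text())
```

With a layout like `promptops/runs/summarize-smoke-0002/scorecard.json`, calling `latest_scorecard("promptops/runs", "summarize-smoke")` returns the parsed scorecard of the newest run, ready to be snapshotted as the baseline.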