production-observability

Production Observability

Overview

Use this skill to make observability a required part of production system design, implementation, review, and incident hardening. Prefer an actionable observability contract over vague advice: identify which telemetry shows where work currently is, what failed, who is affected, what is safe to show users, what operators can inspect, and whether a fix worked.
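
One way to keep the contract actionable is to treat it as a checklist with one entry per question above. The sketch below is illustrative only: the `ObservabilityContract` class, its field names, and the example event names are assumptions made for this example, not part of the skill.

```python
from dataclasses import dataclass, fields


@dataclass
class ObservabilityContract:
    """Hypothetical checklist: one field per question the contract must answer."""

    where_work_is: str = ""        # telemetry showing where work currently is
    what_failed: str = ""          # signal that identifies the failure
    who_is_affected: str = ""      # how affected users or tenants are identified
    safe_to_show_users: str = ""   # status detail that may be exposed to users
    operator_inspection: str = ""  # what operators can inspect
    fix_verification: str = ""     # signal confirming that a fix worked

    def unanswered(self) -> list[str]:
        """Return the names of questions that still have no answer."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Example: a partially filled contract (event names are made up).
contract = ObservabilityContract(
    where_work_is="job_state_changed event per step, keyed by job_id",
    what_failed="step_failed event with error_class",
)
print(contract.unanswered())
```

Any name returned by `unanswered()` marks a gap to call out before the design or change is considered complete.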

Operating Principle

Do not treat telemetry as an after-the-fact patch for surprises. For any multi-step production workflow, first make the workflow observable enough that a future failure is diagnosable without guessing.
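
As a minimal sketch of what "observable enough" could mean for a multi-step workflow, the example below emits one structured event per step transition, all sharing a workflow ID, so a failure can be located without guessing. The `run_workflow` helper, the event fields, and the step names are hypothetical; a real system would likely use its existing logging or tracing stack and re-raise the error after recording it.

```python
import json
import logging
import uuid

logger = logging.getLogger("workflow")


def run_workflow(steps, workflow_id=None):
    """Run (name, callable) steps in order, emitting one event per transition."""
    workflow_id = workflow_id or uuid.uuid4().hex
    events = []

    def emit(step, status, **extra):
        # Every event carries the workflow ID so all steps can be correlated.
        event = {"workflow_id": workflow_id, "step": step, "status": status, **extra}
        events.append(event)
        logger.info(json.dumps(event))

    for name, fn in steps:
        emit(name, "started")
        try:
            fn()
        except Exception as exc:
            # Record which step failed and the error class, then stop the run.
            emit(name, "failed", error_class=type(exc).__name__)
            return events
        emit(name, "succeeded")
    return events


def charge_card():
    # Simulated failure for the example.
    raise TimeoutError("payment gateway timed out")


events = run_workflow(
    [
        ("validate_order", lambda: None),
        ("charge_card", charge_card),
        ("send_receipt", lambda: None),
    ],
    workflow_id="wf-123",
)
print(events[-1])
```

The last event names the failing step and error class, and the absence of any `send_receipt` event shows that the step never started.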

If the user is asking for a code change, design review, debugging help, or incident response and observability is missing, explicitly call out the gap before or alongside the implementation advice. Make reasonable assumptions rather than blocking on questions unless a missing detail materially changes safety, cost, or correctness.

Workflow

  1. Choose the operating mode.
    • New feature or system design: produce an observability contract before implementation details.
    • Implementation or PR review: identify missing instrumentation, risky gaps, and minimal changes required before merge.
    • Incident/debugging: identify the fastest diagnostic path with current telemetry, then specify what must be added to prevent repeat ambiguity.
    • Postmortem/hardening: turn detection gaps into concrete telemetry, alerts, runbooks, and prevention tasks.
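
The four modes above can be summarized as a lookup from mode to first deliverable. This is only a restatement of the list in code form; the `Mode` enum and `first_deliverable` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class Mode(Enum):
    DESIGN = "new feature or system design"
    REVIEW = "implementation or PR review"
    INCIDENT = "incident/debugging"
    HARDENING = "postmortem/hardening"


# First deliverable per mode, paraphrased from the list above.
FIRST_DELIVERABLE = {
    Mode.DESIGN: "observability contract, before implementation details",
    Mode.REVIEW: "missing instrumentation, risky gaps, and minimal pre-merge changes",
    Mode.INCIDENT: "fastest diagnostic path with current telemetry, then telemetry to add",
    Mode.HARDENING: "telemetry, alerts, runbooks, and prevention tasks for each detection gap",
}


def first_deliverable(mode: Mode) -> str:
    """Return what to produce first for the chosen operating mode."""
    return FIRST_DELIVERABLE[mode]


print(first_deliverable(Mode.INCIDENT))
```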