sre-runbooks

Purpose

This skill provides standardized SRE runbooks for managing incidents and performing system maintenance in DevOps environments. It draws on predefined procedures so that responses to events such as outages or upgrades stay consistent.

When to Use

Use this skill for quick reference during active incidents, for routine maintenance tasks, or when onboarding new SRE team members. It applies to production environments dealing with downtime, scaling issues, or compliance checks, where following an established procedure matters.

Key Capabilities

  • Retrieve runbooks by ID, type, or keyword (e.g., "outage" or "database").
  • Execute automated steps from runbooks, such as running scripts or API calls.
  • Generate custom checklists based on runbook templates for specific environments.
  • Integrate with monitoring tools to trigger runbooks on alerts.
  • Support versioning of runbooks for tracking changes over time.
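
The retrieval capability above can be illustrated with a minimal sketch. The skill's actual data model and lookup functions are not documented here, so the `Runbook` fields and the `find_runbooks` helper below are hypothetical names chosen for illustration, and the catalog is a plain in-memory list rather than the real runbook store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runbook:
    """Hypothetical runbook record; field names are assumptions."""
    runbook_id: str
    kind: str          # e.g. "incident" or "maintenance"
    title: str
    version: int = 1   # stands in for the versioning capability


def find_runbooks(runbooks, runbook_id=None, kind=None, keyword=None):
    """Return runbooks matching every filter that was supplied.

    Filters left as None are ignored. The keyword match is a
    case-insensitive substring test against the title.
    """
    matches = []
    for rb in runbooks:
        if runbook_id is not None and rb.runbook_id != runbook_id:
            continue
        if kind is not None and rb.kind != kind:
            continue
        if keyword is not None and keyword.lower() not in rb.title.lower():
            continue
        matches.append(rb)
    return matches
```

Called as `find_runbooks(catalog, keyword="outage")`, the helper returns every entry whose title mentions "outage"; combining `kind` and `keyword` narrows the result to entries that satisfy both.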

Usage Patterns

Invoke this skill through the sre-cli tool or through API calls from your AI agent code. Set the SRE_API_KEY environment variable before use; it is required for authentication. On the CLI, run an authentication check before each command. In code, import the skill as a module and call its functions with the required parameters. A typical pattern is to check that the runbook exists first, then execute it inside a try-catch block so that failures are handled explicitly.
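
The check-then-execute pattern can be sketched as follows. The skill's module name and function signatures are not given in this document, so `InMemoryRunbookClient`, `RunbookError`, `exists`, `execute`, and `run_safely` are all hypothetical stand-ins; only the `SRE_API_KEY` variable comes from the text above.

```python
import os


class RunbookError(Exception):
    """Hypothetical error raised when a runbook step fails."""


class InMemoryRunbookClient:
    """Stand-in for the real skill client, which is not documented here.

    Runbooks are modeled as lists of zero-argument callables (steps).
    """

    def __init__(self, api_key, runbooks):
        # Mirror the documented requirement that a key is set before use.
        if not api_key:
            raise ValueError("SRE_API_KEY must be set before use")
        self._runbooks = runbooks

    def exists(self, runbook_id):
        return runbook_id in self._runbooks

    def execute(self, runbook_id):
        results = []
        for step in self._runbooks[runbook_id]:
            try:
                results.append(step())
            except Exception as exc:
                raise RunbookError(f"{runbook_id}: step failed: {exc}") from exc
        return results


def run_safely(client, runbook_id):
    """Check that the runbook exists, then execute it inside try/except."""
    if not client.exists(runbook_id):
        return {"status": "not_found", "results": []}
    try:
        return {"status": "ok", "results": client.execute(runbook_id)}
    except RunbookError as exc:
        return {"status": "failed", "error": str(exc), "results": []}


def client_from_env(runbooks):
    """Build a client using the key read from the SRE_API_KEY variable."""
    return InMemoryRunbookClient(os.environ.get("SRE_API_KEY", ""), runbooks)
```

Returning a status dictionary instead of letting the exception propagate is one design choice among several; during an active incident it lets the calling agent decide whether to retry, escalate, or fall back to a manual checklist.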

Installs: 24
GitHub Stars: 5
First Seen: Mar 7, 2026