sre-engineer
SRE practices for defining SLOs, managing error budgets, automating toil, and building resilient production systems.
- Defines quantitative SLOs with SLI measurements, calculates error budgets, and enforces burn-rate policies to balance reliability with feature velocity
- Provides golden signal monitoring (latency, traffic, errors, saturation) with multiwindow burn-rate alerting rules and PromQL query templates
- Includes automation patterns for toil reduction, chaos engineering test design, and incident response runbooks with blameless postmortem guidance
- Delivers concrete Prometheus configurations, Python remediation scripts, and capacity planning workflows ready for production deployment
SRE Engineer
Core Workflow
- Assess reliability - Review architecture, SLOs, incidents, toil levels
- Define SLOs - Identify meaningful SLIs and set appropriate targets
- Verify alignment - Confirm SLO targets reflect user expectations before proceeding
- Implement monitoring - Build golden signal dashboards and alerting
- Automate toil - Identify repetitive tasks and build automation
- Test resilience - Design and execute chaos experiments; verify recovery meets RTO/RPO targets before marking the experiment complete; validate recovery behavior end-to-end
Reference Guide
Load detailed guidance based on context:
| Topic | Reference | Load When |
|---|---|---|
| SLO/SLI | references/slo-sli-management.md |
Defining SLOs, calculating error budgets |
| Error Budgets | references/error-budget-policy.md |
Managing budgets, burn rates, policies |
More from jeffallan/claude-skills
laravel-specialist
Build and configure Laravel 10+ applications, including creating Eloquent models and relationships, implementing Sanctum authentication, configuring Horizon queues, designing RESTful APIs with API resources, and building reactive interfaces with Livewire. Use when creating Laravel models, setting up queue workers, implementing Sanctum auth flows, building Livewire components, optimising Eloquent queries, or writing Pest/PHPUnit tests for Laravel features.
13.0Kgolang-pro
Implements concurrent Go patterns using goroutines and channels, designs and builds microservices with gRPC or REST, optimizes Go application performance with pprof, and enforces idiomatic Go with generics, interfaces, and robust error handling. Use when building Go applications requiring concurrent programming, microservices architecture, or high-performance systems. Invoke for goroutines, channels, Go generics, gRPC integration, CLI tools, benchmarks, or table-driven testing.
12.1Kflutter-expert
Use when building cross-platform applications with Flutter 3+ and Dart. Invoke for widget development, Riverpod/Bloc state management, GoRouter navigation, platform-specific implementations, performance optimization.
10.6Kkubernetes-specialist
Use when deploying or managing Kubernetes workloads. Invoke to create deployment manifests, configure pod security policies, set up service accounts, define network isolation rules, debug pod crashes, analyze resource limits, inspect container logs, or right-size workloads. Use for Helm charts, RBAC policies, NetworkPolicies, storage configuration, performance optimization, GitOps pipelines, and multi-cluster management.
9.1Kphp-pro
Use when building PHP applications with modern PHP 8.3+ features, Laravel, or Symfony frameworks. Invokes strict typing, PHPStan level 9, async patterns with Swoole, and PSR standards. Creates controllers, configures middleware, generates migrations, writes PHPUnit/Pest tests, defines typed DTOs and value objects, sets up dependency injection, and scaffolds REST/GraphQL APIs. Use when working with Eloquent, Doctrine, Composer, Psalm, ReactPHP, or any PHP API development.
8.9Kspring-boot-engineer
Generates Spring Boot 3.x configurations, creates REST controllers, implements Spring Security 6 authentication flows, sets up Spring Data JPA repositories, and configures reactive WebFlux endpoints. Use when building Spring Boot 3.x applications, microservices, or reactive Java applications; invoke for Spring Data JPA, Spring Security 6, WebFlux, Spring Cloud integration, Java REST API design, or Microservices Java architecture.
5.6K