Managing Deployment Rollbacks

Overview

Implement and execute deployment rollback procedures for Kubernetes, ECS, Lambda, and cloud VM deployments. Detect failed deployments through health checks and error-rate monitoring, then revert to the last known good version, automatically or manually, while minimizing downtime and preserving data integrity.

Prerequisites

  • kubectl configured with cluster access and permission to manage deployments
  • Deployment history retained (Kubernetes revisionHistoryLimit, ECS task definition revisions)
  • Monitoring system tracking error rate, latency, and health check status (Prometheus, Datadog, CloudWatch)
  • Previous deployment artifacts (container images, task definitions) still available in the registry
  • Database migration strategy that supports backward compatibility (expand-contract pattern)

Instructions

  1. Detect deployment failure: monitor error rate, P99 latency, pod restart count, and health check responses for 5-10 minutes post-deploy
  2. Assess rollback scope: determine if the issue is application code, configuration, or infrastructure
  3. For Kubernetes: execute kubectl rollout undo deployment/<name> to revert to the previous revision
  4. For ECS: update the service to use the previous task definition revision
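
The detection and rollback steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the HealthSample and Thresholds classes, every threshold value, and the helper function names are hypothetical and would need tuning against a real baseline. The sketch only builds the kubectl and AWS CLI argument lists; it does not run them. It also assumes the revision immediately before the current ECS task definition is the last known good one and is still registered, which may not hold in every environment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HealthSample:
    """One post-deploy observation (field names are illustrative)."""
    error_rate: float       # fraction of failed requests, 0.0 to 1.0
    p99_latency_ms: float
    pod_restarts: int       # restarts counted since the deploy
    health_check_ok: bool


@dataclass
class Thresholds:
    """Hypothetical limits; tune them against your own baseline."""
    max_error_rate: float = 0.05
    max_p99_latency_ms: float = 1000.0
    max_pod_restarts: int = 3
    min_bad_samples: int = 3  # consecutive breaches required before acting


def breaches(sample: HealthSample, limits: Thresholds) -> bool:
    """True if any single signal in the sample exceeds its limit."""
    return (
        sample.error_rate > limits.max_error_rate
        or sample.p99_latency_ms > limits.max_p99_latency_ms
        or sample.pod_restarts > limits.max_pod_restarts
        or not sample.health_check_ok
    )


def should_roll_back(samples: List[HealthSample], limits: Thresholds) -> bool:
    """True once min_bad_samples consecutive samples breach a limit."""
    streak = 0
    for sample in samples:
        streak = streak + 1 if breaches(sample, limits) else 0
        if streak >= limits.min_bad_samples:
            return True
    return False


def kubectl_undo_command(
    deployment: str,
    namespace: str = "default",
    to_revision: Optional[int] = None,
) -> List[str]:
    """Build the `kubectl rollout undo` argument list for a deployment."""
    cmd = [
        "kubectl", "rollout", "undo",
        f"deployment/{deployment}",
        "--namespace", namespace,
    ]
    if to_revision is not None:
        # Without this flag kubectl reverts to the previous revision.
        cmd.append(f"--to-revision={to_revision}")
    return cmd


def previous_task_definition(current: str) -> str:
    """Turn 'family:N' (or an ARN ending in ':N') into the N-1 revision.

    Assumption: revision N-1 exists and is the last known good version.
    """
    family, _, revision = current.rpartition(":")
    number = int(revision)  # raises ValueError if there is no numeric revision
    if not family or number <= 1:
        raise ValueError(f"no earlier revision for {current!r}")
    return f"{family}:{number - 1}"


def ecs_rollback_command(cluster: str, service: str, current: str) -> List[str]:
    """Build the `aws ecs update-service` argument list for the prior revision."""
    return [
        "aws", "ecs", "update-service",
        "--cluster", cluster,
        "--service", service,
        "--task-definition", previous_task_definition(current),
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) once should_roll_back reports a sustained breach. Requiring several consecutive bad samples is a design choice meant to avoid reverting on a single noisy reading during the 5-10 minute observation window.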
First Seen
Mar 21, 2026