bedrock-agentcore-evaluations

Installation
SKILL.md

Amazon Bedrock AgentCore Evaluations

Overview

AgentCore Evaluations transforms agent testing from "vibes-based" to metric-based quality assurance. Test agents before production, then continuously monitor live interactions using 13 built-in evaluators and custom scoring systems.

Purpose: Ensure AI agents meet quality, safety, and effectiveness standards

Pattern: Task-based (5 operations)

Key Principles (validated by AWS December 2025):

  1. Pre-Production Testing - Validate before deployment
  2. Continuous Monitoring - Sample and score live interactions
  3. 13 Built-in Evaluators - Standard quality dimensions
  4. Custom Evaluators - LLM-as-Judge for domain-specific metrics
  5. Alerting Integration - CloudWatch for proactive monitoring
  6. On-Demand + Continuous - Both testing modes supported
Installs
33
GitHub Stars
11
First Seen
Jan 24, 2026
bedrock-agentcore-evaluations — adaptationio/skrillz