gcp-agent-eval-engine-runner

This skill provides the "engine" for your automated evaluation pipeline. Grounded in evaluation_blog.md, it runs hundreds of parallel requests against a shadow revision and captures the full "Thinking Process" (reasoning trace) for each request.

Usage

Ask Antigravity to:

"Create an evaluation runner script for my agent"
"Implement parallel inference for my golden dataset"
"Capture SSE traces for tool trajectory evaluation"

Engine Pattern

Parallel Inference: Uses asyncio.Semaphore to cap the number of in-flight requests, so the run does not overload the shadow service.
SSE Capture: Connects to the ADK POST /run_sse endpoint to stream intermediate events.
Dataset Enrichment: Appends response and intermediate_events to the input dataset.
Vertex AI Integration: Submits the enriched dataset to the create_evaluation_run API.
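
The first three steps of the pattern can be sketched as follows. This is a minimal illustration under stated assumptions, not the skill's actual boilerplate: `parse_sse_lines`, `run_dataset`, and `fake_call_agent` are hypothetical names, the `prompt` field and the `type`/`text` event fields are an assumed schema, and treating the last streamed event as the final response is a simplification. The real HTTP call to POST /run_sse is replaced by a stand-in coroutine so the sketch runs offline, and the Vertex AI submission step is not shown.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable

Event = dict[str, Any]


def parse_sse_lines(lines: list[str]) -> list[Event]:
    """Collect the JSON payloads carried on the 'data:' lines of an SSE stream."""
    events: list[Event] = []
    for line in lines:
        line = line.strip()
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload:
                events.append(json.loads(payload))
    return events


async def run_dataset(
    rows: list[dict[str, Any]],
    call_agent: Callable[[str], Awaitable[list[Event]]],
    max_concurrency: int = 8,
) -> list[dict[str, Any]]:
    """Run every row through the agent with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(row: dict[str, Any]) -> dict[str, Any]:
        # The semaphore throttles how many agent calls run at the same time.
        async with semaphore:
            events = await call_agent(row["prompt"])
        # Assumption: the last event holds the final answer, the rest are the trace.
        final_text = events[-1].get("text", "") if events else ""
        return {**row, "response": final_text, "intermediate_events": events[:-1]}

    # gather returns results in the same order as the input rows.
    return await asyncio.gather(*(run_one(row) for row in rows))


async def fake_call_agent(prompt: str) -> list[Event]:
    """Hypothetical stand-in for streaming from the shadow service's /run_sse endpoint."""
    await asyncio.sleep(0)
    raw_lines = [
        "data: " + json.dumps({"type": "tool_call", "text": "lookup"}),
        "",
        "data: " + json.dumps({"type": "final", "text": f"echo: {prompt}"}),
    ]
    return parse_sse_lines(raw_lines)


if __name__ == "__main__":
    dataset = [{"prompt": "What is the refund policy?"}, {"prompt": "Track my order"}]
    enriched = asyncio.run(run_dataset(dataset, fake_call_agent, max_concurrency=2))
    for row in enriched:
        print(row["prompt"], "->", row["response"], len(row["intermediate_events"]))
```

In a real runner, `fake_call_agent` would be swapped for a coroutine that opens a streaming POST to the shadow revision and feeds the received lines to the parser. Keeping the transport behind a callable means the throttling and enrichment logic can be tested without touching the service.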
