gcp-agent-eval-engine-runner
This skill provides the "engine" for your automated evaluation pipeline. Grounded in evaluation_blog.md, it handles the complexity of running hundreds of parallel requests against a shadow revision while capturing the full "Thinking Process" (Reasoning Trace).
Usage
Ask Antigravity to:
- "Create an evaluation runner script for my agent"
- "Implement parallel inference for my golden dataset"
- "Capture SSE traces for tool trajectory evaluation"
Engine Pattern
- Parallel Inference: Uses `asyncio.Semaphore` to throttle requests (preventing an accidental DDoS of the shadow service).
- SSE Capture: Connects to the ADK `POST /run_sse` endpoint to stream intermediate events.
- Dataset Enrichment: Appends `response` and `intermediate_events` to the input dataset.
- Vertex AI Integration: Submits the enriched dataset to the `create_evaluation_run` API.
Python Boilerplate
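The following is a minimal sketch of the engine pattern above, not the skill's actual boilerplate. It uses only the standard library and shows the first three steps: throttling with `asyncio.Semaphore`, reading the `data:` fields of an SSE stream, and enriching each row with `response` and `intermediate_events`. The names `parse_sse_data`, `run_dataset`, and `fetch_sse_lines` are hypothetical. So is the choice to treat the last streamed event as the response and every earlier event as intermediate. The HTTP call to `POST /run_sse` is left to a caller-supplied coroutine, so the sketch assumes no particular HTTP client or request payload.

```python
import asyncio
import json
from typing import Awaitable, Callable, Iterable

# Hypothetical type: a coroutine that sends one dataset row to the shadow
# revision's POST /run_sse endpoint and returns the raw SSE lines it received.
FetchFn = Callable[[dict], Awaitable[list]]


def parse_sse_data(lines: Iterable[str]) -> list:
    """Decode the JSON payload of each SSE event from its `data:` lines."""
    events, buffer = [], []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            payload = line[5:]
            # The SSE format allows one optional space after the colon.
            if payload.startswith(" "):
                payload = payload[1:]
            buffer.append(payload)
        elif line == "" and buffer:
            # A blank line terminates the current event.
            events.append(json.loads("\n".join(buffer)))
            buffer = []
    if buffer:  # stream ended without a trailing blank line
        events.append(json.loads("\n".join(buffer)))
    return events


async def run_dataset(rows: list, fetch_sse_lines: FetchFn,
                      max_concurrency: int = 8) -> list:
    """Run every row through the agent with at most `max_concurrency` in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(row: dict) -> dict:
        async with semaphore:  # throttle requests to the shadow service
            lines = await fetch_sse_lines(row)
        events = parse_sse_data(lines)
        # Assumption: the final event carries the answer, the rest are the trace.
        response = events[-1] if events else None
        return {**row, "response": response, "intermediate_events": events[:-1]}

    # gather() returns results in the same order as the input rows.
    return await asyncio.gather(*(run_one(row) for row in rows))
```

To use it, pass a coroutine that performs the actual streaming request, for example `asyncio.run(run_dataset(rows, my_fetch, max_concurrency=16))`, where `my_fetch` is your own function. Submitting the enriched rows to `create_evaluation_run` is deliberately omitted here, since its arguments depend on the Vertex AI SDK version in use.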