evaluator
Pass
Audited by Gen Agent Trust Hub on Mar 29, 2026
Risk Level: SAFE
COMMAND_EXECUTION
Full Analysis
- [COMMAND_EXECUTION]: The skill utilizes local shell commands to manage development servers and perform browser-based verification.
- Evidence: Executes startup commands such as `npm run dev`, `yarn dev`, `pnpm dev`, and `bun dev` to ensure the target application is reachable.
- Evidence: Uses `curl` to perform health checks on the local service at `http://localhost:3000`.
- Evidence: Employs `playwright-cli` to perform automated browser actions including `open`, `snapshot`, `screenshot`, `click`, `fill`, and `console` logging.
- [PROMPT_INJECTION]: The skill processes a 'Sprint Contract' that defines the test plan and interaction parameters, which represents a surface for indirect instructions.
- Ingestion points: Data is ingested from the 'Sprint Contract' block described in `SKILL.md` to dictate navigation and form interactions.
- Boundary markers: Absent; the skill does not use specific delimiters to isolate the ingested contract instructions from the agent's core protocol.
- Capability inventory: High-privilege browser automation (clicks, form fills, navigation) and shell command execution.
- Sanitization: Input values from the contract are used directly for automation without explicit validation or escaping within the skill's logic.
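
The boundary-marker and sanitization findings above can be illustrated with a minimal sketch. It is not taken from the skill itself: the step schema (`action`, `url`, `value`), the delimiter strings, the host allowlist, and the length limit are all hypothetical choices made for illustration. Only the action names and the `localhost` target come from the audit.

```python
import re
from urllib.parse import urlparse

# Action names listed in the audit evidence for playwright-cli.
ALLOWED_ACTIONS = {"open", "snapshot", "screenshot", "click", "fill", "console"}
# Assumption: the evaluator should only ever target the local dev server.
ALLOWED_HOSTS = {"localhost", "127.0.0.1"}
# Assumption: an arbitrary cap on form-fill values.
MAX_VALUE_LEN = 200
# Hypothetical delimiters used to isolate contract text from agent instructions.
BEGIN_MARK = "<<<SPRINT_CONTRACT_BEGIN>>>"
END_MARK = "<<<SPRINT_CONTRACT_END>>>"

CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def validate_step(step: dict) -> dict:
    """Reject contract steps that fall outside the expected automation surface."""
    action = step.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if action == "open":
        url = step.get("url")
        if not isinstance(url, str):
            raise ValueError("open step requires a string url")
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or parsed.hostname not in ALLOWED_HOSTS:
            raise ValueError(f"navigation target not allowed: {url!r}")
    if action == "fill":
        value = step.get("value")
        if not isinstance(value, str) or len(value) > MAX_VALUE_LEN:
            raise ValueError("fill value must be a short string")
        if CONTROL_CHARS.search(value):
            raise ValueError("fill value contains control characters")
    return step


def wrap_contract(text: str) -> str:
    """Wrap contract text in explicit boundary markers, refusing embedded markers."""
    if BEGIN_MARK in text or END_MARK in text:
        raise ValueError("contract text contains a boundary marker")
    return f"{BEGIN_MARK}\n{text}\n{END_MARK}"
```

Under these assumptions, a step such as `{"action": "open", "url": "http://localhost:3000"}` passes validation, while a navigation to any other host or a fill value containing a newline raises `ValueError`. An allowlist is used rather than a blocklist so that unknown actions and hosts are rejected by default.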
Audit Metadata