grading-backend
Pass
Audited by Gen Agent Trust Hub on Apr 15, 2026
Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill executes local shell scripts (
probes/probe-robustness.shandprobes/probe-performance.sh) and standard utilities likecurlandgrep. These tools are used to test the functionality, robustness, and performance of a backend service hosted locally (typically at http://localhost:8080). This execution occurs within the context of analyzing a development project. - [EXTERNAL_DOWNLOADS]: The skill utilizes
npxto download and run well-known developer tools such asautocannon(for load testing) andajv-cli(for JSON schema validation). These downloads target the official NPM registry, which is a well-known service and thus considered safe for this use case. - [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it reads and processes untrusted source code and documentation from the projects being graded.
- Ingestion points: Project source files (e.g., .ts, .py, .go), specification documentation, and tool outputs (e.g., curl responses, test logs) are read into the agent's context from the file system and local network.
- Boundary markers: The skill uses rigid markdown report templates and JSON schemas to structure the output, but it lacks explicit boundary delimiters or 'ignore' instructions for the third-party code being analyzed.
- Capability inventory: The agent has the capability to execute subprocesses via the provided shell scripts, perform file read/write operations, and probe local network services.
- Sanitization: No specific sanitization or filtering of external content is performed; the skill relies on the agent's internal reasoning and regex-based searching (grep) to process data.
Audit Metadata