agent-eval
Pass
Audited by Gen Agent Trust Hub on Jun 13, 2026
Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill executes a local shell script located at
scripts/agent-eval/audit.sh. This script is used to perform automated testing and takes several arguments including the tool version, repository URL, and test question. It also manages the local installation of the CodeGraph tool during the benchmarking process. - [EXTERNAL_DOWNLOADS]: The skill clones source code from public GitHub repositories (such as VS Code, Terraform, and Flask) to the local
/tmp/codegraph-corpusdirectory to provide the datasets for quality auditing. These repositories are well-known open-source projects. - [PROMPT_INJECTION]: The skill has an indirect prompt injection surface as it ingests and processes untrusted data from external repositories.
- Ingestion points: Target repositories are cloned from GitHub into the local environment and subsequently analyzed by the agent.
- Boundary markers: No explicit delimiters or boundary markers are defined in the instructions to separate repository content from the agent's instructions.
- Capability inventory: The skill allows for repository cloning, shell script execution (
audit.sh), and background job processing. - Sanitization: No sanitization or content filtering is performed on the files read from the cloned repositories.
Audit Metadata