agent-eval

Pass

Audited by Gen Agent Trust Hub on Jun 13, 2026

Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes a local shell script located at scripts/agent-eval/audit.sh. This script is used to perform automated testing and takes several arguments including the tool version, repository URL, and test question. It also manages the local installation of the CodeGraph tool during the benchmarking process.
  • [EXTERNAL_DOWNLOADS]: The skill clones source code from public GitHub repositories (such as VS Code, Terraform, and Flask) to the local /tmp/codegraph-corpus directory to provide the datasets for quality auditing. These repositories are well-known open-source projects.
  • [PROMPT_INJECTION]: The skill has an indirect prompt injection surface as it ingests and processes untrusted data from external repositories.
  • Ingestion points: Target repositories are cloned from GitHub into the local environment and subsequently analyzed by the agent.
  • Boundary markers: No explicit delimiters or boundary markers are defined in the instructions to separate repository content from the agent's instructions.
  • Capability inventory: The skill allows for repository cloning, shell script execution (audit.sh), and background job processing.
  • Sanitization: No sanitization or content filtering is performed on the files read from the cloned repositories.
Audit Metadata
Risk Level
SAFE
Analyzed
Jun 13, 2026, 09:51 AM
Security Audit — agent-trust-hub — agent-eval