llm-judge
Fail
Audited by Gen Agent Trust Hub on May 9, 2026
Risk Level: HIGHREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [REMOTE_CODE_EXECUTION]: As defined in
references/repo-agent.md, the skill is designed to automatically detect and run the test suites of evaluated repositories usingpytest,npm test,yarn test, orgo test. This pattern results in the execution of code provided by the repository being analyzed. If a repository is malicious, its build or test scripts could execute arbitrary commands on the system. - [COMMAND_EXECUTION]: The main workflow in
SKILL.mdand the supporting reference files execute several shell commands, includingcatfor reading specification files,gitfor repository metadata extraction, andpython3for JSON validation and data aggregation. These commands are executed based on user-provided file paths and repository locations. - [PROMPT_INJECTION]: The skill presents an indirect prompt injection surface. It reads the full content of external specification documents and target repositories, then passes this untrusted data into the prompts for Phase 1 and Phase 2 agents. Without explicit sanitization or instructions to ignore embedded commands, a malicious implementation could attempt to manipulate the judging process or sub-agent behavior.
- Ingestion points: User-provided specification files and the content of repositories under evaluation (SKILL.md, references/repo-agent.md).
- Boundary markers: Data is delimited by bold labels (e.g.,
**Spec Document:**or**Facts from all repos:**), but no explicit instructions are provided to the agents to disregard any conflicting instructions found within that data. - Capability inventory: The skill can read/write files, execute git commands, and run various language-specific test runners (
pytest,npm,go). - Sanitization: No sanitization or validation of the content of the specification or repository files is performed before they are interpolated into prompts.
Recommendations
- AI detected serious security threats
Audit Metadata