audit-ml-pipeline

Warn

Audited by Gen Agent Trust Hub on Jun 13, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION)
Full Analysis
  • [COMMAND_EXECUTION]: The skill relies on executing a bundled script, scripts/run_cells.py, which uses an in-process IPython shell to run arbitrary code generated by the agent. This execution pattern is triggered during the ML pipeline audit phase.
  • [REMOTE_CODE_EXECUTION]: The skill implements a dynamic execution pattern where the agent generates Python audit files (audit/*.py) from templates and then executes them at runtime. Although the agent is instructed to follow a 'read-only' contract, there is no technical enforcement within the runner to prevent the execution of malicious or unintended commands if the agent's logic is influenced by external inputs.
  • [INDIRECT_PROMPT_INJECTION]: The skill's primary function is to read and summarize 'skore reports'. The agent then consumes the output of the audit scripts (the markdown digest) to update journals or decide next steps. If an external report contains malicious text crafted to resemble valid output or instructions, it could influence the agent's later behavior through this feedback loop. The risk is rated low severity because the skill uses boundary markers and structured output formatting to reduce the chance that the agent obeys embedded instructions.
  • [DYNAMIC_EXECUTION]: The run_cells.py script utilizes IPython.core.interactiveshell.InteractiveShell.run_cell to execute strings as Python code. This facilitates a notebook-like execution environment within the agent's workflow, which is a powerful but sensitive capability that requires careful isolation.
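The findings above note that the 'read-only' contract has no technical enforcement in the runner. As an illustration only, the sketch below shows one shape such a gate could take: a static check applied to a generated audit file before it is passed to the runner. It is not part of the audited skill. The names find_violations, BLOCKED_MODULES, and BLOCKED_CALLS are hypothetical, and the denylist contents are assumptions chosen for the example. A denylist of this kind is easy to bypass, so it would complement process-level isolation rather than replace it.

```python
import ast

# Hypothetical denylists: the audited skill defines nothing like this.
BLOCKED_MODULES = {"subprocess", "socket", "shutil"}
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__",
                 "system", "remove", "unlink", "rmtree"}


def find_violations(source: str) -> list:
    """Return findings for constructs that would break a read-only contract."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        # IPython shell escapes such as "!cmd" are not valid plain Python,
        # so they are rejected here instead of being passed to the shell.
        return [f"line {err.lineno}: not plain Python (possible magic or shell escape)"]

    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    findings.append(f"line {node.lineno}: import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                findings.append(f"line {node.lineno}: from {node.module} import ...")
        elif isinstance(node, ast.Call):
            func = node.func
            # Plain names (eval) and attribute calls (os.system) are both checked.
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name in BLOCKED_CALLS:
                findings.append(f"line {node.lineno}: call to {name}()")
    return findings


if __name__ == "__main__":
    sample = "import subprocess\nsubprocess.run(['ls'])\n"
    for finding in find_violations(sample):
        print(finding)
```

A runner could refuse to execute any cell for which find_violations returns a non-empty list. Matching on names alone also flags harmless calls (any method named remove, for example), so a real gate would need tuning and would still not stop a determined bypass.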
Audit Metadata
  • Risk Level: MEDIUM
  • Analyzed: Jun 13, 2026, 03:18 PM