version-ml-data

Warn

Audited by Gen Agent Trust Hub on Mar 18, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, DATA_EXFILTRATION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION]: The skill uses subprocess.run in Python scripts (version_dataset.py and train_with_mlflow.py) to interface with Git and DVC. It also relies on dvc repro, which triggers the execution of shell commands defined in the dvc.yaml configuration file.
  • [DATA_EXFILTRATION]: Configuration examples for SSH remote storage explicitly reference the user's private SSH key at ~/.ssh/id_rsa, so following them involves the agent interacting with a sensitive credential file.
  • [EXTERNAL_DOWNLOADS]: The skill documentation suggests installing the dvc package and its cloud provider extensions (e.g., dvc[s3]) from the Python Package Index. These are established tools for MLOps.
  • [PROMPT_INJECTION]: The skill exposes a surface for indirect prompt injection: it processes external data and parameter files that influence the behavior of subprocess calls.
      • Ingestion points: pd.read_csv in train_with_mlflow.py and parameter loading via dvc.api.params_show().
      • Boundary markers: None present in the provided scripts.
      • Capability inventory: Subprocess execution in version_dataset.py and arbitrary command execution via dvc.yaml stages.
      • Sanitization: No validation or sanitization is performed on input parameters or data paths before they are used in shell commands.
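
The missing sanitization step noted above could look roughly like the following. This is a minimal sketch, not code from the audited skill: DATA_ROOT, validate_data_path, and build_dvc_add_command are hypothetical names, the character allow-list is an assumption about what a reasonable data path looks like, and Path.is_relative_to requires Python 3.9 or later. The idea is to reject suspicious paths and to build the command as an argument list, so that no shell ever interprets the input.

```python
import re
from pathlib import Path

# Hypothetical root directory for datasets; the audited scripts define no such constant.
DATA_ROOT = Path("data")

# Assumed allow-list: letters, digits, underscore, dot, slash, and hyphen only.
SAFE_PATH = re.compile(r"[A-Za-z0-9_./-]+")


def validate_data_path(raw: str, root: Path = DATA_ROOT) -> Path:
    """Return the resolved path if it stays under root, else raise ValueError."""
    if not SAFE_PATH.fullmatch(raw):
        # Rejects empty strings and shell metacharacters such as ';', '|', '$'.
        raise ValueError(f"unsafe characters in path: {raw!r}")
    if raw.startswith("-"):
        # A leading dash could be parsed as a command-line option.
        raise ValueError(f"path looks like an option: {raw!r}")
    root_resolved = root.resolve()
    resolved = (root / raw).resolve()
    if not resolved.is_relative_to(root_resolved):
        # Catches '../' traversal and absolute paths outside the data root.
        raise ValueError(f"path escapes {root_resolved}: {raw!r}")
    return resolved


def build_dvc_add_command(raw: str, root: Path = DATA_ROOT) -> list[str]:
    """Build an argument list for 'dvc add' from a validated path."""
    path = validate_data_path(raw, root)
    # A list of arguments is meant for subprocess.run(cmd, check=True)
    # without shell=True, so the path is never parsed by a shell.
    return ["dvc", "add", str(path)]
```

A caller would pass the returned list to subprocess.run(cmd, check=True). This only addresses paths handed to subprocess calls; commands defined in dvc.yaml stages are still executed by dvc repro and would need separate review.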
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 18, 2026, 07:14 AM