version-ml-data
Warn
Audited by Gen Agent Trust Hub on Mar 18, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, DATA_EXFILTRATION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: The skill uses `subprocess.run` in Python scripts (`version_dataset.py` and `train_with_mlflow.py`) to interface with Git and DVC. It also relies on `dvc repro`, which triggers the execution of shell commands defined in the `dvc.yaml` configuration file.
- [DATA_EXFILTRATION]: Configuration examples for remote storage (SSH) explicitly reference the user's private SSH key at `~/.ssh/id_rsa`. This involves the agent interacting with sensitive credential files.
- [EXTERNAL_DOWNLOADS]: The skill documentation suggests installing the `dvc` package and its cloud provider extensions (e.g., `dvc[s3]`) from the Python Package Index. These are established tools for MLOps.
- [PROMPT_INJECTION]: The skill demonstrates a surface for indirect prompt injection by processing external data and parameter files that influence the behavior of subprocess calls.
  - Ingestion points: `pd.read_csv` in `train_with_mlflow.py` and parameter loading via `dvc.api.params_show()`.
  - Boundary markers: None present in the provided scripts.
  - Capability inventory: Subprocess execution in `version_dataset.py` and arbitrary command execution via `dvc.yaml` stages.
  - Sanitization: No validation or sanitization is performed on input parameters or data paths before they are used in shell commands.
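The sanitization gap noted above could be narrowed with a check along the following lines. This is a minimal sketch, not the skill's actual code: the helper names (`validate_data_path`, `build_dvc_add_command`, `run_dvc_add`), the character allow-list, and the idea of confining paths to a project root are all assumptions made for illustration. It requires Python 3.9 or later.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical allow-list: letters, digits, dot, underscore, hyphen, slash.
_SAFE_PATH = re.compile(r"[A-Za-z0-9._/-]+")


def validate_data_path(raw: str, root: Path) -> Path:
    """Return `raw` resolved under `root`, or raise ValueError if it looks unsafe."""
    # Reject shell metacharacters, whitespace, and option-like values such as "-f".
    if not _SAFE_PATH.fullmatch(raw) or raw.startswith("-"):
        raise ValueError(f"unsafe characters in path: {raw!r}")
    resolved = (root / raw).resolve()
    # Reject absolute paths and "../" traversal that leave the project root.
    if not resolved.is_relative_to(root.resolve()):
        raise ValueError(f"path escapes project root: {raw!r}")
    return resolved


def build_dvc_add_command(raw: str, root: Path) -> list[str]:
    """Build an argument list for `dvc add`; a list avoids shell interpretation."""
    return ["dvc", "add", str(validate_data_path(raw, root))]


def run_dvc_add(raw: str, root: Path) -> subprocess.CompletedProcess:
    """Run `dvc add` on a validated path without invoking a shell."""
    return subprocess.run(build_dvc_add_command(raw, root), check=True, cwd=root)
```

Passing an argument list to `subprocess.run` (rather than a string with `shell=True`) keeps path values from being interpreted by a shell. It does not address commands declared in `dvc.yaml` stages, which `dvc repro` executes on its own, so those would still need separate review.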
Audit Metadata