finetuning

Pass

Audited by Gen Agent Trust Hub on Jun 18, 2026

Risk Level: SAFE
COMMAND_EXECUTION
Full Analysis
  • Command Execution: The skill utilizes the Azure CLI (az) via the subprocess.run function in deploy_model.py to manage model deployments and retrieve access tokens. This is a standard and necessary pattern for automating Azure resource management within a development workflow.
  • Dynamic Code Execution: The calibrate_grader.py script employs exec and compile to load and execute local Python files provided as custom grading functions. This allows users to implement specialized evaluation logic for their reinforcement fine-tuning tasks. The script includes a clear security recommendation to only use trusted grader files.
  • Data Processing Surface: The skill processes user-provided datasets through scripts like evaluate_model.py and convert_dataset.py, which constitutes an indirect prompt injection surface.
      • Ingestion points: Training and test data are loaded from local JSONL files.
      • Boundary markers: The skill relies on structured JSONL schemas to define message roles (system, user, assistant).
      • Capability inventory: The environment supports subprocess execution for Azure CLI commands and local Python execution for custom graders.
      • Sanitization: The skill provides dedicated validation scripts (validate_sft.py, validate_dpo.py, validate_rft.py) to verify that input data matches the required structural formats before processing.
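
To illustrate the kind of structural check the sanitization item describes, here is a minimal sketch of a per-line JSONL validator. It is not the code in validate_sft.py, which this audit does not reproduce. The function name validate_sft_line, the top-level "messages" key, and the requirement that "content" be a string are assumptions made for illustration; only the three roles (system, user, assistant) come from the audit text.

```python
import json

# Roles named in the audit; everything else here is an illustrative assumption.
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_sft_line(line: str) -> list[str]:
    """Return a list of problems found in one JSONL record (empty if it looks valid)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    # Assumed schema: a non-empty list of messages under the "messages" key.
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"messages[{i}] is not an object")
            continue
        role = msg.get("role")
        if not isinstance(role, str) or role not in ALLOWED_ROLES:
            problems.append(f"messages[{i}] has unknown role {role!r}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"messages[{i}] content is not a string")
    return problems
```

A check like this confirms structure only. It does not inspect what the message text says, so the indirect prompt injection surface noted above remains even for records that pass.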
Audit Metadata
Risk Level: SAFE
Analyzed: Jun 18, 2026, 06:40 PM