bdistill-validate
Pass
Audited by Gen Agent Trust Hub on Apr 24, 2026
Risk Level: SAFE
Full Analysis
- [COMMAND_EXECUTION]: The skill utilizes a local Python script (
scripts/validate_engine.py) to extract numbers and compute stability scores. The agent is instructed to run this script using standard command-line invocations. The script code is provided within the skill and uses standard libraries for mathematical and text processing without any dangerous system calls. - [PROMPT_INJECTION]: The instructions direct the agent to 'disregard previous answers' and 'flush context' between tests. These patterns, which might typically suggest bypassing safety guardrails, are employed here as a necessary technical procedure to ensure that consistency checks are independent and not influenced by previous context (avoiding 'parrotting').
- [DATA_EXFILTRATION]: The skill reads from local Knowledge Base files (
data/knowledge/base/) and writes results to local JSON files (data/consistency/). There are no network requests, API calls, or transmissions of data to external servers found in the instructions or the accompanying script. - [PROMPT_INJECTION]: The skill exhibits an attack surface for indirect prompt injection as it processes external data from Knowledge Base files which are then used in command arguments and model prompts.
- Ingestion points: Processes
.jsonlfiles from the local filesystem. - Boundary markers: No specific delimiters or 'ignore' instructions are used when interpolating KB claims into prompts.
- Capability inventory: Capability includes local file read/write and script execution.
- Sanitization: The Python script sanitizes domain names used in file paths, though it relies on the execution environment's shell handling for the sanitization of claim text passed as CLI arguments. This surface is considered low risk given the local context and intended use.
Audit Metadata