together-evaluations
Pass
Audited by Gen Agent Trust Hub on Mar 30, 2026
Risk Level: SAFE
Full Analysis
- [DATA_EXFILTRATION]: The skill transmits dataset content and user-provided external API tokens to Together AI's API endpoints (api.together.xyz) and user-specified external base URLs. This behavior is documented and essential for the skill's primary purpose of evaluating model responses, including those from external providers.
- [EXTERNAL_DOWNLOADS]: The provided scripts include functions to download evaluation result files in JSONL format from the Together AI service for local analysis.
- [COMMAND_EXECUTION]: The Python and TypeScript scripts facilitate the execution of evaluation jobs and automated polling for status updates via the Together AI platform.
- [PROMPT_INJECTION]: The skill processes user-supplied datasets and interpolates data into prompts using Jinja2 templates (e.g.,
{{prompt}}). While this presents a surface for indirect prompt injection, it is a standard and intended feature for an evaluation framework where dataset content is evaluated by a judge model.
Audit Metadata