exploring-llm-evaluations

Pass

Audited by Gen Agent Trust Hub on May 5, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [SAFE]: The skill utilizes a set of PostHog-specific tools (prefixed with posthog:) to perform administrative and analytical tasks. These operations, including HogQL execution via posthog:execute-sql and Hog DSL execution via posthog:evaluation-test-hog, are standard functionalities for the PostHog platform and do not involve unauthorized access or external exfiltration.
  • [PROMPT_INJECTION]: The skill contains a surface for indirect prompt injection as it is designed to ingest and analyze untrusted output from LLM generations through the llm-analytics-evaluation-summary-create tool. 1. Ingestion points: Untrusted data from $ai_generation events is processed when generating AI-powered summaries of evaluation patterns. 2. Boundary markers: None identified in the instructions. 3. Capability inventory: The skill possesses read capabilities via posthog:execute-sql and write/delete capabilities via the evaluation-* tool suite. 4. Sanitization: No specific sanitization or filtering of the generation content is described before it is processed by the summary tool. This risk is inherent to the skill's purpose and is managed by the platform environment.
Audit Metadata
Risk Level
SAFE
Analyzed
May 5, 2026, 03:33 PM