exploring-llm-traces
Pass
Audited by Gen Agent Trust Hub on May 13, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill is a legitimate tool for LLM observability provided by PostHog. It facilitates the inspection of agent behavior through structured trace data.
- [PROMPT_INJECTION]: The skill exposes a surface for indirect prompt injection because it processes untrusted trace data (user and assistant messages).
- Ingestion points: Untrusted data enters the agent context via the posthog:query-llm-trace and posthog:query-llm-traces-list tools, which retrieve historical conversation data.
- Boundary markers: Analysis scripts (scripts/*.py) add labels such as [USER] or [ASSISTANT] to indicate message origins but do not include explicit 'ignore instructions' delimiters for the agent.
- Capability inventory: The agent can run local Python scripts and execute SQL queries against the PostHog backend.
- Sanitization: The provided Python scripts use standard JSON parsing and output truncation. They do not explicitly sanitize content for prompt injection sequences, but the analytical use case and local execution of the scripts minimize the risk of the agent autonomously following malicious instructions embedded in the traces.
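The boundary-marker and sanitization findings above can be illustrated with a minimal sketch of one possible hardening step. This is not the skill's actual code: the function names, the `<<<BEGIN_UNTRUSTED>>>` / `<<<END_UNTRUSTED>>>` markers, the 200-character limit, and the `role`/`content` message shape are all assumptions made for illustration. The sketch keeps the role label, truncates the content, and wraps it in explicit delimiters so the agent can tell quoted trace data from instructions.

```python
import json

# Hypothetical values chosen for this sketch, not taken from the skill.
BEGIN = "<<<BEGIN_UNTRUSTED>>>"
END = "<<<END_UNTRUSTED>>>"
MAX_CHARS = 200


def wrap_untrusted(role: str, content: str, limit: int = MAX_CHARS) -> str:
    """Label, truncate, and fence one trace message as untrusted data."""
    # Remove any copy of the closing marker so the content cannot end the fence early.
    text = content.replace(END, "[removed marker]")
    # Truncate long content, mirroring the output truncation the audit describes.
    if len(text) > limit:
        text = text[:limit] + "...[truncated]"
    return f"[{role.upper()}] {BEGIN}\n{text}\n{END}"


def render_trace(raw_json: str) -> str:
    """Parse a JSON list of {"role", "content"} messages and fence each one."""
    messages = json.loads(raw_json)
    return "\n".join(wrap_untrusted(m["role"], m["content"]) for m in messages)
```

Delimiters of this kind only mark the boundary. They are useful when the agent's instructions also state that text between the markers is data to analyze, never commands to follow.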
Audit Metadata