online-evals

Pass

Audited by Gen Agent Trust Hub on Jun 12, 2026

Risk Level: SAFEDATA_EXFILTRATIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [DATA_EXFILTRATION]: The skill describes procedures for locating API credentials within environment variables and the local Claude desktop configuration file (~/.claude/config.json). These credentials are used exclusively to authenticate requests to the official LaunchDarkly API endpoint (app.launchdarkly.com).
  • [COMMAND_EXECUTION]: Provides implementation details and examples for interacting with the LaunchDarkly REST API using Python's requests library and curl. These operations are restricted to managing AI configurations, variations, and evaluation settings.
  • [EXTERNAL_DOWNLOADS]: References external resources, including official documentation and code examples, hosted on LaunchDarkly's primary domain and GitHub organization repositories.
  • [PROMPT_INJECTION]: As the skill involves using LLMs to evaluate potentially untrusted data (responses generated by other models), it has an inherent surface for indirect prompt injection. However, it does not contain malicious instructions, and the risk is associated with the primary functional use-case of automated evaluation.
  • Ingestion points: Processes user input and model outputs via the SDK run() and evaluate() methods.
  • Boundary markers: No explicit sanitization of model outputs is shown in the provided code examples.
  • Capability inventory: Performs network requests to LaunchDarkly's API via the requests library in SKILL.md.
  • Sanitization: Relies on the user's implementation of the LLM-as-a-judge methodology.
Audit Metadata
Risk Level
SAFE
Analyzed
Jun 12, 2026, 08:05 PM
Security Audit — agent-trust-hub — online-evals