loki-mode
Fail
Audited by Gen Agent Trust Hub on May 27, 2026
Risk Level: HIGH
PROMPT_INJECTION, COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION, DATA_EXFILTRATION, CREDENTIALS_UNSAFE
Full Analysis
- [PROMPT_INJECTION]: The skill contains explicit instructions to override the agent's default safety behavior and confirmation gates. Specifically, the 'Core Autonomy Rules' in SKILL.md instruct the agent to 'NEVER ask questions', 'NEVER wait for confirmation', and 'NEVER stop voluntarily'. It also references a 'Ralph Wiggum Mode' intended for perpetual autonomous operation.
- [COMMAND_EXECUTION]: The skill requires the user to launch the agent with the `--dangerously-skip-permissions` flag. This bypasses the platform's primary security control, allowing the agent to execute any shell command (via the Bash tool) without user approval. While `autonomy/run.sh` contains a `BLOCKED_COMMANDS` list, such filters are easily bypassed by an LLM.
- [EXTERNAL_DOWNLOADS]: The benchmark and setup scripts perform several external operations:
  - Fetches the HumanEval dataset from OpenAI's GitHub repository (`github.com/openai/human-eval`).
  - Downloads the SWE-bench dataset from HuggingFace (`princeton-nlp/SWE-bench_Lite`).
  - Installs multiple NPM and PyPI packages during the autonomous build process.
- [REMOTE_CODE_EXECUTION]: The skill is designed to autonomously download, install, and execute third-party dependencies and generated code. In `benchmarks/run-benchmarks.sh`, it uses `pip install` to install `swebench` and `datasets` at runtime. The core RARV cycle involves generating code and immediately running it via shell commands.
- [DATA_EXFILTRATION]: The skill is designed to handle sensitive cloud provider credentials (AWS, GCP, Azure) for its deployment phase. Combined with its autonomous network capabilities (`curl`, `fetch`) and the instruction to skip user review, this creates a high risk of credential harvesting or data exfiltration if the agent is compromised via indirect injection.
- [DYNAMIC_EXECUTION]: Shipped code in `benchmarks/results/humaneval-loki-solutions/160.py` and other result files contains `eval()` calls used to process dynamically generated strings. Additionally, `autonomy/run.sh` uses a complex inline Python script to parse and execute logic based on the agent's real-time JSON stream.
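To illustrate why the COMMAND_EXECUTION finding treats a command blocklist as weak protection, here is a minimal sketch of a substring-based filter. The blocklist entries and the `is_blocked` helper are hypothetical; the actual contents and matching logic of `BLOCKED_COMMANDS` in `autonomy/run.sh` are not reproduced in this audit and may differ. Nothing in the sketch is executed as a shell command.

```python
# Hypothetical blocklist; the real BLOCKED_COMMANDS entries are not shown in the audit.
BLOCKED_COMMANDS = ["rm -rf /", "mkfs", "dd if="]


def is_blocked(command: str) -> bool:
    """Naive filter: flag a command if any blocked pattern appears as a substring."""
    return any(pattern in command for pattern in BLOCKED_COMMANDS)


# The literal form is caught.
direct = "rm -rf /"

# Forms a shell would treat as equivalent to the blocked command, but which
# contain none of the blocked substrings literally.
variants = [
    "rm -r -f /",        # flags split apart
    "rm  -rf  /",        # extra whitespace
    'r""m -rf /',        # empty quotes inside the command name
    "X=rm; $X -rf /",    # indirection through a variable
]

if __name__ == "__main__":
    print(direct, "->", is_blocked(direct))
    for v in variants:
        print(v, "->", is_blocked(v))
```

Under these assumptions the direct form is flagged while every variant passes the filter, which is the kind of rewording a model can produce without being asked to evade anything.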
Recommendations
- AI detected serious security threats
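For the DYNAMIC_EXECUTION finding, one possible mitigation is to replace `eval()` with a restricted evaluator. The sketch below assumes the dynamically generated strings are plain arithmetic expressions; the audit does not state what the flagged `eval()` calls actually process, so this is an illustration, not a drop-in fix. The `safe_arith` name and the operator whitelist are hypothetical choices.

```python
import ast
import operator

# Whitelisted operators (an illustrative choice; exponentiation is left out
# on purpose because it allows very expensive computations).
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_arith(expr: str):
    """Evaluate a numeric arithmetic expression; reject any other syntax."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Only int and float literals are accepted (bool and str are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed syntax: {type(node).__name__}")

    return walk(ast.parse(expr, mode="eval"))


if __name__ == "__main__":
    print(safe_arith("2 + 3 * 4 - 5"))
    try:
        safe_arith("__import__('os').system('id')")
    except ValueError as exc:
        print("rejected:", exc)
```

Unlike `eval()`, this walks the parsed syntax tree and raises `ValueError` on function calls, attribute access, names, and anything else outside the whitelist, so an injected string cannot reach the interpreter.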
Audit Metadata