web-scraper
Pass
Audited by Gen Agent Trust Hub on Jun 18, 2026
Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill uses the
Bashtool to install necessary Python dependencies (requests,beautifulsoup4,scrapy,playwright,trafilatura) and to set up Playwright browsers usingnpx playwright install chromium. These operations are part of the standard setup for web scraping but represent a high-capability execution path. - [EXTERNAL_DOWNLOADS]: The skill fetches HTML content from user-provided URLs and communicates with the well-known
openrouter.aiservice for entity extraction tasks. It also downloads browser components from official registries. - [PROMPT_INJECTION]: The skill exhibits an attack surface for indirect prompt injection (Category 8) inherent to web scraping tools.
- Ingestion points: Untrusted data from external websites is ingested during Stage 2 and Stage 3 of the pipeline (defined in
SKILL.md). - Boundary markers: The LLM prompt template in Stage 5 (
extract_entities_llm) interpolates the untrusted{text_sample}directly into the instructions without utilizing explicit delimiters or security boundaries to isolate the data from the prompt. - Capability inventory: The skill has access to powerful tools including
Bash,Write, andRead, which could potentially be targeted if an attacker-controlled website successfully injects instructions into the scraped content. - Sanitization: While the skill performs basic text normalization for encoding and whitespace, it does not implement semantic filtering or sanitization to prevent instruction injection from the ingested web content.
Audit Metadata