The Agent Skills Directory

[COMMAND_EXECUTION]: The skill uses the Bash tool to install necessary Python dependencies (requests, beautifulsoup4, scrapy, playwright, trafilatura) and to set up Playwright browsers using npx playwright install chromium. These operations are part of the standard setup for web scraping but represent a high-capability execution path.
[EXTERNAL_DOWNLOADS]: The skill fetches HTML content from user-provided URLs and communicates with the well-known openrouter.ai service for entity extraction tasks. It also downloads browser components from official registries.
[PROMPT_INJECTION]: The skill exhibits an attack surface for indirect prompt injection (Category 8) inherent to web scraping tools.
Ingestion points: Untrusted data from external websites is ingested during Stage 2 and Stage 3 of the pipeline (defined in SKILL.md).
Boundary markers: The LLM prompt template in Stage 5 (extract_entities_llm) interpolates the untrusted {text_sample} directly into the instructions without utilizing explicit delimiters or security boundaries to isolate the data from the prompt.
Capability inventory: The skill has access to powerful tools including Bash, Write, and Read, which could potentially be targeted if an attacker-controlled website successfully injects instructions into the scraped content.
Sanitization: While the skill performs basic text normalization for encoding and whitespace, it does not implement semantic filtering or sanitization to prevent instruction injection from the ingested web content.

web-scraper