web-archive-scraper
Pass
Audited by Gen Agent Trust Hub on Mar 31, 2026
Risk Level: SAFEPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
- [PROMPT_INJECTION]: The skill processes untrusted data from the Internet Archive which presents an indirect prompt injection surface.
- Ingestion points: The
fetch_archived_contentfunction inscripts/search_archive.pyretrieves raw HTML from external URLs via the Internet Archive. - Boundary markers: The skill does not implement explicit delimiters or instructions to the agent to ignore embedded instructions when presenting the scraped content.
- Capability inventory: The script uses
requests.getto fetch remote content andrefor text processing. - Sanitization: The
extract_textfunction inscripts/search_archive.pyuses regular expressions to strip script, style, and HTML tags, providing basic text extraction. - [EXTERNAL_DOWNLOADS]: The skill requires the
requestsPython library for network communication. - [DATA_EXFILTRATION]: Performs network requests to
web.archive.org, a well-known service, to retrieve archived website metadata and content.
Audit Metadata