web-scraping
Fail
Audited by Gen Agent Trust Hub on May 9, 2026
Risk Level: HIGHPROMPT_INJECTIONEXTERNAL_DOWNLOADSCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: Use of homoglyphs in code structures.
- Evidence: The class name 'TrafilaturaСscraper' and its references in 'SKILL.md' contain a Cyrillic 'С' (U+0421) instead of the Latin 'C' (U+0043). This technique is a common form of obfuscation used to bypass text-based security filters.
- [PROMPT_INJECTION]: Potential for indirect prompt injection via untrusted web content.
- Ingestion points: The skill fetches data from arbitrary external URLs using libraries such as 'trafilatura', 'requests', and 'playwright'.
- Boundary markers: There are no explicit delimiters or instructions provided to the agent to treat the scraped content as untrusted or to ignore embedded instructions.
- Capability inventory: The skill has access to network requests and browser automation tools which could be exploited if malicious content is processed.
- Sanitization: No validation or sanitization of the scraped content is performed before it enters the agent's context.
- [EXTERNAL_DOWNLOADS]: Interaction with external services and platforms.
- Fetches data and media from platforms including YouTube, Instagram, and TikTok using established third-party libraries like 'yt-dlp' and 'instaloader'.
- [COMMAND_EXECUTION]: Execution of browser automation and media extraction tools.
- Launches browser instances via Playwright to handle dynamic content rendering.
- Executes 'yt-dlp' and 'instaloader' functionalities to process media and metadata from social platforms.
Recommendations
- AI detected serious security threats
Audit Metadata