The Agent Skills Directory

[PROMPT_INJECTION]: Use of homoglyphs in code structures.
Evidence: The class name 'TrafilaturaСscraper' and its references in 'SKILL.md' contain a Cyrillic 'С' (U+0421) instead of the Latin 'C' (U+0043). This technique is a common form of obfuscation used to bypass text-based security filters.
[PROMPT_INJECTION]: Potential for indirect prompt injection via untrusted web content.
Ingestion points: The skill fetches data from arbitrary external URLs using libraries such as 'trafilatura', 'requests', and 'playwright'.
Boundary markers: There are no explicit delimiters or instructions provided to the agent to treat the scraped content as untrusted or to ignore embedded instructions.
Capability inventory: The skill has access to network requests and browser automation tools which could be exploited if malicious content is processed.
Sanitization: No validation or sanitization of the scraped content is performed before it enters the agent's context.
[EXTERNAL_DOWNLOADS]: Interaction with external services and platforms.
Fetches data and media from platforms including YouTube, Instagram, and TikTok using established third-party libraries like 'yt-dlp' and 'instaloader'.
[COMMAND_EXECUTION]: Execution of browser automation and media extraction tools.
Launches browser instances via Playwright to handle dynamic content rendering.
Executes 'yt-dlp' and 'instaloader' functionalities to process media and metadata from social platforms.

web-scraping