arxiv
Fail
Audited by Gen Agent Trust Hub on May 1, 2026
Risk Level: HIGHEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: The skill performs network requests to
export.arxiv.organdapi.semanticscholar.orgto retrieve paper metadata, abstracts, and citation counts. These are established academic services provided by reputable institutions. - [COMMAND_EXECUTION]: The skill relies on shell command execution for its primary functionality, specifically using
curlto interact with APIs andpython3for data processing. This includes both inline scripts provided inSKILL.mdand the standalonescripts/search_arxiv.pyscript. - [REMOTE_CODE_EXECUTION]: Automated scanners flagged several patterns where
curloutput is piped topython3. Technical review shows these are not typical RCE vulnerabilities where a remote server controls the executed code; rather, the skill usespython3 -corpython3 -m json.toolto execute hardcoded parsing logic defined locally within the skill. While the use of pipes into interpreters is a high-sensitivity pattern, the risk is mitigated by the trusted nature of the data sources and the local definition of the execution logic. - [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it ingests untrusted metadata (such as paper titles and abstracts) from external sources. Malicious actors could potentially publish papers with embedded instructions designed to influence the agent's behavior when the metadata is processed.
- Ingestion points: Metadata parsed from arXiv and Semantic Scholar API responses in
SKILL.mdandscripts/search_arxiv.py. - Boundary markers: No specific boundary markers or 'ignore' instructions are used to isolate untrusted external content.
- Capability inventory: The agent has access to shell commands (
curl,python3) and document processing tools (web_extract). - Sanitization: The metadata is parsed into structured formats (XML/JSON), but the resulting string content is not sanitized for potential prompt injection patterns before being presented to the agent.
Recommendations
- HIGH: Downloads and executes remote code from: https://export.arxiv.org/api/query?id_list=1706.03762, https://export.arxiv.org/api/query?search_query=all:GRPO+reinforcement+learning&max_results=5&sortBy=submittedDate&sortOrder=descending, https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300?fields=title,authors,citationCount,referenceCount,influentialCitationCount,year,abstract - DO NOT USE without thorough review
Audit Metadata