skills/fikriaf/agentos/arxiv/Gen Agent Trust Hub

arxiv

Fail

Audited by Gen Agent Trust Hub on May 1, 2026

Risk Level: HIGHEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill performs network requests to export.arxiv.org and api.semanticscholar.org to retrieve paper metadata, abstracts, and citation counts. These are established academic services provided by reputable institutions.
  • [COMMAND_EXECUTION]: The skill relies on shell command execution for its primary functionality, specifically using curl to interact with APIs and python3 for data processing. This includes both inline scripts provided in SKILL.md and the standalone scripts/search_arxiv.py script.
  • [REMOTE_CODE_EXECUTION]: Automated scanners flagged several patterns where curl output is piped to python3. Technical review shows these are not typical RCE vulnerabilities where a remote server controls the executed code; rather, the skill uses python3 -c or python3 -m json.tool to execute hardcoded parsing logic defined locally within the skill. While the use of pipes into interpreters is a high-sensitivity pattern, the risk is mitigated by the trusted nature of the data sources and the local definition of the execution logic.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it ingests untrusted metadata (such as paper titles and abstracts) from external sources. Malicious actors could potentially publish papers with embedded instructions designed to influence the agent's behavior when the metadata is processed.
  • Ingestion points: Metadata parsed from arXiv and Semantic Scholar API responses in SKILL.md and scripts/search_arxiv.py.
  • Boundary markers: No specific boundary markers or 'ignore' instructions are used to isolate untrusted external content.
  • Capability inventory: The agent has access to shell commands (curl, python3) and document processing tools (web_extract).
  • Sanitization: The metadata is parsed into structured formats (XML/JSON), but the resulting string content is not sanitized for potential prompt injection patterns before being presented to the agent.
Recommendations
  • HIGH: Downloads and executes remote code from: https://export.arxiv.org/api/query?id_list=1706.03762, https://export.arxiv.org/api/query?search_query=all:GRPO+reinforcement+learning&max_results=5&sortBy=submittedDate&sortOrder=descending, https://api.semanticscholar.org/graph/v1/paper/arXiv:2402.03300?fields=title,authors,citationCount,referenceCount,influentialCitationCount,year,abstract - DO NOT USE without thorough review
Audit Metadata
Risk Level
HIGH
Analyzed
May 1, 2026, 09:02 AM