inference-server
Pass
Audited by Gen Agent Trust Hub on Apr 24, 2026
Risk Level: SAFE
COMMAND_EXECUTION
Full Analysis
- Command Execution: The skill uses the `uv run inference` command to start the server and manage SLURM jobs. This is standard practice for executing Python entrypoints in a controlled development environment.
- Local Network Operations: The instructions include using `curl` to test endpoints on `localhost:8000`. This is a routine procedure for verifying that a locally hosted service is responding correctly.
- System Infrastructure Integration: The skill provides templates and commands for SLURM scheduling, which is common in high-performance computing environments for managing large-scale inference tasks.
- Dynamic Server Management: The server exposes custom endpoints such as `/update_weights` and `/load_lora_adapter`. These are functional features designed to allow hot-reloading of model components during active development or reinforcement learning workflows.
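The local verification described above might look like the following sketch. The `/health` endpoint and the JSON payload shapes are assumptions for illustration; the audit only names the `/update_weights` and `/load_lora_adapter` endpoints, not their request formats:

```shell
# Basic liveness check against the locally hosted server
# (assumes a conventional /health route; not documented in the audit).
curl http://localhost:8000/health

# Hypothetical payload: hot-reload model weights from a checkpoint path.
curl -X POST http://localhost:8000/update_weights \
  -H "Content-Type: application/json" \
  -d '{"checkpoint_path": "/path/to/checkpoint"}'

# Hypothetical payload: load a LoRA adapter during an RL workflow.
curl -X POST http://localhost:8000/load_lora_adapter \
  -H "Content-Type: application/json" \
  -d '{"lora_name": "my_adapter", "lora_path": "/path/to/adapter"}'
```

These commands only touch `localhost`, which is consistent with the audit's finding that the skill performs no external network operations.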
Audit Metadata