ml-systems-engineer-rl-engineering


Machine Learning Systems Engineer, RL Engineering

When to Use

  • Design RL training platform — controllers, workers, resource scheduling
  • Implement rollout collection — vectorized envs, async actors, trajectory buffers
  • Operate distributed training — data parallel, parameter servers, gradient sync patterns
  • Manage replay buffers — prioritization, storage, sampling at scale
  • Wire checkpointing — policy/value nets, optimizer state, resume after preemption
  • Integrate experiment tracking — seeds, configs, metric schemas, artifact lineage
  • Connect simulators — Gymnasium-style APIs, custom env servers, batch stepping
  • Export policies for batch evaluation or a downstream inference path
  • Debug training instability — NaNs, reward scale, worker desync, straggler GPUs
  • Plan GPU/memory layout for actor vs learner processes
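
As an illustration of the checkpointing item above (resume after preemption), here is a minimal sketch using only the Python standard library. It is not taken from the skill itself. The names `save_checkpoint`, `load_checkpoint`, and `train` are hypothetical, and the single float `params` is a stand-in for policy/value weights and optimizer state. The sketch assumes the checkpoint sits on a local filesystem where a rename within one directory is atomic.

```python
import os
import pickle
import random
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so a preemption mid-write cannot corrupt the file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())  # force bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path):
    """Return the saved state, or None when no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def train(path, total_steps, checkpoint_every=10, seed=0):
    """Toy training loop (hypothetical): resumes from path if a checkpoint exists."""
    state = load_checkpoint(path)
    rng = random.Random(seed)
    if state is None:
        state = {"step": 0, "params": 0.0, "seed": seed}
    else:
        # Restore the RNG so the resumed run matches an uninterrupted one.
        rng.setstate(state["rng_state"])
    while state["step"] < total_steps:
        state["params"] += rng.uniform(-1.0, 1.0)  # stand-in for a gradient step
        state["step"] += 1
        if state["step"] % checkpoint_every == 0 or state["step"] == total_steps:
            state["rng_state"] = rng.getstate()
            save_checkpoint(path, state)
    return state
```

Saving the RNG state next to the step counter and parameters is what lets a run that stops at a checkpoint boundary and restarts reproduce the same result as one that never stopped. A real learner would store framework-specific weight and optimizer state in place of `params`.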

When NOT to Use

Installs: 18
GitHub Stars: 2
First Seen: May 20, 2026
ml-systems-engineer-rl-engineering — daemon-blockint-tech/agentic-enteprises-skill