evaluating-cosmos-policy
Cosmos Policy Evaluation
Evaluation workflows for NVIDIA Cosmos Policy on LIBERO and RoboCasa simulation environments from the public cosmos-policy repository. Covers blank-machine setup, headless GPU evaluation, and inference profiling.
Quick start
Run a minimal LIBERO evaluation using the official public eval module:
uv run --extra cu128 --group libero --python 3.10 \
python -m cosmos_policy.experiments.robot.libero.run_libero_eval \
--config cosmos_predict2_2b_480p_libero__inference_only \
--ckpt_path nvidia/Cosmos-Policy-LIBERO-Predict2-2B \
--config_file cosmos_policy/config/config.py \
--use_wrist_image True \
--use_proprio True \
--normalize_proprio True \
--unnormalize_actions True \
--dataset_stats_path nvidia/Cosmos-Policy-LIBERO-Predict2-2B/libero_dataset_statistics.json \
More from zechenzhangagi/ai-research-skills
ml-paper-writing
Write publication-ready ML/AI papers for NeurIPS, ICML, ICLR, ACL, AAAI, COLM. Use when drafting papers from research repos, structuring arguments, verifying citations, or preparing camera-ready submissions. For systems venues (OSDI, NSDI, ASPLOS, SOSP), use systems-paper-writing instead.
722crewai-multi-agent
Multi-agent orchestration framework for autonomous AI collaboration. Use when building teams of specialized agents working together on complex tasks, when you need role-based agent collaboration with memory, or for production workflows requiring sequential/hierarchical execution. Built without LangChain dependencies for lean, fast execution.
75qdrant-vector-search
High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or scalable vector storage with Rust-powered performance.
75huggingface-tokenizers
Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track alignments, handle padding/truncation. Integrates seamlessly with transformers. Use when you need high-performance tokenization or custom tokenizer training.
74unsloth
Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization
74optimizing-attention-flash
Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training/running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or need faster inference. Supports PyTorch native SDPA, flash-attn library, H100 FP8, and sliding window attention.
73