ml-paper-writing
Draft publication-ready ML/AI/Systems papers for top conferences with citation verification and LaTeX templates.
- Supports major ML, NLP, and systems venues (NeurIPS, ICML, ICLR, ACL, AAAI, COLM, OSDI, NSDI, ASPLOS, SOSP) with conference-specific templates, page limits, and submission requirements
- Enforces programmatic citation verification via the Semantic Scholar and CrossRef APIs to prevent hallucinated references, marking unverifiable citations as explicit placeholders (see the sketch after this list)
- Provides writing philosophy from leading researchers (Nanda, Farquhar, Lipton, Steinhardt) covering narrative structure, the 7 principles of reader expectations, and sentence-level clarity
- Includes format conversion workflows for resubmitting between venues and complete paper checklists (abstract, introduction, methods, experiments, related work, limitations) with iterative feedback loops
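To make the citation-verification step concrete, here is a minimal Python sketch of the kind of check described above: query the Semantic Scholar Graph API for a title match, fall back to CrossRef, and emit an explicit placeholder when neither source confirms the reference. The function name, matching heuristic, and placeholder format are illustrative assumptions, not the skill's actual implementation.

```python
import requests

# Public endpoints for the two verification sources named above.
S2_SEARCH = "https://api.semanticscholar.org/graph/v1/paper/search"
CROSSREF_WORKS = "https://api.crossref.org/works"

def verify_citation(title: str) -> dict:
    """Check a cited title against Semantic Scholar, then CrossRef.

    Returns a 'verified' record on a match, otherwise an explicit
    placeholder so the reference is never silently fabricated.
    (Illustrative sketch; names and heuristics are assumptions.)
    """
    # 1. Semantic Scholar Graph API: keyword search, inspect the top hit.
    resp = requests.get(
        S2_SEARCH,
        params={"query": title, "fields": "title,year,externalIds", "limit": 1},
        timeout=10,
    )
    if resp.ok:
        hits = resp.json().get("data") or []
        if hits and hits[0]["title"].strip().lower() == title.strip().lower():
            return {"status": "verified", "source": "semantic_scholar", "paper": hits[0]}

    # 2. Fall back to CrossRef bibliographic search.
    resp = requests.get(
        CROSSREF_WORKS,
        params={"query.bibliographic": title, "rows": 1},
        timeout=10,
    )
    if resp.ok:
        items = resp.json().get("message", {}).get("items", [])
        if items:
            # CrossRef returns titles as a list of strings.
            top_title = (items[0].get("title") or [""])[0]
            if top_title.strip().lower() == title.strip().lower():
                return {"status": "verified", "source": "crossref", "doi": items[0].get("DOI")}

    # 3. Neither source confirms the paper: mark it, don't invent it.
    return {"status": "placeholder", "note": f"[UNVERIFIED: {title}]"}
```

A real pipeline would match more loosely (normalized titles, author/year cross-checks) and respect both APIs' rate limits, but the core contract is the same: every reference either verifies against an external index or is flagged as an explicit placeholder.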
ML Paper Writing for Top AI Conferences
Expert-level guidance for writing publication-ready papers targeting NeurIPS, ICML, ICLR, ACL, AAAI, COLM. This skill combines writing philosophy from top researchers (Nanda, Farquhar, Karpathy, Lipton, Steinhardt) with practical tools: LaTeX templates, citation verification APIs, and conference checklists.
For systems venues (OSDI, NSDI, ASPLOS, SOSP), use the systems-paper-writing skill, which provides paragraph-level structural blueprints, writing patterns, venue-specific checklists, and LaTeX templates for systems conferences.
Core Philosophy: Collaborative Writing
Paper writing is collaborative, but Claude should be proactive in delivering drafts.
The typical workflow starts with a research repository containing code, results, and experimental artifacts. Claude's role is to:
- Understand the project by exploring the repo, results, and existing documentation
- Deliver a complete first draft when confident about the contribution
- Search literature using web search and APIs to find relevant citations
- Refine through feedback cycles when the scientist provides input
- Ask for clarification only when genuinely uncertain about key decisions
Key Principle: Be proactive. If the repo and results are clear, deliver a full draft. Don't block waiting for feedback on every section—scientists are busy. Produce something concrete they can react to, then iterate based on their response.
More from zechenzhangagi/ai-research-skills
qdrant-vector-search
High-performance vector similarity search engine for RAG and semantic search. Use when building production RAG systems requiring fast nearest neighbor search, hybrid search with filtering, or scalable vector storage with Rust-powered performance.
crewai-multi-agent
Multi-agent orchestration framework for autonomous AI collaboration. Use when building teams of specialized agents working together on complex tasks, when you need role-based agent collaboration with memory, or for production workflows requiring sequential/hierarchical execution. Built without LangChain dependencies for lean, fast execution.
huggingface-tokenizers
Fast tokenizers optimized for research and production. Rust-based implementation tokenizes 1GB in <20 seconds. Supports BPE, WordPiece, and Unigram algorithms. Train custom vocabularies, track alignments, handle padding/truncation. Integrates seamlessly with transformers. Use when you need high-performance tokenization or custom tokenizer training.
unsloth
Expert guidance for fast fine-tuning with Unsloth: 2-5x faster training, 50-80% less memory, and LoRA/QLoRA optimization.
nemo-guardrails
NVIDIA's runtime safety framework for LLM applications. Features jailbreak detection, input/output validation, fact-checking, hallucination detection, PII filtering, toxicity detection. Uses Colang 2.0 DSL for programmable rails. Production-ready, runs on T4 GPU.
optimizing-attention-flash
Optimizes transformer attention with Flash Attention for 2-4x speedup and 10-20x memory reduction. Use when training/running transformers with long sequences (>512 tokens), encountering GPU memory issues with attention, or need faster inference. Supports PyTorch native SDPA, flash-attn library, H100 FP8, and sliding window attention.