guidance
Guidance: Constrained LLM Generation
When to Use This Skill
Use Guidance when you need to:
- Control LLM output syntax with regex or grammars
- Guarantee valid JSON/XML/code generation
- Reduce latency vs traditional prompting approaches
- Enforce structured formats (dates, emails, IDs, etc.)
- Build multi-step workflows with Pythonic control flow
- Prevent invalid outputs through grammatical constraints
GitHub Stars: 18,000+ | From: Microsoft Research
Installation
# Base installation
pip install guidance
More from firecrawl/ai-research-skills
pinecone
Managed vector database for production AI applications. Fully managed, auto-scaling, with hybrid search (dense + sparse), metadata filtering, and namespaces. Low latency (<100ms p95). Use for production RAG, recommendation systems, or semantic search at scale. Best for serverless, managed infrastructure.
6prompt-guard
Meta's 86M prompt injection and jailbreak detector. Filters malicious prompts and third-party data for LLM apps. 99%+ TPR, <1% FPR. Fast (<2ms GPU). Multilingual (8 languages). Deploy with HuggingFace or batch processing for RAG security.
6chroma
Open-source embedding database for AI applications. Store embeddings and metadata, perform vector and full-text search, filter by metadata. Simple 4-function API. Scales from notebooks to production clusters. Use for semantic search, RAG applications, or document retrieval. Best for local development and open-source projects.
6unsloth
Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization
5slime-rl-training
Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.
5mlflow
Track ML experiments, manage model registry with versioning, deploy models to production, and reproduce experiments with MLflow - framework-agnostic ML lifecycle platform
5