data-engineering-ai-ml

AI/ML Data Pipelines

Data engineering patterns for AI/ML workloads: embedding generation, vector databases, retrieval-augmented generation (RAG), LLM output monitoring, and batch inference. Covers LanceDB, pgvector, and OpenAI integrations.
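As a rough illustration of how embedding generation, vector storage, and retrieval fit together in a RAG flow, here is a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: `toy_embed` is a bag-of-words stand-in for a real embedding model (such as one served by OpenAI), and `InMemoryVectorStore` is a hypothetical placeholder for a vector database such as LanceDB or pgvector. It does not use or represent the API of any of those libraries.

```python
import math


def build_vocab(texts):
    """Assign each distinct lowercase token a fixed vector position."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def toy_embed(text, vocab):
    """Hypothetical stand-in for an embedding model: normalized bag-of-words."""
    vec = [0.0] * len(vocab)
    for token in text.lower().split():
        idx = vocab.get(token)
        if idx is not None:  # tokens outside the vocabulary are ignored
            vec[idx] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


class InMemoryVectorStore:
    """Hypothetical placeholder for a vector database table."""

    def __init__(self, vocab):
        self.vocab = vocab
        self.rows = []  # list of (text, vector) pairs

    def add(self, text):
        self.rows.append((text, toy_embed(text, self.vocab)))

    def search(self, query, k=1):
        q = toy_embed(query, self.vocab)
        # Vectors are unit length, so the dot product is the cosine similarity.
        scored = sorted(
            self.rows,
            key=lambda row: sum(a * b for a, b in zip(q, row[1])),
            reverse=True,
        )
        return [text for text, _ in scored[:k]]


documents = [
    "polars is a dataframe library",
    "duckdb runs analytical sql queries",
    "lancedb stores vector embeddings",
]
store = InMemoryVectorStore(build_vocab(documents))
for doc in documents:
    store.add(doc)

question = "where are vector embeddings stored"
context = store.search(question, k=1)
# The retrieved context is prepended to the question before calling an LLM.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
```

In a real pipeline the embedding call and the store would be replaced by the model and database of choice; the shape of the flow (embed documents, store vectors, embed the query, rank by similarity, build the prompt) stays the same.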

When to Use These Patterns

  • RAG Applications: Building chatbots, semantic search, question-answering
  • LLM Monitoring: Tracking token usage, latency, output quality
  • Embedding Pipelines: Generating and storing vector embeddings for ML models
  • Batch Inference: Large-scale model inference pipelines
  • Feature Stores: Versioned feature data for ML training/serving
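For the LLM monitoring use case in the list above, one possible starting point is to log a small record per model call and aggregate it afterwards. The sketch below assumes a hypothetical `LLMCallRecord` schema and `summarize` helper; the field names and the model label are illustrative only and are not taken from this skill or from any provider's API.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class LLMCallRecord:
    """Hypothetical per-call log entry; field names are assumptions."""
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float


def summarize(records):
    """Aggregate token usage and latency over a batch of call records."""
    if not records:
        return {
            "calls": 0,
            "total_tokens": 0,
            "mean_latency_ms": 0.0,
            "max_latency_ms": 0.0,
        }
    return {
        "calls": len(records),
        "total_tokens": sum(
            r.prompt_tokens + r.completion_tokens for r in records
        ),
        "mean_latency_ms": mean(r.latency_ms for r in records),
        "max_latency_ms": max(r.latency_ms for r in records),
    }


# Illustrative values only, not measurements.
records = [
    LLMCallRecord("example-model", prompt_tokens=120, completion_tokens=30, latency_ms=400.0),
    LLMCallRecord("example-model", prompt_tokens=80, completion_tokens=20, latency_ms=600.0),
]
summary = summarize(records)
```

Records like these could be written to a columnar table and queried with the tools from the core skill; output quality checks would need additional fields that this sketch leaves out.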

Skill Dependencies

  • @data-engineering-core - Polars, DuckDB for data processing
  • @data-engineering-storage-remote-access - Cloud storage for embeddings and models
  • @data-engineering-orchestration - Schedule/batch embedding generation
  • @data-engineering-quality - Validate embedding quality

Installs
6
First Seen
Feb 11, 2026