Modal
Overview
Modal is a serverless Python compute platform. You define container images, functions, and resources in Python, then run them remotely with sub-second cold starts and per-second billing. There is no YAML, no Dockerfile, no Kubernetes; the container, GPUs, scaling, and storage are all configured through Python decorators.
Common reasons to reach for Modal:
- Run AI inference (open-weights or custom models) with GPUs on demand
- Fine-tune or train models without managing infrastructure
- Sandbox untrusted or LLM-generated code
- Schedule batch jobs or cron tasks
- Stand up HTTP endpoints (FastAPI, ASGI, WSGI, raw web servers) backed by GPUs
- Fan out embarrassingly parallel work via `.map()` and `.spawn()`
Modal is Python-only. The `modal` SDK runs locally, ships your code to Modal's cloud, and executes it in the containers you describe declaratively.
Installation and Auth