semantic-grep
Semantic Grep
jina-grep-style semantic search that runs in-process in Python instead of calling an external CLI. It embeds the query and the corpus chunks with gemini-embedding-001, ranks the chunks by cosine similarity, and returns grep-format output.
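The rank-and-format half of that pipeline can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the vectors are hard-coded stand-ins for what gemini-embedding-001 would return, and `Chunk`, `cosine`, and `semantic_rank` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    path: str
    line: int
    text: str
    vector: list[float]  # stand-in for an embedding returned by the model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_rank(query_vec, chunks, top_k=5):
    """Return grep-style 'path:line:text' lines, best match first."""
    scored = sorted(chunks, key=lambda c: cosine(query_vec, c.vector), reverse=True)
    return [f"{c.path}:{c.line}:{c.text}" for c in scored[:top_k]]


# Toy data: two chunks with made-up 3-dimensional "embeddings".
chunks = [
    Chunk("a.md", 3, "data residency rules", [0.9, 0.1, 0.0]),
    Chunk("b.md", 7, "open model family", [0.1, 0.9, 0.2]),
]
results = semantic_rank([1.0, 0.0, 0.0], chunks, top_k=1)
```

Emitting `path:line:text` keeps the output compatible with tools that already parse grep results.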
When Semantic Search Helps
The core trade-off (lifted from jina-grep-cli's own docs and validated in testing):
| Task | Tool |
|---|---|
| Known exact string, filename, or regex | grep / rg / searching-codebases |
| "What files discuss concept X" when X may not appear verbatim | semantic-grep |
| Hybrid: prefilter with grep, rerank by concept | grep → rerank_candidates() |
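The hybrid row can be sketched as a two-stage pipeline. The table names `rerank_candidates()` but does not show its signature, so the signature below is an assumption. `grep_prefilter` and `toy_overlap` are hypothetical helpers, and `toy_overlap` is a word-overlap score standing in for embedding similarity so the sketch runs offline.

```python
import re


def grep_prefilter(pattern: str, lines: list[str]) -> list[tuple[int, str]]:
    """Stage 1: cheap regex filter, returning (1-based line number, text)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [(i, text) for i, text in enumerate(lines, start=1) if rx.search(text)]


def rerank_candidates(query, candidates, score_fn, top_k=3):
    """Stage 2 (assumed signature): order the grep hits by a relevance score."""
    ranked = sorted(candidates, key=lambda c: score_fn(query, c[1]), reverse=True)
    return ranked[:top_k]


def toy_overlap(query: str, text: str) -> float:
    """Toy stand-in for cosine similarity: fraction of query words found in text."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / len(q) if q else 0.0


# Toy corpus, one document per line.
lines = [
    "retry policy for the upload queue",
    "queue depth metrics dashboard",
    "retry budget and backoff policy for failed jobs",
]
hits = grep_prefilter(r"retry", lines)
top = rerank_candidates("backoff policy for failed jobs", hits, toy_overlap, top_k=1)
```

Prefiltering keeps the number of chunks that need embedding small, which is presumably the point of the hybrid approach when the corpus is large and at least one keyword is known.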
Regression test result (workshop session corpus, 135 docs):
- "handling regulatory constraints" → top hit "Engineering AI Systems Under Sovereignty Constraints" (0.67). ✓
- "sessions about GEPA" → top hit "Gemma, DeepMind's Family of Open Models" (0.69). ✗ — false positive on phonetic neighbor. GEPA is mentioned verbatim in one session description; grep would find it correctly.
Rule: when the user query reads like a named entity or keyword, try grep first. Only reach for semantic-grep when paraphrase/concept matching is actually needed.
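One way to encode this rule is to run a literal search first and fall back to semantic ranking only when it finds nothing. The sketch below is an assumption about how such routing could look: `search`, `literal_hits`, and the `semantic_fn` callback are hypothetical names, not part of the skill, and the corpus is toy data.

```python
def literal_hits(query: str, docs: dict[str, str]) -> list[str]:
    """Case-insensitive substring match, roughly what `grep -Fil` would report."""
    needle = query.lower()
    return [name for name, text in docs.items() if needle in text.lower()]


def search(query, docs, semantic_fn):
    """Try an exact match first; call semantic_fn only when it finds nothing."""
    hits = literal_hits(query, docs)
    if hits:
        return "grep", hits
    return "semantic", semantic_fn(query, docs)


# Toy corpus: only one document contains the keyword verbatim.
docs = {
    "s1": "Prompt optimization with GEPA and friends",
    "s2": "Gemma, an open model family",
}
# The semantic fallback is stubbed out here; a real one would embed and rank.
mode, found = search("GEPA", docs, semantic_fn=lambda q, d: [])
```

With this ordering, a keyword query never reaches the embedding step, so a near-miss neighbor cannot outrank the exact match.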