regex-vs-llm-structured-text
Hybrid regex-and-LLM framework for parsing structured text, optimizing cost by handling 95–98% with regex and reserving LLM calls for edge cases.
- Combines regex extraction with confidence scoring to flag low-confidence items, then validates only those items with an LLM, reducing LLM calls by ~95% versus all-LLM approaches
- Includes production-ready Python patterns for regex parsing, confidence scoring, and hybrid pipeline orchestration with real metrics from a 410-item quiz parsing example
- Best suited for structured, repeating text patterns like quizzes, forms, invoices, and documents where deterministic extraction is possible
- Emphasizes test-driven development, immutable data structures, and metric logging to track pipeline health and identify when regex thresholds degrade
Regex vs LLM for Structured Text Parsing
A practical decision framework for parsing structured text (quizzes, forms, invoices, documents). The key insight: regex handles 95-98% of cases cheaply and deterministically. Reserve expensive LLM calls for the remaining edge cases.
When to Activate
- Parsing structured text with repeating patterns (questions, forms, tables)
- Deciding between regex and LLM for text extraction
- Building hybrid pipelines that combine both approaches
- Optimizing cost/accuracy tradeoffs in text processing
Decision Framework
Is the text format consistent and repeating?
├── Yes (>90% follows a pattern) → Start with Regex
│ ├── Regex handles 95%+ → Done, no LLM needed
│ └── Regex handles <95% → Add LLM for edge cases only
└── No (free-form, highly variable) → Use LLM directly
More from affaan-m/everything-claude-code
security-review
Use this skill when adding authentication, handling user input, working with secrets, creating API endpoints, or implementing payment/sensitive features. Provides comprehensive security checklist and patterns.
7.9Kgolang-patterns
Idiomatic Go patterns, best practices, and conventions for building robust, efficient, and maintainable Go applications.
7.4Kcoding-standards
Baseline cross-project coding conventions for naming, readability, immutability, and code-quality review. Use detailed frontend or backend skills for framework-specific patterns.
6.7Kfrontend-patterns
Frontend development patterns for React, Next.js, state management, performance optimization, and UI best practices.
6.6Kbackend-patterns
Backend architecture patterns, API design, database optimization, and server-side best practices for Node.js, Express, and Next.js API routes.
6.6Kgolang-testing
Go testing patterns including table-driven tests, subtests, benchmarks, fuzzing, and test coverage. Follows TDD methodology with idiomatic Go practices.
6.1K