hebrew-nlp-toolkit
Hebrew NLP Toolkit
Instructions
Step 1: Identify the NLP Task
| Task | Recommended Model | HuggingFace ID | Size | Notes |
|---|---|---|---|---|
| Text generation (large) | DictaLM 3.0 24B Base | dicta-il/DictaLM-3.0-24B-Base | 24B | Best Hebrew generation, built on Mistral-Small-3.1-24B |
| Text generation (small) | DictaLM 3.0 Nemotron Instruct | dicta-il/DictaLM-3.0-Nemotron-12B-Instruct | 12B | Instruction-tuned, smaller footprint |
| Reasoning / chain-of-thought | DictaLM 3.0 24B Thinking | dicta-il/DictaLM-3.0-24B-Thinking | 24B | Emits explicit thinking blocks before answering |
| Lightweight / edge | DictaLM 3.0 1.7B Thinking (GGUF) | dicta-il/DictaLM-3.0-1.7B-Thinking-GGUF | 1.7B | Runs on laptop / CPU via llama.cpp |
| Classification / fill-mask | DictaBERT | dicta-il/dictabert | 184M | Fast, good accuracy |
| NER | DictaBERT NER | dicta-il/dictabert-ner | 184M | Recognizes PER, GPE, TIMEX, TTL |
| Sentiment | DictaBERT Sentiment | dicta-il/dictabert-sentiment | 184M | Hebrew sentiment classification |
| Morphology | DictaBERT Morph | dicta-il/dictabert-morph | 184M | Prefix segmentation and POS |
| Hebrew QA | DictaBERT HeQ | dicta-il/dictabert-heq | 184M | Extractive question answering |
| Embeddings (modern) | NeoDictaBERT Bilingual Embed | dicta-il/neodictabert-bilingual-embed | 400M | Hebrew-English sentence embeddings |
| Embeddings (legacy) | AlephBERT | onlplab/alephbert-base | 110M | Older baseline for similarity |
| Speech-to-text | ivrit.ai Whisper v3 | ivrit-ai/whisper-large-v3 | 1.55B | Fine-tuned on 22K+ hours of Hebrew audio |
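
The task-to-model mapping in the table can be captured as a small lookup, which is one way to make Step 1 scriptable. The sketch below is illustrative only: the task keys (`"ner"`, `"sentiment"`, and so on) and the helper names `recommend_model` and `load_pipeline` are hypothetical and not part of any library, while the model IDs are copied from the table. `load_pipeline` assumes these three checkpoints load through the standard Hugging Face `transformers.pipeline` call, which has not been verified here for every model.

```python
# Hypothetical task keys mapped to the HuggingFace IDs listed in the table.
MODEL_BY_TASK = {
    "generation-large": "dicta-il/DictaLM-3.0-24B-Base",
    "generation-small": "dicta-il/DictaLM-3.0-Nemotron-12B-Instruct",
    "reasoning": "dicta-il/DictaLM-3.0-24B-Thinking",
    "edge": "dicta-il/DictaLM-3.0-1.7B-Thinking-GGUF",
    "fill-mask": "dicta-il/dictabert",
    "ner": "dicta-il/dictabert-ner",
    "sentiment": "dicta-il/dictabert-sentiment",
    "morphology": "dicta-il/dictabert-morph",
    "qa": "dicta-il/dictabert-heq",
    "embeddings": "dicta-il/neodictabert-bilingual-embed",
    "embeddings-legacy": "onlplab/alephbert-base",
    "speech-to-text": "ivrit-ai/whisper-large-v3",
}

# Assumed pipeline task names for the checkpoints that are plain BERT heads.
PIPELINE_TASK = {
    "fill-mask": "fill-mask",
    "ner": "token-classification",
    "sentiment": "text-classification",
}


def recommend_model(task: str) -> str:
    """Return the recommended HuggingFace model ID for a task key."""
    key = task.strip().lower()
    if key not in MODEL_BY_TASK:
        known = ", ".join(sorted(MODEL_BY_TASK))
        raise ValueError(f"Unknown task {task!r}; expected one of: {known}")
    return MODEL_BY_TASK[key]


def load_pipeline(task: str):
    """Build a transformers pipeline for a supported task (downloads weights)."""
    key = task.strip().lower()
    if key not in PIPELINE_TASK:
        raise ValueError(f"No pipeline shortcut assumed for task {task!r}")
    # Imported lazily so the lookup above works without transformers installed.
    from transformers import pipeline

    return pipeline(PIPELINE_TASK[key], model=recommend_model(key))
```

For example, `recommend_model("ner")` returns `"dicta-il/dictabert-ner"`. Keeping the lookup separate from the loader means the selection step can run without downloading any weights.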