syncfusion-dotnet-smart-data-extraction
Smart Data Extractor — Syncfusion
Overview
Extracts complete document structures from PDF and image files using the Syncfusion SmartDataExtractor library. This skill supports a single operational mode: generating C# code for the user's project.
Key Capabilities
- Document structure extraction: Identify text elements, images, headers, footers, and tables (including regions, header rows, columns, cell boundaries, and merged cells).
- File format support: Works with PDF documents and common image formats such as JPEG and PNG.
- Table extraction: Specialized capability to extract tabular data.
- Form recognition: Detects and processes structured form data.
- Page-level control: Extract data from specific pages or defined page ranges.
- Confidence threshold: Results are filtered based on a configurable confidence score (0.0–1.0).
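The page-level control and confidence threshold described above can be illustrated with a small, library-independent sketch. Everything named here is hypothetical and is not part of the Syncfusion API: `ExtractedElement` stands in for whatever result type the library returns, and `ExtractionFilters.Select` assumes that "filtering" means keeping results whose score is at or above the threshold. It also assumes 1-based, inclusive page ranges and C# 9 or later (for `record`).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one extracted element; NOT a Syncfusion type.
public record ExtractedElement(int Page, string Kind, string Text, double Confidence);

public static class ExtractionFilters
{
    // Keeps elements on pages firstPage..lastPage (1-based, inclusive, assumed)
    // whose confidence is at or above minConfidence (assumed semantics).
    public static List<ExtractedElement> Select(
        IEnumerable<ExtractedElement> elements,
        int firstPage,
        int lastPage,
        double minConfidence)
    {
        if (minConfidence < 0.0 || minConfidence > 1.0)
            throw new ArgumentOutOfRangeException(
                nameof(minConfidence), "Expected a value between 0.0 and 1.0.");
        if (firstPage < 1 || lastPage < firstPage)
            throw new ArgumentException("Expected 1 <= firstPage <= lastPage.");

        return elements
            .Where(e => e.Page >= firstPage && e.Page <= lastPage)
            .Where(e => e.Confidence >= minConfidence)
            .ToList();
    }
}
```

For example, calling `ExtractionFilters.Select(elements, 1, 2, 0.5)` would keep only elements from pages 1 and 2 with a confidence of at least 0.5. In generated project code, the library's own page and threshold options should be preferred where they exist; this sketch only shows the intended behavior.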
Prerequisites
- Install the required runtime and library packages from NuGet before running extraction.