19 packages tagged with “data-extraction”
.NET text extraction framework
Refinery is a tool to extract and transform semi-structured data from Excel spreadsheets of different layouts in a declarative way.
A .NET CLI tool for extracting data from SQL databases.
CrackDiggerEngine is a library designed to search for game information from multiple supported sites. It provides methods to fetch game details and format them into JSON-like structures. © 2025 Mustafa (@m51v5). All rights reserved. Disclaimer: This library serves as a search engine for publicly available information only. It does not store or misuse any data in violation of intellectual property or DRM policies.
Structured data extraction from unstructured text for RepletoryLib, using AI with JSON schema validation and retry logic.
XML data extraction plugin for the Unio library. Extract strongly-typed data from XML documents with element and XPath-based extraction. Built on System.Xml.Linq with zero additional dependencies beyond the Unio core.
JSON data extraction plugin for the Unio library. Extract strongly-typed data from JSON arrays and objects with automatic property mapping. Built on System.Text.Json with zero additional dependencies beyond the Unio core.
PDF table data extraction plugin for the Unio library. Extract strongly-typed data from PDF tables with automatic column detection and row parsing. Built on PdfPig for reliable, open-source PDF processing without commercial dependencies.
Validation plugin for the Unio library. Validate extracted data using DataAnnotations and fluent validation rules. Supports multiple error handling modes: throw on first error, skip and continue, or collect all errors. Zero additional dependencies beyond the Unio core.
Lightweight ScraperAPI client for .NET. Web scraping with automatic proxy rotation, CAPTCHA solving, and JavaScript rendering. Scrape any website without getting blocked. Features: async scraping, screenshots, country targeting, session management, auto-parsing, structured data extraction.
A modern, unified C# data extraction library. Extract strongly-typed data from CSV, Excel, PDF, JSON, and XML with a single API. Features auto-format detection, attribute-based column mapping, streaming support via IAsyncEnumerable, and built-in dependency injection. Zero external dependencies for the core library.
PageProbe is a modern, extensible .NET 8 web crawling library for extracting links, multimedia, metadata, images, and text from web pages. It supports robust crawling with depth control, robots.txt compliance, and export to multiple formats (CSV, JSON, XML, Markdown, Text). Designed for reliability, testability, and easy integration in .NET applications.
Excel (XLSX/XLS) data extraction plugin for the Unio library. Read strongly-typed data from Excel spreadsheets with automatic column mapping, sheet selection, and streaming support. Built on DocumentFormat.OpenXml for reliable, high-performance Excel processing.
Xamarin Android plugin of the Docutain Document Scanner SDK for Android and iOS. High-quality document scanning, data extraction, text recognition and PDF creation for your apps. Easily scan documents in your app.
Lightweight Apify client for .NET. Web scraping and automation platform with 1,600+ ready-made actors. Run scrapers for Amazon, Google, Instagram, Twitter, and more. Features: actor execution, datasets, key-value stores, schedules, webhooks. Simple async interface with dependency injection support.
Xamarin iOS plugin of the Docutain Document Scanner SDK for Android and iOS. High-quality document scanning, data extraction, text recognition and PDF creation for your apps. Easily scan documents in your app.
Docutain Document Scanner SDK for .NET MAUI (Android and iOS). High-quality document scanning, data extraction, text recognition and PDF creation for your apps. Easily scan documents in your app.
Xamarin.Forms plugin of the Docutain Document Scanner SDK for Android and iOS. High-quality document scanning, data extraction, text recognition and PDF creation for your apps. Easily scan documents in your app.