./nugetz

#searchable-pdf

33 packages tagged with “searchable-pdf”

Aspose.OCR

Powerful and developer-friendly OCR API for extracting text from images and creating searchable PDFs. Add optical character recognition to on-premises solutions, websites, cloud services, and serverless functions with just a few lines of native .NET code. Effortlessly transform scanned pages, photos, screenshots, handwritten memos, and other images into machine-readable text, regardless of font, layout, and style. Find and compare text on images. Bulk-recognize all images from folders and archives; read multi-page PDF documents and TIFF images. Aspose.OCR is a universal solution for document processing, data extraction, and content digitization on a global scale. Supporting over 130 European, Middle Eastern, Asian, African, and American languages, the library allows you to recognize text in Latin, Cyrillic, Arabic, Chinese, and Hindi scripts, including text in mixed languages. The library can be used virtually everywhere, catering to small and medium businesses as well as multinational corporations. With Aspose.OCR, optical character recognition becomes a straightforward task, even for developers new to the technology. You can focus on the business task rather than complex maths, neural networks, and other technical intricacies. Powerful image processing and customizable content structure detection algorithms enable text extraction from virtually any image, ranging from high-quality scans to street photos. Aspose.OCR for .NET can work with virtually any file you can get from a scanner or camera, including PDF documents and multi-page TIFF images. Recognition results are returned in the most popular file and data exchange formats that can be saved, imported to a database, or analyzed in real time.
Changelog:
- Minor performance improvements. No public API changes.
For details, see https://releases.aspose.com/ocr/net/release-notes/2026/aspose-ocr-for-net-26-2-0-release-notes/
Resources:
- Product page: https://products.aspose.com/ocr/net/
- Advanced OCR models: https://github.com/aspose-ocr/resources
- Online documentation: https://docs.aspose.com/ocr/net/
- Solutions: https://docs.aspose.com/ocr/net/use-cases/
- Free support forum: https://forum.aspose.com/c/ocr/16

v26.2.0 2.5M
OCR Image-to-text Scan Photo Tesseract-alternative

Tronnx.Ocr

Tronnx.Ocr is a lightweight, open-source, fully offline OCR engine built on ONNX Runtime. It provides an end-to-end text detection + text recognition pipeline using two open-source ONNX models:
• linknet_resnet18.onnx – text detection (Apache 2.0 License)
• crnn_vgg16_bn_dynW.onnx – text recognition (Apache 2.0 License)
Both models originate from open-source research projects and are redistributed here in compliance with their Apache 2.0 licenses. This NuGet package performs all inference locally: no network activity, no telemetry, and your images never leave your device.
Key features:
• 100% offline OCR pipeline
• Fast ONNX Runtime inference
• OpenCV-based preprocessing and detection geometry
• Simple, developer-friendly API
• Automatic model caching for repeated runs
• Embedded ONNX models, no downloads required
• Generate fully searchable PDFs from images
• Highlight specific words in output PDFs
Dependencies:
• Microsoft.ML.OnnxRuntime (MIT License)
• OpenCvSharp4 + OpenCvSharp4.runtime.win (MIT License)
• PdfSharpCore (MIT License)
• SkiaSharp (MIT License)
Current limitations:
• English-only OCR recognition (initial version)
• Multi-language support coming in future releases (extended vocabularies, new recognition models, language packs)
This is an initial, early version of the library and will continue to evolve. Suggestions, ideas, and contributions are welcome on the GitHub repository.

v1.2.4 2.0K
ocr onnx onnxruntime text-recognition text-detection