a fast, modern browser for the NuGet registry

built Mar 4, 2026 · v0.1.0

NuGet is a registered trademark of Microsoft. This site is not affiliated with Microsoft.

Found 50 packages

TikaOnDotnet.TextExtractor

Classes for running Apache Tika through **TikaOnDotNet**. Just use TextExtractor.Extract() and you'll be on your way.

v1.17.1

↙ 1.1M / total

PdfSharpTextExtractor

Simple PDF text extractor based on PDFSharp. Supports both single-byte and two-byte fonts, ToUnicode maps, and encodings. Does not support precise symbol positioning on the page, so text order can differ from the original.

GetText.NET.Extractor

Extracts strings from .NET solutions and projects for GetText catalog template files (.pot).

v10.0.1

↙ 28.5K / total

gettext.net gettext internationalization localization i18n

GroupDocs.Parser✓

GroupDocs.Parser for .NET is a parsing class library that extracts data from documents in various formats. The data extraction API supports PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and many more formats.

v26.2.0

↙ 1.9M / total

parser extractor extract-text parse-document extract-images

TextExtractor

Library for text extraction. Supports doc, docx, xlsx, odt, pdf, rtf, html, rar, zip,

v1.0.0

↙ 9.2K / total

textextraction textextractor doc docx

Bytescout.PDFExtractor

Bytescout PDF Extractor SDK for .NET, ASP.NET, ActiveX - extract data from PDF documents

v13.4.1.4801

↙ 493.6K / total

bytescout pdf extract text csv

Winnovative.PdfImagesExtractor✓

Winnovative PDF Images Extractor Library for .NET (Classic) can be used in .NET Framework, .NET Core and .NET Standard applications to extract images from PDF documents. This package is compatible with .NET Framework, .NET Core and .NET Standard 2.0 on Windows platforms. For applications that need to run on both Windows and Linux platforms, you can use the Winnovative.Pdf.Next.PdfProcessor package, which allows you to extract text and images from PDF documents, search text in PDF documents and convert PDF pages to images.
The compatibility list includes the following .NET versions, platforms and application types:
* .NET Framework 4.0 and above
* .NET 10, 9, 8, 7, 6
* .NET Standard 2.0
* Windows platforms
* Azure App Service
* Azure Cloud Services and Azure Virtual Machines
* Web, Console and Desktop applications
Main Features:
* Extract images from PDF documents
* Preserve transparency information from PDF documents
* Extract images in memory or to image files in a folder
* Save the extracted images in various image formats
* Support for password-protected PDF documents
* Extract images only from a range of PDF pages
* Get the number of pages in a PDF document
* Get the PDF document title, keywords, author and description
* Does not require Adobe Reader or other third-party tools
Documentation and code samples: https://www.winnovative-software.com/winnovative-pdf-images-extractor-dotnet

v14.0.0

↙ 17.9K / total

pdf images extractor image

DocumentTextExtractor

Simple C# library for extracting text and metadata from .docx, .pptx, and .xlsx files

v1.0.10

↙ 3.6K / total

docx xlsx pptx word excel

RS.TextExtractor

Description

v1.7.0

↙ 5.2K / total

PdfTextExtractor

A simple C# shell wrapper for the wonderful pdfplumber library in Python to extract text from .PDF files

v1.0.1

↙ 3.1K / total

pdf parsing parser textextraction

CaseLoad.TextExtractor.Text

A C# library that extracts text from various document file formats, e.g. pdf, docx, ppt.

v1.4.0

↙ 1.4K / total

textextraction documents pdf word

EvoPdf.PdfImagesExtractor✓

EVO PDF Images Extractor Library for .NET (Classic) can be used in .NET Framework, .NET Core and .NET Standard applications to extract images from PDF documents. This package is compatible with .NET Framework, .NET Core and .NET Standard 2.0 on Windows platforms. For applications that need to run on both Windows and Linux platforms, you can use the EvoPdf.Next.PdfProcessor package, which allows you to extract text and images from PDF documents, search text in PDF documents and convert PDF pages to images.
The compatibility list includes the following .NET versions, platforms and application types:
* .NET Framework 4.0 and above
* .NET 10, 9, 8, 7, 6
* .NET Standard 2.0
* Windows platforms
* Azure App Service
* Azure Cloud Services and Azure Virtual Machines
* Web, Console and Desktop applications
Main Features:
* Extract images from PDF documents
* Preserve transparency information from PDF documents
* Extract images in memory or to image files in a folder
* Save the extracted images in various image formats
* Support for password-protected PDF documents
* Extract images only from a range of PDF pages
* Get the number of pages in a PDF document
* Get the PDF document title, keywords, author and description
* Does not require Adobe Reader or other third-party tools
Documentation and code samples: https://www.evopdf.com/evopdf-pdf-images-extractor-dotnet

v14.0.0

↙ 12.8K / total

pdf images extractor image extraction

Melville.Pdf.TextExtractor

This is a renderer for Melville.PDF that extracts all of the text from a PDF page.

v0.6.4

↙ 3.0K / total

CaseLoad.TextExtractor

A C# library that extracts text from various document file formats, e.g. pdf, docx, ppt.

v1.8.0

↙ 1.5K / total

textextract document word pdf

TikaOnDotNet

Bare-bones IKVM Java-to-.NET port of Apache Tika. You'll want to install TikaOnDotNet.TextExtractor.

v1.17.1

↙ 1.6M / total

CaseLoad.TextExtractor.Abstraction

A C# library that extracts text from various document file formats, e.g. pdf, docx, ppt.

v1.4.0

↙ 1.8K / total

textextract document word pdf

CaseLoad.TextExtractor.Pdf

A C# library that extracts text from various document file formats, e.g. pdf, docx, ppt.

v1.5.0

↙ 1.6K / total

textextraction documents pdf word

TextExtractor-CaseLoad

A C# library that extracts text from various document file formats, e.g. pdf, docx, ppt.

v1.3.0

↙ 886 / total

textextract document word pdf

CaseLoad.TextExtractor.OpenXml

A C# library that extracts text from various document file formats, e.g. pdf, docx, ppt.

v1.5.0

↙ 1.6K / total

textextraction documents pdf word

ExpertPdf.PdfToText✓

The ExpertPdf Pdf to Text Converter can be used in any type of .NET application to extract the text from a PDF document. The integration with existing .NET applications is extremely easy and no installation is necessary in order to run the converter. The downloadable archive contains the assembly for .NET 2.0, .NET 4.0, .NET Core and a ready-to-use sample console application. The full C# and VB.NET source code for the sample application is available in the Samples folder. The sample application can be built with any version of Visual Studio. The result of conversion is a .NET String object that you can use for example in search operations or save into a file on disk.
Features:
- .NET 2.0, .NET 4.0, .NET Core development library, C# and VB.NET samples
- Extract text from PDF stream or a PDF file
- Extract text preserving the original PDF layout
- Extract text in PDF reading order
- Specify the range of pages to be extracted
- Save the extracted text in a HTML format and add description meta tags
- Add the title, keywords, author from PDF description in HTML meta tags
- Mark the page breaks in the extracted text with a special character
- Extract text from password protected PDF documents
- Get the number of pages in the PDF document
- Search for text in PDF documents (return texts page numbers and position on page)

v8.0.0

↙ 8.5K / total

pdf-to-text pdf-text-extractor pdf-generation pdf pdf-to-text-converter
