53 packages tagged with “extractor”
RecursiveExtractor is able to process the following formats: ar, bzip2, deb, gzip, iso, tar, vhd, vhdx, vmdk, wim, xzip, and zip. RecursiveExtractor automatically detects the archive type and fails gracefully when attempting to process malformed content.
An extractor of audit history for entity fields.
A tool to extract the contents of a single file application to a directory.
GroupDocs.Parser for .NET is a parsing class library that lets you extract data from documents in various formats. The data extraction API supports PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and many more formats.
A library to programmatically extract the contents of a single file application to a directory.
The only web scraping service you'll ever need that offers advanced features that are simple to use for efficient data extraction.
C# port of the LinkedIn engineering team's open source Java library for detecting URLs in a body of text. See https://github.com/eladaus/URL-Detector
Crawler framework and distributed crawler extractor. Try Ruiji Scraper, a Chrome web crawler: https://chrome.google.com/webstore/detail/ruiji-scraper/klhahkhllngppofpkjdlbmnglnmnbbol?hl=zh-CN&authuser=0
Extracts document templates from Dynamics CRM.
.NET Core library for extracting information about YouTube videos.
Dimension Parser Library
Enterprise data extractor for PDF, Word, or Excel.
libvideo (aka VideoLibrary) is a modern .NET library for downloading YouTube videos. It is portable to most platforms and is very lightweight. Find us on GitHub at https://github.com/omansak/libvideo
Extracts specific content from an XML document using property attributes.
Finds localizable messages in *.fs and *.cs files by looking for calls such as I18n.Translate("message") in those sources. Puts unique messages into a specified JSON file (updating it if necessary). The class name, method name, and other settings are configurable.
Extracts data from Palworld .pak files.
A collection of methods for injecting or extracting icons from files on Windows operating systems.
MSBuild task for extracting CSS classes from C# code
GKZipLib was written for fast parsing of ZIP archives generated by GrayKey in .NET. Publicly available parsing libraries I tried in C# were either too slow at parsing large ZIPs or completely failed when attempting to parse GK ZIPs (or both). Developing this library was a fantastic exercise that really enhanced my own understanding of how ZIP files work. One of the big focuses of this library is being as fast as possible. Keep in mind that GK ZIPs can range from 5-10 GB to hundreds of GB in size. So how do we keep it fast? GKZipLib parses ONLY as much as it needs to, unless a file is identified (by path, etc.) as needing to be extracted. First, it parses the entire central directory into RAM. The central directory is typically quite small, so this is feasible. On a file-by-file basis, you can then decide whether or not to load additional details such as the data's absolute offset within the file, the file's compressed/uncompressed size, and so on. Probably the most potent usage of this is what I'm going to call "LINQ to GKZip": taking advantage of the fact that the library implements IEnumerable and thus can be iterated with a simple foreach. Please see Example.cs for the simplest usage. Contact the author on Discord - forensicmike#6426 or Twitter DM @forensicmike1
Word Extractor is a simple tool for extracting nouns, verbs, adjectives, adverbs, and more from text. It returns the extracted words based on the specified part of speech.
Abstractions for the PalworldDataExtractor.Lib project
Deterministic extractor atom that turns raw content into stable semantic units.
Automates the extraction of icons and images from websites. A .NET Standard Library.
Object Parser
A CLI for extracting nested ZIP files.
Samayas Tools MIB Extractor.