80 packages tagged with “scraping”
Scraping framework containing: - a web client able to simulate a web browser; - an HtmlAgilityPack extension to select elements using CSS selectors (like jQuery).
SDK for UI automation and text capture featured in UiPath Studio
A .NET Standard library to extract the main content of a web page, based on a port of the Readability library by Mozilla. It also determines and gathers metadata about the content, such as language, author, main image, etc.
Turn unstructured HTML pages into structured data. The OpenScraping library can extract information from HTML pages using a JSON config file with XPath rules. It can scrape even multi-level complex objects such as tables and forum posts.
F# screen scraping package.
dcsoup is a .NET library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods. This library is basically a port of jsoup, a Java HTML parser library. See also: http://jsoup.org/ API reference is available at: https://raw.githubusercontent.com/matarillo/dcsoup/master/sandcastle/Help/dcsoup.chm
SgmlReader for Portable Library. SgmlReader is an SGML markup language parser derived from System.Xml.XmlReader in the .NET CLR, but its most popular use is as an HTML parser. (It's a scraper!) /* Use SgmlReader in HTML parse mode. */ XDocument document = SgmlReader.Parse(stream); Done!
The only web scraping service you'll ever need that offers advanced features that are simple to use for efficient data extraction.
Search results via SERP API. Hash, JSON, and HTML formats supported for Google, Bing, Baidu, Yandex, eBay, Google Product, YouTube, Walmart and more...
.NET agile and extensible web searching API
JMovies IMDb Entities Library - A .NET Standard class library that provides class definitions for IMDb.
libvideo (aka VideoLibrary) is a modern .NET library for downloading YouTube videos. It is portable to most platforms and is very lightweight. Find us on GitHub at https://github.com/omansak/libvideo
JMovies IMDb Data Provider Library - Currently supports on-demand screen scraping from IMDb to receive movie and person details.
A web scraping framework for .NET
JMovies IMDb Common Library - A .NET Standard class library that provides constant definitions and extension methods for IMDb scraping.
A fully-featured and modular online scraper tool. Yes, we know the name is spelled wrong. Icon Credit: Hyliian @ DeviantArt
A collection of search engine image scrapers (Google Images, DuckDuckGo and Brave).
High productivity facilitator for scraping web pages
AnimeDl scrapes anime from sites for streaming or downloading.
A C# library for scraping free stock market data from Yahoo Finance. It can get historical data, top trending stocks, capital gains, dividends, stock splits, stock recommendations and much more. Supports at least 39 different types of data, with many more coming soon!
This client library enables working with robots.txt. Key features: - Parse robots.txt into a typed object. - Look up Allowed/Disallowed/Crawl-delay based on User-Agent. - Traverse the sitemap in robots.txt for URLs. For more info, see: https://github.com/nicholasbergesen/robotsSharp/master/README.md
ExcavatorSharp is a multi-threaded server for scraping web data. It converts HTML code into a structured array of data. The library allows data scraping from multiple sites in parallel within a single running application. Create scraping tasks and perform data extraction on a schedule. The library is designed for professional extraction and parsing of large volumes of data. Under the hood there is support for CSS selectors and XPath, data export to .csv/.xlsx/.sql/.json, online data export, proxy servers, dynamic content crawling, interaction with the site via JavaScript, and much more. The library uses .NET Sockets and Chromium Embedded Framework. The library can be used separately as a crawler or a parser. The sitemap.xml and robots.txt formats are supported, as is gzip / deflate compression. Attention! Only x64 versions are supported for .NET 4.5.2 and 4.6 platforms. AnyCPU builds are not supported: you will NOT be able to run the library when building as AnyCPU. This is caused by the features of CEF.
This .NET Standard package provides convenient access to the Local API REST interface of the Kameleo Client.
A library that reads Trendyol store information via web scraping and models it.
Client for source providers.
A library that automates searching for academic articles on Google Scholar and ResearchGate
Puppeteer and Chrome DOMSnapshot API wrapper
A simple mark scraper / API for TeachAssist.
A base library for implementing imageboard scrapers
Scraping without the hassle. Capture screenshots of webpages and scrape contents with a single line of code.