#scraping

80 packages tagged with “scraping”

ScrapySharp

Scraping framework containing a web client able to simulate a web browser and an HtmlAgilityPack extension to select elements using CSS selectors (like jQuery).

v3.0.0 ↙ 838.5K / total

Scraping, html, css, linq, agility

UiPath✓

SDK for UI automation and text capture featured in UiPath Studio

v9.0.6736.25739 ↙ 604.0K / total

UI, Automation, Screen, Scraper, Scraping

SmartReader

A .NET Standard library that extracts the main content of a web page, based on a port of the Readability library by Mozilla. It also determines and gathers metadata about the content, such as language, author, main image, etc.

v0.11.0 ↙ 273.6K / total

Readability, SmartReader, reader, article, scraping

OpenScraping

Turn unstructured HTML pages into structured data. The OpenScraping library can extract information from HTML pages using a JSON config file with XPath rules. It can even scrape multi-level complex objects such as tables and forum posts.

v1.4.2 ↙ 156.1K / total

html, extraction, scraping, scraper, parser

XTract

F# screen scraping package.

v0.4.2 ↙ 85.5K / total

F#, SCREEN, SCRAPING, SCRAPE, HTML

dcsoup

dcsoup is a .NET library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods. This library is basically a port of jsoup, a Java HTML parser library. See also: http://jsoup.org/ The API reference is available at: https://raw.githubusercontent.com/matarillo/dcsoup/master/sandcastle/Help/dcsoup.chm

v1.0.0 ↙ 140.6K / total

HTML, jsoup, jquery, parse, parser

CenterCLR.SgmlReader

SgmlReader for Portable Library. SgmlReader is an SGML markup language parser derived from System.Xml.XmlReader in the .NET CLR, but its most popular use is as an HTML parser (in effect, a scraper). /* Use SgmlReader in HTML parse mode. */ XDocument document = SgmlReader.Parse(stream); Done!

v2018.8.31 ↙ 56.9K / total

SgmlReader, Parser, Portable, HtmlReader, Html

ScrAPI

The only web scraping service you'll ever need that offers advanced features that are simple to use for efficient data extraction.

v1.1.0.3915 ↙ 46.0K / total

scraper, scraping, extractor, crawler, api

google-search-results-dotnet

Search results via SERP API. Hash, JSON, and HTML formats are supported for Google, Bing, Baidu, Yandex, eBay, Google Product, YouTube, Walmart, and more.

v2.0.1 ↙ 140.1K / total

SearchEngineGoogleSearchResult

PickAll

.NET agile and extensible web searching API

v1.3.1 ↙ 22.2K / total

web, scraping, api, library

JMovies.IMDb.Entities

JMovies IMDb Entities Library - A .NET Standard class library that provides class definitions for IMDb.

v1.8.4 ↙ 42.5K / total

scraper, movies, movie, imdb, actor

VideoLibrary

libvideo (aka VideoLibrary) is a modern .NET library for downloading YouTube videos. It is portable to most platforms and is very lightweight. Find us on GitHub at https://github.com/omansak/libvideo

v3.3.1 ↙ 537.5K / total

youtube, youtubeexplode, downloader, libvideo, lib

JMovies.IMDb

JMovies IMDb Data Provider Library - Currently supports on-demand screen scraping from IMDb to receive movie and person details.

v1.8.6 ↙ 29.4K / total

scraper, movies, movie, imdb, actor

NScrape

A web scraping framework for .NET

v0.4.1 ↙ 25.1K / total

html, scrape, scraper, scraping, crawler

JMovies.IMDb.Common

JMovies IMDb Common Library - A .NET Standard class library that provides constant definitions and extension methods for IMDb scraping.

v1.8.3 ↙ 21.9K / total

scraper, movies, movie, imdb, actor

Orobouros

A fully featured and modular online scraper tool. Yes, we know the name is spelled wrong. Icon credit: Hyliian @ DeviantArt

v1.1.3 ↙ 10.2K / total

scraping, scraper, framework, modular, api

GScraper

A collection of search engine image scrapers (Google Images, DuckDuckGo and Brave).

v1.1.3 ↙ 13.7K / total

gscraper, scraping, web-scraping, google, google-images

Scraping

High productivity facilitator for scraping web pages

v3.0.0.3 ↙ 12.9K / total

Scraping, C#, Crawler, Toolkit, Web

AnimeDl

AnimeDl scrapes anime from sites for streaming or downloading.

v1.8.9 ↙ 32.1K / total

anime, animedl, scraper, animescraper, scraping

OoplesFinance.YahooFinanceAPI✓

A C# library for scraping free stock market data from Yahoo Finance. It can get historical data, top trending stocks, capital gains, dividends, stock splits, stock recommendations, and much more. Supports at least 39 different types of data, with many more coming soon!

v1.7.1 ↙ 28.1K / total

yahoo, finance, stock, stocks, market

RobotsSharpParser

This client library enables working with robots.txt. Key features: parse robots.txt into a typed object; look up Allowed/Disallowed/Crawl-delay based on User-Agent; traverse the sitemap in robots.txt for URLs. For more info see: https://github.com/nicholasbergesen/robotsSharp/master/README.md

v2.0.0 ↙ 29.8K / total

robotsparserobotswebcrawling

ExcavatorSharp.WebScraper.x64

ExcavatorSharp is a multi-threaded server for scraping web data. It converts HTML code into a structured array of data. The library allows data scraping from multiple sites in parallel within a single running application. You can create scraping tasks and perform data extraction on a schedule. The library is designed for professional extraction and parsing of large volumes of data. Under the hood there is CSS selector and XPath support, data export to .csv/.xlsx/.sql/.json, online data export, support for proxy servers, dynamic content crawling, interaction with the site via JavaScript, and much more. The library uses .NET Sockets and the Chromium Embedded Framework (CEF). It can be used separately as a crawler or a parser. It supports the sitemap.xml and robots.txt formats and gzip/deflate compression. Attention: only x64 builds are supported, for the .NET 4.5.2 and 4.6 platforms. AnyCPU builds are not supported; you will NOT be able to run the library when building for AnyCPU. This is caused by the features of CEF.

v1.2.8 ↙ 12.2K / total

scraping, scraper, html, web, parser
