Found 169 packages
This is an agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT (you don't actually have to understand XPath or XSLT to use it, don't worry). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant of "real world" malformed HTML. The object model is very similar to that of System.Xml, but for HTML documents (or streams). This library is sponsored by ZZZ Projects: https://entityframework-extensions.net/ https://eval-expression.net/ https://dapper-plus.net/ HAP is trusted by companies worldwide, with over 150 million downloads.
AngleSharp is the ultimate angle brackets parser library. It parses HTML5, CSS3, and XML to construct a DOM based on the official W3C specification.
Deprecated, as there is a new maintainer for the original HAP project. Please check the new repo at https://github.com/zzzprojects/html-agility-pack. This is a port of the HtmlAgilityPack library, created by Simon Mourier and Jeff Klawiter, to the .NET Core platform. This NuGet package can be used with Universal Windows Platform, ASP.NET 5 (using .NET Core) and the full .NET Framework 4.6. Original description: This is an agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT (you don't actually have to understand XPath or XSLT to use it, don't worry). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant of "real world" malformed HTML. The object model is very similar to that of System.Xml, but for HTML documents (or streams).
Fizzler is a W3C Selectors parser and generic selector framework for document hierarchies. This package enables Fizzler over HtmlAgilityPack, adding QuerySelector and QuerySelectorAll (from Selectors API Level 1) for HtmlNode objects.
A utility to convert an HTML document into a tree of HtmlNode elements. It can also parse CSS styles and apply them to the HTML elements.
CsQuery is an HTML parser, CSS selector engine and jQuery port for .NET 4 and C#. It implements all CSS2 and CSS3 selectors, all the DOM manipulation methods of jQuery, and some of the utility methods.
HtmlPerformanceKit is a fast HTML parser.
HTML to RTF .Net is a 100% C# assembly that converts HTML documents into RTF, DOCX and Text formats. It can also merge RTF documents and replace text in them. It is an entirely standalone solution that doesn't require MS Office or any other software. It requires only .NET Framework 4.6.2 and up or .NET 6.0 and up. The component can read and parse all types of HTML: HTML 3.2, HTML 4.01, HTML 5 with CSS and XHTML 1.01. The component has its own HTML parser and its own DOCX and RTF writers.
GroupDocs.Parser for .NET is a parsing class library that lets you extract different kinds of data from documents in various formats. The data extraction API supports PDF, DOC, DOCX, PPT, PPTX, XLS, XLSX and many more formats.
A utility library for operations related to HTML parsing.
This is an agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT (you don't actually have to understand XPath or XSLT to use it, don't worry). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant of "real world" malformed HTML. The object model is very similar to that of System.Xml, but for HTML documents (or streams).
Provides a simple HTML document parser that can be used to extract information from web pages. Social metadata can be easily extracted from a page. Information is taken from Open Graph metadata or Twitter Card metadata, as well as standard HTML metadata.
HtmlMonkey is a lightweight HTML/XML parser written in C#. It allows you to parse an HTML or XML string into a hierarchy of node objects, which can then be traversed or queried using jQuery-like selectors. The library also supports creating node objects from code and producing HTML or XML from those objects.
Adds a powerful XML and DTD parser to AngleSharp.
Free .NET HTML parser (C#) is an open-source, high-performance .NET C# module created to parse HTML for links, indexing and other purposes.
HtmlKit is a cross-platform .NET framework for parsing HTML.
HTML parsing and sanitizing utility. Converts HTML to XHTML.
A simple .NET assembly for parsing Open Graph information from either a URL or an HTML snippet. You can read more about the Open Graph protocol at http://ogp.me.
Can be used to parse Markdown documents and transform them to other formats. The rendering architecture is pluggable, extensible and customizable. This library includes rendering to HTML, plain text, and portable Markdown. Additional libraries provide rendering support for other formats. The library can also compare Markdown documents and produce Markdown-based difference documents, showing how one version of a document was edited to produce a second version. For a description of the Markdown flavour supported by the parser, see: https://waher.se/Markdown.md