29 packages tagged with “Chunking”
.Net client for InfluxDB. Supports all InfluxDB versions from 0.9 onwards. Supports both .Net 4.6.1+ and .Net Core 2.0+.
.Net client for InfluxDB
Utility functions for document and token processing. Includes TokenCounter for GPT token counting, DocumentSplitter for chunking by token limits, PartialJsonParser for streaming JSON, and helpers for checksums, file trees, and embeddings.
Core library for FluxCurator - Zero dependencies. PII masking, content filtering, and rule-based chunking with Korean language support.
Text preprocessing library for RAG pipelines: PII masking, content filtering, and intelligent chunking including semantic chunking with Korean language support.
FlowVitae is a memory- and performance-efficient 2D grid library written in .NET, designed for small to large-scale procedural worlds. It can be easily integrated with most render engines.
Document storage and retrieval system for RAG (Retrieval-Augmented Generation) in DevGPT. Provides IDocumentStore interface with support for text/binary documents, chunking, metadata management, and relevancy matching. Includes file-based and memory-based backends.
Complete document processing SDK optimized for RAG systems. Transform PDF, DOCX, Excel, PowerPoint, Markdown and other formats into high-quality chunks with intelligent semantic boundary detection. Includes advanced chunking strategies, metadata extraction, and performance optimization.
The Chunking extension divides an IEnumerable into equally sized chunks and allows you to iterate over them. This may be useful when, for example, sending a large number of items to a web service, or saving a large number of items to a database, which may otherwise result in time-outs.
An unofficial .NET client library for the Unstructured API.
Package Description
Library for making multiple HTTP requests in parallel and chunking downloads.
Memory-efficient serialization with √n chunking and streaming
Document storage and retrieval system for RAG (Retrieval-Augmented Generation) in Hazina. Provides IDocumentStore interface with support for text/binary documents, chunking, metadata management, and relevancy matching. Includes file-based and memory-based backends.
Document Chunker SDK for splitting large text contents from documents like DOCX, PDF, and HTML into smaller chunks.
StyloFlow Retrieval Documents - Document chunking strategies (sliding window, semantic splits), MMR reranking, and position-based weighting for RAG and retrieval systems.
.NET bindings for oxidize-pdf - Fast, memory-safe PDF text extraction optimized for RAG/LLM pipelines with intelligent chunking.
SemanticChunker.NET delivers automatic Semantic Chunking for Retrieval-Augmented Generation in .NET. The library splits long documents into embedding-aware, context-preserving chunks that fit your LLM’s token budget. Compatible with Microsoft.Extensions.AI and Semantic Kernel, featuring four breakpoint strategies, target-chunk mode, multilingual sentence detection and token-limit safety.
A .NET CLI tool for processing large documents that exceed LLM context windows. Implements streaming content decomposition, multi-turn processing, and result aggregation.
A powerful .NET library for intelligent document structure analysis and chunking. Automatically identifies and parses various document patterns including Markdown headings, numeric outlines, legal sections, and appendices. Features hierarchical content organization, advanced keyword extraction with ML.NET, and ONNX vectorization support for semantic embeddings.
A CLI toolkit for RAG (Retrieval-Augmented Generation) workflows, providing BM25 search indexing, querying, semantic chunking, and MCP server functionality powered by Lucene.NET.
Semantic cells are data structures useful for parsing and organizing content for use in AI and analytics
Redis cache and storage provider for Zetian SMTP Server. Delivers ultra-fast, in-memory message caching and temporary storage with advanced features including automatic chunking for large messages, Redis Streams for event-driven architectures, Pub/Sub notifications for real-time updates, sorted set indexing, TTL-based expiration, and built-in compression. Ideal for high-throughput environments requiring lightning-fast message access and real-time event processing.
A .NET SDK for preprocessing web content for RAG (Retrieval-Augmented Generation) systems
Intelligent text chunking strategies for RAGify. Break down large documents into optimal-sized chunks for embedding and retrieval. Includes fixed-size chunking, sentence-aware chunking, and sliding window approaches to preserve context and improve retrieval accuracy.
Microsoft.Extensions.AI.DataIngestion integration for DocumentAtom document parsing library. Provides adapters for using DocumentAtom's document processing capabilities in standard .NET AI/RAG pipelines.
A Chunk List is a new, concurrent, chunk-based data structure that is easily modifiable and allows for fast run-time operations.
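
Two techniques recur in the descriptions above: splitting a sequence into fixed-size chunks (as in the IEnumerable Chunking extension) and sliding-window chunking with overlap (as in the RAG-oriented packages). The sketch below illustrates both ideas in Python. It is not taken from any listed package; the function names `chunk` and `sliding_windows` and their parameters are hypothetical, chosen only for illustration. It also assumes that the final chunk may be shorter than the rest when the input length is not a multiple of the chunk size.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, size))  # pull the next `size` items lazily
        if not batch:
            return
        yield batch


def sliding_windows(tokens: Sequence[T], size: int, overlap: int) -> Iterator[Sequence[T]]:
    """Yield windows of `size` tokens; neighbours share `overlap` tokens."""
    if size < 1 or not 0 <= overlap < size:
        raise ValueError("require size >= 1 and 0 <= overlap < size")
    step = size - overlap
    start = 0
    while start < len(tokens):
        yield tokens[start:start + size]
        if start + size >= len(tokens):
            break  # this window already reached the end of the input
        start += step


if __name__ == "__main__":
    # Send items to a (hypothetical) service in batches of three.
    for batch in chunk(range(7), 3):
        print(batch)
    # Overlapping windows preserve context across chunk boundaries.
    for window in sliding_windows("a b c d e f".split(), size=4, overlap=2):
        print(window)
```

Batching in this way keeps each request or database write small, which is the time-out scenario the Chunking extension's description mentions. The overlap in the second function trades some duplicated text for context that survives chunk boundaries.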