#Chunking

29 packages tagged with “Chunking”

AdysTech.InfluxDB.Client.Net.Core

.NET client for InfluxDB. Supports all InfluxDB versions from 0.9 onwards. Supports both .NET Framework 4.6.1+ and .NET Core 2.0+.

v0.25.0 ↙ 427.1K / total

InfluxDB, Influx, TSDB, TimeSeries, InfluxData

AdysTech.InfluxDB.Client.Net

.NET client for InfluxDB.

v0.15.0 ↙ 80.0K / total

InfluxDB, Influx, TSDB, TimeSeries, InfluxData

DevGPT.LLMs.Helpers

Utility functions for document and token processing. Includes TokenCounter for GPT token counting, DocumentSplitter for chunking by token limits, PartialJsonParser for streaming JSON, and helpers for checksums, file trees, and embeddings.

v1.1.3 ↙ 5.0K / total

llm, tokens, document-splitting, chunking, utilities

FluxCurator.Core

Core library for FluxCurator - Zero dependencies. PII masking, content filtering, and rule-based chunking with Korean language support.

v0.7.2 ↙ 6.7K / total

rag, chunking, pii, masking, nlp

FluxCurator

Text preprocessing library for RAG pipelines: PII masking, content filtering, and intelligent chunking including semantic chunking with Korean language support.

v0.7.2 ↙ 6.8K / total

rag, chunking, pii, masking, nlp

Venomaus.FlowVitae

FlowVitae is a memory- and performance-efficient 2D grid library written in .NET, designed for small- to large-scale procedural worlds. It can be easily integrated with most render engines.

v1.3.9 ↙ 14.5K / total

2d, grid, flowvitae, efficient, memory

DevGPT.Store.DocumentStore

Document storage and retrieval system for RAG (Retrieval-Augmented Generation) in DevGPT. Provides IDocumentStore interface with support for text/binary documents, chunking, metadata management, and relevancy matching. Includes file-based and memory-based backends.

v1.1.3 ↙ 4.2K / total

rag, document-store, retrieval, storage, chunking

FileFlux

Complete document processing SDK optimized for RAG systems. Transform PDF, DOCX, Excel, PowerPoint, Markdown and other formats into high-quality chunks with intelligent semantic boundary detection. Includes advanced chunking strategies, metadata extraction, and performance optimization.

v0.10.5 ↙ 15.2K / total

rag, document, processing, chunking, ai

ReneWiersma.Chunking

The Chunking extension divides an IEnumerable into equally sized chunks and allows you to iterate over them. This may be useful when, for example, sending a large number of items to a web service, or saving a large number of items to a database, which may otherwise result in time-outs.

v1.0.1 ↙ 1.3K / total

Chunk, chunks, chunking, IEnumerable, extension

Bash.UnstructuredIO.Client

An unofficial .NET client library for the Unstructured API.

v1.0.0 ↙ 490 / total

unstrutured, chunking, client

FastCdcFs.Net.Shell

Package Description

v1.0.2 ↙ 2.4K / total

readonly, virtual-filesystem, archive, storage, deduplication

Hazina.LLMs.Helpers

v2.0.0 ↙ 1.1K / total

llm, tokens, document-splitting, chunking, utilities

FastCdcFs.Net

Package Description

v1.0.2 ↙ 2.3K / total

readonly, virtual-filesystem, archive, storage, deduplication

Gravy.MultiHttp

Library for making multiple HTTP requests in parallel and chunking downloads.

v0.7.0 ↙ 373 / total

http, parallel, download, chunking

SqrtSpace.SpaceTime.Serialization

Memory-efficient serialization with √n chunking and streaming

v1.0.1 ↙ 567 / total

serialization, streaming, chunking, compression, spacetime

Hazina.Store.DocumentStore

Document storage and retrieval system for RAG (Retrieval-Augmented Generation) in Hazina. Provides IDocumentStore interface with support for text/binary documents, chunking, metadata management, and relevancy matching. Includes file-based and memory-based backends.

v2.0.0 ↙ 710 / total

rag, document-store, retrieval, storage, chunking

DocumentChunker

Document Chunker SDK for splitting large text content from documents such as DOCX, PDF, and HTML into smaller chunks.

v1.0.0 ↙ 746 / total

DocumentChunker, Document, Chunking, DOCX, PDF

Mostlylucid.StyloFlow.Retrieval.Documents

StyloFlow Retrieval Documents - Document chunking strategies (sliding window, semantic splits), MMR reranking, and position-based weighting for RAG and retrieval systems.

v2.4.0 ↙ 514 / total

retrieval, documents, chunking, rag, mmr

OxidizePdf.NET

.NET bindings for oxidize-pdf - Fast, memory-safe PDF text extraction optimized for RAG/LLM pipelines with intelligent chunking.

v0.2.2 ↙ 1.3K / total

pdf, extraction, rag, llm, kernelmemory

SemanticChunker.NET

SemanticChunker.NET delivers automatic Semantic Chunking for Retrieval-Augmented Generation in .NET. The library splits long documents into embedding-aware, context-preserving chunks that fit your LLM’s token budget. Compatible with Microsoft.Extensions.AI and Semantic Kernel, featuring four breakpoint strategies, target-chunk mode, multilingual sentence detection and token-limit safety.

v1.5.0 ↙ 6.0K / total

Semantic, Chunking, RAG, Retrieval, Augmented

rlm

A .NET CLI tool for processing large documents that exceed LLM context windows. Implements streaming content decomposition, multi-turn processing, and result aggregation.

v1.0.6 ↙ 490 / total

cli, llm, document-processing, chunking, rlm

MarkdownStructureChunker

A powerful .NET library for intelligent document structure analysis and chunking. Automatically identifies and parses various document patterns including Markdown headings, numeric outlines, legal sections, and appendices. Features hierarchical content organization, advanced keyword extraction with ML.NET, and ONNX vectorization support for semantic embeddings.

v1.0.7 ↙ 2.5K / total

markdown, document, parsing, chunking, structure

Pixelbadger.Toolkit.Rag

A CLI toolkit for RAG (Retrieval-Augmented Generation) workflows, providing BM25 search indexing, querying, semantic chunking, and MCP server functionality powered by Lucene.NET.

v1.4.0 ↙ 629 / total

cli, toolkit, rag, search, lucene

SemanticCells

Semantic cells are data structures useful for parsing and organizing content for use in AI and analytics

v0.1.1 ↙ 445 / total

semantic, cell, chunk, chunking, ai

Zetian.Storage.Redis✓

Redis cache and storage provider for Zetian SMTP Server. Delivers ultra-fast, in-memory message caching and temporary storage with advanced features including automatic chunking for large messages, Redis Streams for event-driven architectures, Pub/Sub notifications for real-time updates, sorted set indexing, TTL-based expiration, and built-in compression. Ideal for high-throughput environments requiring lightning-fast message access and real-time event processing.

v1.0.4 ↙ 1.5K / total

redis, cache, in-memory, high-performance, chunking

WebFlux

A .NET SDK for preprocessing web content for RAG (Retrieval-Augmented Generation) systems

v0.4.2 ↙ 3.4K / total

RAG, AI, Web, Crawling, Content

RAGify.Chunking

Intelligent text chunking strategies for RAGify. Break down large documents into optimal-sized chunks for embedding and retrieval. Includes fixed-size chunking, sentence-aware chunking, and sliding window approaches to preserve context and improve retrieval accuracy.

v1.0.0 ↙ 111 / total

rag, retrieval-augmented-generation, embeddings, vector-search, nlp

DocumentAtom.DataIngestion

Microsoft.Extensions.AI.DataIngestion integration for DocumentAtom document parsing library. Provides adapters for using DocumentAtom's document processing capabilities in standard .NET AI/RAG pipelines.

v3.0.0 ↙ 300 / total

document, parsing, ingestion, rag, ai

ChunkList

A Chunk List is a new, concurrent, chunk-based data structure that is easily modifiable and allows for fast run-time operations.

v1.0.0 ↙ 350 / total

list, datastructures, runtime, concurrency, parallel