84 packages tagged with “tokenizer”
Tokenizer extracts structured information from blocks of text and reflects it onto .NET objects
Fast and memory-efficient WordPiece tokenizer as used by BERT and other models. Tokenizes text for further processing by NLP/language models.
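WordPiece segmentation, as used by BERT-style models, is greedy longest-match-first: repeatedly take the longest vocabulary entry that prefixes the remaining word, marking non-initial pieces with `##`. The sketch below illustrates the idea with a toy vocabulary; it is not this package's API, and the vocabulary and class name are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

class WordPieceDemo
{
    // Greedy longest-match-first WordPiece segmentation (illustrative sketch;
    // the toy vocabulary below is NOT BERT's real vocabulary).
    public static List<string> Tokenize(string word, HashSet<string> vocab)
    {
        var tokens = new List<string>();
        int start = 0;
        while (start < word.Length)
        {
            string match = null;
            // Try the longest remaining substring first, then shrink.
            for (int end = word.Length; end > start; end--)
            {
                string piece = word.Substring(start, end - start);
                if (start > 0) piece = "##" + piece;   // continuation marker
                if (vocab.Contains(piece))
                {
                    match = piece;
                    start = end;
                    break;
                }
            }
            if (match == null) return new List<string> { "[UNK]" };
            tokens.Add(match);
        }
        return tokens;
    }

    static void Main()
    {
        var vocab = new HashSet<string> { "token", "##izer", "##s" };
        Console.WriteLine(string.Join(" ", Tokenize("tokenizers", vocab)));
        // token ##izer ##s
    }
}
```

Real implementations also pre-split on whitespace/punctuation and cap the word length before applying this loop.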
C# expression parser and evaluator, inspired by the jokenizer project.
Provides tokenizers for counting content tokens in text and embeddings
OpenAI GPT utilities, e.g. a GPT-3 tokenizer
This package contains tokenizers for the following models: · BERT Base · BERT Large · BERT German · BERT Multilingual · BERT Base Uncased · BERT Large Uncased
Unofficial BPE tokenizer implementation for Anthropic Claude
The Apache OpenNLP library is a machine learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron-based machine learning.
A shared package used by Loretta. Do not install this package manually; it will be added as a prerequisite by other packages that require it.
A GLua/Lua lexer, parser, code analysis, transformation and generation library.
Lexi: A regular expression based lexer for dotnet.
.NET wrapper of HuggingFace Tokenizers library
A .NET class library that makes it easier to parse text. The library tracks the current position within the text, ensures your code never attempts to access a character at an invalid index, and includes many methods that make parsing easier. The library makes your text-parsing code more concise and more robust. Includes support for regular expressions.
Native (Rust) wrapper of the HuggingFace Tokenizers library.
Bindings for the Rust huggingface/tokenizers library.
Unofficial BPE tokenizer implementation for OpenAI chat completion models (GPT-3.5/GPT-4)
Experimental code that might become part of Loretta.CodeAnalysis.Lua.
Tokenizer (Xamarin) for Conekta. You need a server-side library to use the token.
.NET Standard 2.1 library that produces embeddings using a C# BERT tokenizer and the ONNX all-MiniLM-L6-v2 model.
.NET wrapper for the NLTK Python library
VBF.Compilers.Scanners is a scanner builder. It contains a regular-expression-to-DFA engine and can generate high-performance scanners for Unicode source text.
Trl.PegParser contains a tokenizer and a parser. The tokenizer uses regular expressions to define tokens and exposes both matched and unmatched character ranges. The PEG parser uses parsing expression grammars with tokens produced by the tokenizer. Trl.PegParser is built on .NET Standard 2.1 for cross-platform compatibility.
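A regex-driven tokenizer that reports both matched tokens and the unmatched gaps between them can be sketched in plain .NET using named capture groups. This is a generic illustration of the technique, not Trl.PegParser's actual API; the token names and patterns are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class RegexTokenizerDemo
{
    // Combines token definitions into one alternation of named groups, then
    // walks the matches, yielding UNMATCHED ranges for the text in between.
    public static IEnumerable<(string Kind, string Text)> Tokenize(
        string input, IReadOnlyDictionary<string, string> tokenDefs)
    {
        var parts = new List<string>();
        foreach (var def in tokenDefs)
            parts.Add($"(?<{def.Key}>{def.Value})");
        var regex = new Regex(string.Join("|", parts));

        int pos = 0;
        foreach (Match m in regex.Matches(input))
        {
            if (m.Index > pos)   // gap before this match is unmatched text
                yield return ("UNMATCHED", input.Substring(pos, m.Index - pos));
            foreach (var def in tokenDefs)
                if (m.Groups[def.Key].Success)
                {
                    yield return (def.Key, m.Value);
                    break;
                }
            pos = m.Index + m.Length;
        }
        if (pos < input.Length)  // trailing unmatched text
            yield return ("UNMATCHED", input.Substring(pos));
    }

    static void Main()
    {
        var defs = new Dictionary<string, string>
        {
            ["NUMBER"] = @"\d+",
            ["IDENT"]  = @"[A-Za-z_]\w*",
        };
        foreach (var (kind, text) in Tokenize("x1 = 42 + y", defs))
            Console.WriteLine($"{kind}: '{text}'");
    }
}
```

Exposing the unmatched ranges (rather than silently skipping them) lets a caller treat them as errors or pass them through verbatim, which is the behavior the package description highlights.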
The Stringe is a wrapper for the .NET String object that tracks line, column, offset, and other metadata for substrings.
NLQuery: a natural language query parser that recognizes entities in the context of structured sources (such as a tabular dataset). Can be used for building a natural language interface to a SQL database or OLAP cube, or for implementing custom app-specific search. Usage examples: https://www.nrecosite.com/nlp_ner_net.aspx Online demo: http://nlquery.nrecosite.com/
SentencePieceTokenizer is a wrapper around the Google SentencePiece tokenizer. It is used to tokenize text for language models and other NLP tasks.
An HTML preprocessor that allows you to write HTML code using a beautiful syntax
Model tokenizer SDK; requires the modeltokenizer Docker image