Found 61 packages
Text tokenization based on Unicode grapheme clustering, and the XID_Start and XID_Continue binary properties.
Extract tokens from a string of text for use with NLP tools or statistical analysis.
Microsoft.ML.Tokenizers contains the implementation of the tokenization used in the NLP transforms.
Turkish tokenization and sentence boundary detection.
Package Description
YAML grammar based on Lexepars parser lib.
Package provides four functionalities: 1. Hashing using BCrypt. 2. Asymmetric key encryption and decryption using RSA. 3. Symmetric key encryption and decryption using AES. 4. JWT token generation.
JSON grammar based on Lexepars parser lib.
SDK for server-side integration with CyberSource Flex API
Tokenize strings into custom tokens using ordered regex operations.
Tokenization of raw text is a standard pre-processing step for many NLP tasks. For English, tokenization usually involves punctuation splitting and separation of some affixes like possessives. Other languages require more extensive token pre-processing, which is usually called segmentation.
Concise monadic parser combinator library with separate lexer/parser phases and support for large inputs.
A BERT tokenization extension for EasyReasy KnowledgeBase with FastBertTokenizer integration
TextMatch is a library for searching inside texts using Lucene query expressions. Supports all types of Lucene query expressions - boolean, wildcard, fuzzy. Options are available for tweaking tokenization, such as case-sensitivity and word stemming.
.NET (C#) binding for the Babel Street Analytics API
Autypo is a blazing-fast .NET library for fuzzy autocomplete, typo correction, and short-string search. Designed for structured data and autocomplete-first experiences, it supports per-token configuration, background indexing, and full ASP.NET Core integration out of the box.
Autypo is a blazing-fast .NET library for fuzzy autocomplete, typo correction, and short-string search. Designed for structured data and autocomplete-first experiences, it supports per-token configuration and background indexing. For full integration with ASP.NET Core, use Autypo.AspNetCore.
Text processing infrastructure for LMSupply - tokenization, vocabulary management, and encoding utilities