Found 21 packages
The fastest tokenizer for GPT-3.5 and GPT-4 inspired by Tiktoken.
The fastest tokenizer for GPT-3.5 and GPT-4 inspired by Tiktoken.
The fastest tokenizer for GPT-3.5 and GPT-4 inspired by Tiktoken.
The fastest tokenizer for GPT-3.5 and GPT-4 inspired by Tiktoken.
The fastest tokenizer for GPT-3.5 and GPT-4 inspired by Tiktoken.
Provides tokenizers for counting content tokens in text and embeddings.
Token calculation for OpenAI models.
Unofficial implementation of the Anthropic Claude BPE tokenizer.
The Microsoft.ML.Tokenizers.Data.O200kBase package includes the Tiktoken tokenizer data file o200k_base.tiktoken, which is used by models such as gpt-4o.
The fastest tokenizer for GPT-3.5 and GPT-4 inspired by Tiktoken.
The fastest tokenizer for GPT-3.5 and GPT-4 inspired by Tiktoken.
The Microsoft.ML.Tokenizers.Data.Cl100kBase package includes the Tiktoken tokenizer data file cl100k_base.tiktoken, which is used by models such as GPT-4.
The fastest tokenizer for GPT-3.5 and GPT-4 inspired by Tiktoken. This fork contains a printable version of Explore().
The fastest tokenizer for gpt2, multilingual and whisper inspired by Tiktoken.
The Microsoft.ML.Tokenizers.Data.P50kBase package includes the Tiktoken tokenizer data file p50k_base.tiktoken, which is used by models such as text-davinci-002.
Unofficial BPE tokenizer implementation for OpenAI Chat Completion models (GPT-3.5/GPT-4).
Semchunk.Net tokenizer wrapper for the Tiktoken package.
Package Description
The Microsoft.ML.Tokenizers.Data.Gpt2 package includes the Tiktoken tokenizer data file gpt2.tiktoken, which is used by models such as GPT-2.
The Microsoft.ML.Tokenizers.Data.R50kBase package includes the Tiktoken tokenizer data file r50k_base.tiktoken, which is used by models such as text-davinci-001.