TokenKit — a professional .NET 8.0 library and CLI for tokenization, prompt validation, cost estimation, and model registry management across multiple LLM providers (OpenAI, Anthropic, Gemini, etc.). Supports GPT-4, GPT-4o, GPT-5, and other LLMs via Microsoft.ML.Tokenizers and SharpToken integration.
| Category | Description |
|---|---|
| 🔢 Tokenization | Analyze text or files and count tokens using multiple encoder engines (simple, SharpToken, ML.Tokenizers) |
| 💰 Cost Estimation | Automatically calculate estimated API cost based on model metadata |
| ✅ Prompt Validation | Validate prompt length against model context limits |
| 🧩 Model Registry | Manage model metadata (maxTokens, pricing, encodings, providers) via JSON registry |
| ⚙️ CLI & SDK | Use TokenKit as a .NET library or a global CLI tool |
| 🧮 Multi-Encoder Support | Dynamically select tokenization engines via --engine flag |
| 📦 Self-contained Data | Local registry stored in Registry/models.data.json, auto-updatable |
| 🔍 Live Model Scraper | Optional OpenAI API key support to fetch real-time model data |
| 📊 Structured Logging | All CLI commands logged to tokenkit.log with rotation (1MB max) |
| 🤫 Quiet & JSON Modes | Machine-readable (--json) and silent (--quiet) output modes for automation |
| 🎨 CLI Polish | Colorized output, ASCII banner, and improved user experience |
dotnet add package TokenKit        # use as a .NET library
dotnet tool install -g TokenKit    # install the global CLI tool
tokenkit analyze "Hello from TokenKit!" --model gpt-4o
tokenkit analyze prompt.txt --model gpt-4o
echo "This is piped text input" | tokenkit analyze --model gpt-4o
Example Output:
{
"Model": "gpt-4o",
"Provider": "OpenAI",
"TokenCount": 4,
"EstimatedCost": 0.00002,
"Valid": true
}
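In automation, this output can be deserialized directly. Note that the estimated cost is simply registry pricing applied to the count: 4 tokens × $0.005 per 1K input tokens = $0.00002. A minimal sketch, where `AnalyzeResult` is an illustrative hand-written mirror of the output shape, not a type exported by TokenKit:

```csharp
using System.Text.Json;

// The JSON emitted by `tokenkit analyze`, inlined here so the
// example is self-contained.
var json = """
{ "Model": "gpt-4o", "Provider": "OpenAI", "TokenCount": 4,
  "EstimatedCost": 0.00002, "Valid": true }
""";

var result = JsonSerializer.Deserialize<AnalyzeResult>(json)!;
Console.WriteLine($"{result.Model}: {result.TokenCount} tokens, ${result.EstimatedCost}");

// Illustrative mirror of the CLI output shape; not a TokenKit type.
public sealed record AnalyzeResult(
    string Model, string Provider, int TokenCount,
    decimal EstimatedCost, bool Valid);
```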
tokenkit validate "A very long prompt to validate" --model gpt-4o
{
"IsValid": true,
"Message": "OK"
}
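In a CI pipeline, `validate` can be run as a subprocess and its JSON output parsed. A minimal sketch, assuming the global tool is installed and on `PATH`:

```csharp
using System.Diagnostics;
using System.Text.Json;

var psi = new ProcessStartInfo("tokenkit",
    "validate \"A very long prompt to validate\" --model gpt-4o")
{
    RedirectStandardOutput = true
};

using var proc = Process.Start(psi)!;
string output = proc.StandardOutput.ReadToEnd();
proc.WaitForExit();

// Parse the {"IsValid": ..., "Message": ...} payload shown above.
using var doc = JsonDocument.Parse(output);
bool ok = doc.RootElement.GetProperty("IsValid").GetBoolean();
```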
tokenkit models list
tokenkit models list --provider openai
tokenkit models list --json
tokenkit update-models
tokenkit update-models --openai-key sk-xxxx
cat newmodels.json | tokenkit update-models
Example Input:
[
{
"Id": "gpt-4o-mini",
"Provider": "OpenAI",
"MaxTokens": 64000,
"InputPricePer1K": 0.002,
"OutputPricePer1K": 0.01,
"Encoding": "cl100k_base"
}
]
tokenkit scrape-models --openai-key sk-xxxx
If no key is provided, TokenKit uses the local offline model registry.
Example Output:
🔍 Fetching latest OpenAI model data...
✅ Retrieved 3 models:
- OpenAI: gpt-4o (128000 tokens)
- OpenAI: gpt-4o-mini (64000 tokens)
- OpenAI: gpt-3.5-turbo (4096 tokens)
tokenkit analyze "Hello" --model gpt-4o --json
Outputs pure JSON:
{
"Model": "gpt-4o",
"Provider": "OpenAI",
"TokenCount": 7,
"EstimatedCost": 0.000105,
"Engine": "simple",
"Valid": true
}
tokenkit analyze "Silent test" --model gpt-4o --quiet
No console output. Log entry saved to tokenkit.log.
using TokenKit.Registry;
using TokenKit.Services;

// Look up model metadata in the local registry.
var model = ModelRegistry.Get("gpt-4o");

// Tokenize the input text.
var tokenizer = new TokenizerService();
var result = tokenizer.Analyze("Hello from TokenKit!", model!.Id);

// Estimate API cost from the model's registry pricing.
var cost = CostEstimator.Estimate(model, result.TokenCount);
Console.WriteLine($"Tokens: {result.TokenCount}, Cost: ${cost}");
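The same registry metadata supports prompt validation in code. A minimal sketch, assuming the object returned by `ModelRegistry.Get` exposes the `MaxTokens` value from its registry entry:

```csharp
using TokenKit.Registry;
using TokenKit.Services;

var model = ModelRegistry.Get("gpt-4o");
var tokenizer = new TokenizerService();
var result = tokenizer.Analyze(File.ReadAllText("prompt.txt"), model!.Id);

// A prompt fits when its token count is within the model's context window.
bool isValid = result.TokenCount <= model.MaxTokens;
Console.WriteLine(isValid ? "OK" : $"Too long: {result.TokenCount}/{model.MaxTokens}");
```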
TokenKit stores all model metadata in:
Registry/models.data.json
Each entry includes:
{
"Id": "gpt-4o",
"Provider": "OpenAI",
"MaxTokens": 128000,
"InputPricePer1K": 0.005,
"OutputPricePer1K": 0.015,
"Encoding": "cl100k_base"
}
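An entry maps naturally onto a small C# record for deserialization with System.Text.Json (an illustrative shape; the library's internal type may differ):

```csharp
// Illustrative mirror of a Registry/models.data.json entry.
public sealed record ModelEntry(
    string Id,
    string Provider,
    int MaxTokens,
    decimal InputPricePer1K,
    decimal OutputPricePer1K,
    string Encoding);
```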
TokenKit maintains 100% test coverage, measured with xUnit and reported through Codecov.
Run tests locally:
dotnet test --collect:"XPlat Code Coverage"
| Feature | Description |
|---|---|
| 🌐 Extended Provider Support | Add Gemini, Claude, and Mistral integrations |
| 💾 Persistent Config Profiles | Store model defaults and pricing overrides per project |
| 🧮 Batch Analysis | Analyze multiple files or prompts in a single command |
| 📊 Report Generation | Export CSV/JSON summaries of token usage and estimated cost |
| 🧠 LLM-Aware Cost Planner | Simulate conversation cost across multi-turn dialogues |
| 🧩 IDE Integrations | VS Code and JetBrains plugins for inline token analysis |
| ⚙️ Custom Encoders | Support community-built encoders and language models |
Licensed under the MIT License.
© 2025 Andrew Clements — Flow Labs / TokenKit