A .NET library for local LLM text generation and chat using ONNX Runtime GenAI and GGUF (llama-server). Supports streaming, chat templates, and automatic hardware detection with CUDA/DirectML GPU acceleration.

```shell
dotnet add package LMSupply.Generator
```
```csharp
using LMSupply.Generator;

// Create a generator using the builder pattern
var generator = await TextGeneratorBuilder.Create()
    .WithDefaultModel()
    .BuildAsync();

// Generate text
string response = await generator.GenerateCompleteAsync("What is AI?");
Console.WriteLine(response);

await generator.DisposeAsync();
```
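Since streaming is listed as a feature, here is a minimal sketch of token-by-token output. The method name `GenerateStreamingAsync` and its `IAsyncEnumerable<string>` return type are assumptions for illustration; check the package's actual API surface.

```csharp
// Hypothetical streaming sketch: the method name and return type
// (IAsyncEnumerable<string>) are assumptions, not confirmed API.
await foreach (var token in generator.GenerateStreamingAsync("What is AI?"))
{
    Console.Write(token); // print each token as it arrives
}
Console.WriteLine();
```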
```csharp
var messages = new[]
{
    new ChatMessage(ChatRole.System, "You are a helpful assistant."),
    new ChatMessage(ChatRole.User, "Explain quantum computing.")
};

string response = await generator.GenerateChatCompleteAsync(messages);
```
| Model | Parameters | License | Description |
|---|---|---|---|
| Phi-4 Mini | 3.8B | MIT | Default, best balance |
| Phi-3.5 Mini | 3.8B | MIT | Fast, reliable |
| Phi-4 | 14B | MIT | Highest quality |
| Llama 3.2 1B | 1B | Conditional | Ultra-lightweight |
| Llama 3.2 3B | 3B | Conditional | Balanced |
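To pick a model from the table rather than the default, the builder presumably exposes a model-selection method. The `WithModel` name and the identifier string below are assumptions, shown only as a sketch:

```csharp
// Hypothetical: "WithModel" and the model identifier are assumptions,
// not confirmed API; consult the library's builder documentation.
var generator = await TextGeneratorBuilder.Create()
    .WithModel("phi-3.5-mini")
    .BuildAsync();
```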
```shell
# NVIDIA GPU
dotnet add package Microsoft.ML.OnnxRuntime.Gpu

# Windows (AMD/Intel/NVIDIA)
dotnet add package Microsoft.ML.OnnxRuntime.DirectML
```