CLIP-based image embedding generation using ONNX Runtime for local multimodal search and RAG scenarios
$ dotnet add package ElBruno.LocalEmbeddings.ImageEmbeddings

A .NET library for generating text embeddings locally using ONNX Runtime and Microsoft.Extensions.AI abstractions. No external API calls are required.
New to local embeddings? Watch this 5-minute video explaining the main goal of the library.
Want to build RAG applications? Read this blog post about 3 RAG approaches in .NET with local embeddings and zero cloud calls.
Interested in image embeddings? Check out the YouTube video and blog post about local image embeddings with CLIP and ONNX.
- Implements `IEmbeddingGenerator<string, Embedding<float>>`
- ElBruno.LocalEmbeddings.KernelMemory provides a native ITextEmbeddingGenerator adapter for Microsoft Kernel Memory
- ElBruno.LocalEmbeddings.VectorData adds DI helpers for Microsoft.Extensions.VectorData (VectorStore and typed collections)
- ElBruno.LocalEmbeddings.VectorData includes InMemoryVectorStore (no Semantic Kernel connector dependency required)
- IServiceCollection integration
- GenerateAsync("text") and GenerateEmbeddingAsync("text"): no array wrapping needed
- Similarity(...) matrix, and one-line FindClosestAsync(...) semantic search

Install the core package:

```bash
dotnet add package ElBruno.LocalEmbeddings
```
For Kernel Memory integration, also install the companion package:

```bash
dotnet add package ElBruno.LocalEmbeddings.KernelMemory
```

For VectorData integration, install:

```bash
dotnet add package ElBruno.LocalEmbeddings.VectorData
```
```csharp
using ElBruno.LocalEmbeddings;
using ElBruno.LocalEmbeddings.Extensions;

// Create a generator with the default model
await using var generator = await LocalEmbeddingGenerator.CreateAsync();

// Single embedding
var embedding = await generator.GenerateEmbeddingAsync("Hello, world!");
Console.WriteLine(embedding.Vector.Length); // 384

// Batch embeddings
var inputs = new[] { "first text", "second text", "third text" };
var embeddings = await generator.GenerateAsync(inputs);
Console.WriteLine(embeddings.Count); // 3

// Cosine similarity between two embeddings
var pair = await generator.GenerateAsync(["I love coding", "I enjoy programming"]);
var score = pair[0].CosineSimilarity(pair[1]);
Console.WriteLine(score);

// Semantic search over a small corpus
var corpus = new[]
{
    "Python for data science",
    "JavaScript for web apps",
    "Swift for iOS development"
};
var corpusEmbeddings = await generator.GenerateAsync(corpus);
var results = await generator.FindClosestAsync(
    "best language for websites",
    corpus,
    corpusEmbeddings,
    topK: 2,
    minScore: 0.2f);

foreach (var result in results)
    Console.WriteLine($"{result.Score:F3} - {result.Text}");
```
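If you are curious what the similarity and search helpers compute, the sketch below reproduces the idea without the library. It assumes that search ranks candidates by cosine similarity, drops anything below `minScore`, and keeps the best `topK`; the library's actual implementation may differ. The helper names `CosineSimilarity` and `FindClosest` and the 3-dimensional toy vectors are made up for illustration and are not part of the package.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 when either vector has zero length.
float CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same dimension.");

    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    var denominator = Math.Sqrt(normA) * Math.Sqrt(normB);
    return denominator == 0 ? 0f : (float)(dot / denominator);
}

// Hypothetical helper: score every candidate, filter by minScore, keep the topK best.
List<(string Text, float Score)> FindClosest(
    float[] query, string[] texts, float[][] vectors, int topK, float minScore)
{
    return texts
        .Select((text, i) => (Text: text, Score: CosineSimilarity(query, vectors[i])))
        .Where(r => r.Score >= minScore)
        .OrderByDescending(r => r.Score)
        .Take(topK)
        .ToList();
}

// Toy 3-dimensional vectors stand in for real 384-dimensional embeddings.
string[] corpus =
{
    "Python for data science",
    "JavaScript for web apps",
    "Swift for iOS development"
};
float[][] corpusVectors =
{
    new[] { 0.8f, 0.6f, 0f },
    new[] { 0f, 1f, 0f },
    new[] { 0f, 0f, 1f }
};
float[] queryVector = { 0f, 1f, 0f };

var matches = FindClosest(queryVector, corpus, corpusVectors, topK: 2, minScore: 0.2f);

// The Swift entry scores 0 and is filtered out by minScore.
foreach (var match in matches)
    Console.WriteLine($"{match.Score:F3} - {match.Text}");
```

With real embeddings the vectors come from `GenerateAsync`, and a brute-force scan like this is usually fine for small corpora; larger collections are where a vector store starts to pay off.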
For custom models and runtime behavior, use the options-based constructor:
new LocalEmbeddingGenerator(new LocalEmbeddingsOptions { ... }).
Note: The synchronous constructor remains available for backward compatibility, but performs blocking initialization when downloads are needed.
Want to go further? Read the Getting Started guide and the other docs in this repo for DI, configuration, VectorData, Kernel Memory, and full RAG examples.
Prefer a containerized dev environment? See the Dev Container section in the Contributing guide.
See the samples README for prerequisites and run instructions.
| Sample | What It Shows |
|---|---|
| HelloWorldAltModel | Minimal hello world with sentence-transformers/all-MiniLM-L12-v2 |
| ConsoleApp | All the basics: single/batch embeddings, similarity, semantic search, DI |
| RagChat | Embedding-only semantic search Q&A using shared VectorData InMemoryVectorStore (no LLM needed) |
| RagOllama | Full RAG with Ollama + phi4-mini + Kernel Memory |
| RagFoundryLocal | Full RAG with Foundry Local + phi4-mini |
| ImageRagSimple | Minimal image RAG: index images → search by text |
| ImageRagChat | Interactive image RAG chat with text and image-to-image search |
```csharp
var options = new LocalEmbeddingsOptions
{
    ModelName = "sentence-transformers/all-MiniLM-L6-v2", // HuggingFace model
    MaxSequenceLength = 512,                              // Max tokens
    CacheDirectory = null,                                // Auto-detect per platform
    EnsureModelDownloaded = true,                         // Download if missing
    NormalizeEmbeddings = false                           // L2 normalize vectors
};
```
See Configuration docs for supported models, local model paths, and cache locations.
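For context on the `NormalizeEmbeddings` option, here is a small library-free sketch of L2 normalization, assuming the option applies the standard definition (divide each component by the vector's Euclidean length). The helpers `L2Normalize` and `Dot` are hypothetical names used only for this illustration. The practical consequence is that, once vectors have unit length, a plain dot product gives the same value as cosine similarity.

```csharp
using System;

// Scale a vector to unit length. A zero vector is returned as an unchanged copy.
float[] L2Normalize(float[] vector)
{
    double sumOfSquares = 0;
    foreach (var x in vector)
        sumOfSquares += x * x;

    var norm = Math.Sqrt(sumOfSquares);
    var result = (float[])vector.Clone();
    if (norm == 0)
        return result;

    for (var i = 0; i < result.Length; i++)
        result[i] = (float)(vector[i] / norm);
    return result;
}

// Plain dot product of two equal-length vectors.
float Dot(float[] a, float[] b)
{
    double sum = 0;
    for (var i = 0; i < a.Length; i++)
        sum += a[i] * b[i];
    return (float)sum;
}

var a = L2Normalize(new[] { 3f, 4f }); // approximately (0.6, 0.8)
var b = L2Normalize(new[] { 4f, 3f }); // approximately (0.8, 0.6)

// For unit vectors the dot product equals the cosine similarity: 24 / 25, about 0.96.
var dot = Dot(a, b);
Console.WriteLine(dot);
```

Whether to normalize at generation time or at comparison time is a design choice: normalizing once up front makes each later comparison cheaper, at the cost of discarding the original vector magnitudes.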
Estimated download sizes below are approximate and can vary by ONNX variant (fp32/int8) and tokenizer assets.
- sentence-transformers/all-MiniLM-L6-v2 (default, ~90–100 MB)
- sentence-transformers/all-MiniLM-L12-v2 (~130–140 MB)
- sentence-transformers/paraphrase-MiniLM-L6-v2 (~90–100 MB)
- BAAI/bge-large-en-v1.5 (large, ~1.3 GB)
- intfloat/e5-large-v2 (large, ~1.3 GB)

| Topic | Description |
|---|---|
| Getting Started | Step-by-step guide from hello world to RAG |
| API Reference | Classes, methods, and extension methods |
| Configuration | Options, supported models, cache locations |
| Alternative Models | Non-default free models, local download workflow, and license notes |
| Dependency Injection | All DI overloads and IConfiguration binding |
| Kernel Memory Integration | Using local embeddings with Microsoft Kernel Memory |
| VectorData Integration | Using local embeddings with Microsoft.Extensions.VectorData abstractions |
| Contributing | Build from source, repo structure, guidelines |
| Roadmap | Planned and completed features/samples with priorities |
| Publishing | NuGet publishing with GitHub Actions + Trusted Publishing |
| Changelog | Versioned summary of notable changes |
Have an idea for a new feature or sample? Please open an issue and share your suggestion.
```bash
git clone https://github.com/elbruno/elbruno.localembeddings.git
cd elbruno.localembeddings
dotnet build
dotnet test
```
Hi! I'm ElBruno 🧡, a passionate developer and content creator exploring AI, .NET, and modern development practices.
Made with ❤️ by ElBruno
This project is licensed under the MIT License — see the LICENSE file for details.