.NET Standard 2.1 library that produces embeddings using a C# BERT tokenizer and the ONNX All-MiniLM-L6-v2 model.
$ dotnet add package AllMiniLmL6V2Sharp
C# implementation of Sentence Transformers All-MiniLM-L6-v2.
Use it as a .NET Standard 2.1 library.
Includes a BERT tokenizer and an embedder that runs the ONNX model.
The NuGet package does not include the ONNX model or vocab.txt. Both can be found on Hugging Face (see the tested models below).
By default, the embedder looks for model.onnx and vocab.txt in the .\model folder.
You can also supply a custom ONNX model or a custom vocab file.
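Because the default files are resolved from the .\model folder, it can help to confirm they are present before constructing the embedder. The following is a minimal sketch and not part of the library: ModelFiles and FindMissing are hypothetical names, and the sketch assumes the folder sits under a base directory you pass in (the library may resolve .\model against the current working directory instead).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of AllMiniLmL6V2Sharp.
public static class ModelFiles
{
    // Default file names the embedder is described as looking for.
    private static readonly string[] Required = { "model.onnx", "vocab.txt" };

    // Returns the full paths of the required files that are missing
    // from the "model" folder under baseDir (assumed layout).
    public static List<string> FindMissing(string baseDir)
    {
        var missing = new List<string>();
        foreach (var name in Required)
        {
            var path = Path.Combine(baseDir, "model", name);
            if (!File.Exists(path))
            {
                missing.Add(path);
            }
        }
        return missing;
    }
}
```

Calling ModelFiles.FindMissing(AppContext.BaseDirectory) returns an empty list when both files are in place. If it does not, download the files from Hugging Face or pass explicit paths as in the custom model and custom vocab examples.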
// Single sentence, using the default model.onnx and vocab.txt in .\model
var sentence = "This is an example sentence";
using var embedder = new AllMiniLmL6V2Embedder();
var embedding = embedder.GenerateEmbedding(sentence);
// Multiple sentences in one call
string[] sentences = ["This is an example sentence", "Here is another"];
using var embedder = new AllMiniLmL6V2Embedder();
var embeddings = embedder.GenerateEmbeddings(sentences);
// Custom ONNX model path
var sentence = "This is an example sentence";
using var embedder = new AllMiniLmL6V2Embedder(modelPath: "path/to/model.onnx");
var embedding = embedder.GenerateEmbedding(sentence);
// Custom vocab file with the built-in BertTokenizer
var sentence = "This is an example sentence";
BertTokenizer tokenizer = new("path/to/vocab.txt");
using var embedder = new AllMiniLmL6V2Embedder(tokenizer: tokenizer);
var embedding = embedder.GenerateEmbedding(sentence);
// Custom tokenizer: CustomTokenizer stands for your own ITokenizer implementation
var sentence = "This is an example sentence";
ITokenizer tokenizer = new CustomTokenizer();
using var embedder = new AllMiniLmL6V2Embedder(tokenizer: tokenizer);
var embedding = embedder.GenerateEmbedding(sentence);
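Sentence embeddings are commonly compared with cosine similarity. The following is a minimal sketch, not part of the library: EmbeddingMath and CosineSimilarity are hypothetical names. The return type of GenerateEmbedding is not stated here, so the sketch assumes each embedding has first been copied into a float array.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of AllMiniLmL6V2Sharp.
public static class EmbeddingMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|), a value in [-1, 1].
    public static double CosineSimilarity(IReadOnlyList<float> a, IReadOnlyList<float> b)
    {
        if (a.Count != b.Count)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Count; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        if (normA == 0 || normB == 0)
        {
            throw new ArgumentException("Cosine similarity is undefined for a zero vector.");
        }

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

Values near 1 indicate sentences the model treats as similar, and values near 0 indicate unrelated ones. Accumulating in double avoids some of the rounding error of summing many float products.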