Found 17 packages
Text Similarity is a simple tool for checking the similarity between two texts. It returns the similarity score.
DBreeze database key features:
- Embedded .NET family assembly, platform independent and without references to other libraries.
- Multi-threaded and ACID compliant, with deadlock resolution/elimination, parallel reads, and synchronized writes/reads.
- No fixed schema for table names (construction and access on the fly).
- Tables can reside in mixed locations: different folders, hard drives, memory, or in-memory with disk persistence.
- Liana-Trie indexing technology. Database indexes (keys) never need to be defragmented. The speed of insert/update/remove operations does not change over time.
- Ability to access a key/value pair of a table by physical link, which can save time when joining the necessary data structures.
- No limit on database size (except the "long" size for each table and physical resource constraints).
- Low memory and disk space consumption, including during random inserts and updates. Updates reuse the same physical space when possible or configured.
- High performance of CRUD operations: up to 500,000 key/value pair inserts or 260K updates per second per core into a sorted table on the hard drive of a standard PC (benchmark from 2012).
- High speed of random-key batch inserts and updates (update mode is selectable).
- Range selects / traversal (Forward, Backward, From/To, Skip, StartsWith, etc.). Remove keys, change keys.
- Keys and values are, at the low level, always byte arrays.
- Max. key size is 65KB; max. value size is 2GB. A value can be represented as a set of columns that store data types of fixed or dynamic length. Every dynamic data block (BLOB) can be up to 2GB.
- Rich set of conversion functions between byte[] and other data types.
- Nested / fractal tables, which can reside inside master table values.
- Incremental backup/restore option.
- Integrated text-search subsystem (full-text/partial).
- Integrated object database layer.
- Fast multi-parameter search subsystem with powerful query possibilities.
- Integrated vector database and vector similarity search engine (embedding vector store/search).
- Integrated binary and JSON serializer Biser.NET.
- High availability, redundancy, and fault tolerance via Raft.NET.
- DBreeze is a foundation for complex data storage solutions (graph/neuro, object, document, text search, etc. data layers).
Please study the documentation to understand all the capabilities of DBreeze. hhblaze@gmail.com
Library that wraps the Dandelion API: EntityExtraction, TextSimilarity, TextClassification, LanguageDetection, SentimentAnalysis, and Wikisearch.
SimMetrics is a similarity metric library covering edit distances (Levenshtein, Gotoh, Jaro, etc.) and other metrics (e.g. Soundex, Chapman). Work provided by Sheffield University (UK), funded by AKT, an IRC sponsored by EPSRC, grant number GR/N15764/01. Package includes TextFunctions.
NW.NGramTextClassification is a library to perform Text Classification on the string of text you provide. Text Classification is a machine learning technique that calculates the similarity between the string of text you need to categorize and a collection of already categorized strings you provide to the library.
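The description above does not say how the package computes similarity, so the following is only a minimal Python sketch of the general idea: compare character n-grams of the input against each already categorized string and return the label of the closest one. The Dice coefficient, the trigram size, and the names `char_ngrams`, `dice_similarity`, and `classify` are assumptions made for illustration, not the package's API.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return a multiset (Counter) of character n-grams of the lowercased text."""
    s = text.lower()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def dice_similarity(a, b, n=3):
    """Dice coefficient over character n-gram multisets, in the range [0, 1]."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        return 0.0  # both strings are shorter than n characters
    overlap = sum((grams_a & grams_b).values())  # multiset intersection
    return 2.0 * overlap / total


def classify(text, labeled_examples, n=3):
    """Return (label, score) of the labeled example most similar to text.

    labeled_examples is a list of (label, example_text) pairs.
    """
    best_label, best_score = None, -1.0
    for label, example in labeled_examples:
        score = dice_similarity(text, example, n)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score
```

A call such as `classify("the cat sat", [("animals", "the cat sat on the mat"), ("finance", "stock prices fell today")])` returns the `"animals"` label, because only that example shares trigrams with the input.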
.NET (C#) binding for the Babel Street Analytics API
Activity that searches for a specified text within a list of strings and identifies the best match along with its confidence level. It is useful when you need to find the most accurate match for a text input from a collection of strings, for example in name matching, data validation, and text recognition tasks. The confidence percentage lets you decide whether a match is accurate enough to accept.
Uses N-Grams to categorise text and other data. Useful for processing human-written content.
A dotnet tool to find similarities between two sets of titles (Excel sheets)
A .NET port of the Apache Commons Text FuzzyScore algorithm
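For readers unfamiliar with the algorithm, here is a Python sketch of the scoring rule as the Apache Commons Text documentation describes it: one point for every query character found in order in the term, plus two bonus points when a match directly follows the previous match. The function name `fuzzy_score` is hypothetical, and the .NET port's actual API is not shown in this listing and may differ.

```python
def fuzzy_score(term, query):
    """Score how well query matches term; a higher score means a better match.

    Each query character found in order in term scores 1 point;
    a match immediately after the previous match scores 2 extra points.
    Comparison is case-insensitive.
    """
    term_l = term.lower()
    query_l = query.lower()
    score = 0
    term_index = 0      # scanning never moves backwards in the term
    prev_match = None   # index in term of the previous matched character
    for query_char in query_l:
        found = False
        while term_index < len(term_l) and not found:
            if query_char == term_l[term_index]:
                score += 1
                if prev_match is not None and prev_match + 1 == term_index:
                    score += 2  # consecutive-match bonus
                prev_match = term_index
                found = True
            term_index += 1
    return score
```

With this sketch, `fuzzy_score("Workshop", "wo")` gives 4 (two matches plus one consecutive bonus), while `fuzzy_score("Workshop", "ws")` gives 2.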
This library contains functions that can be useful to anyone dealing with Arabic text. The functions available so far:
- AreEqual: compares two strings, with options to normalize the strings or remove diacritics before the comparison. You can also apply text cleaning to the source, the targets, or both before the comparison.
- FindSimilarity: finds the degree of similarity between a string and a set of strings. Helpful for spell checking.
Winklestein is a hybrid string similarity algorithm that combines Levenshtein Distance and Jaro–Winkler Similarity to provide accurate, robust, and tunable similarity scoring for short and medium-length text inputs.
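The listing does not say how Winklestein combines the two measures, so the Python sketch below assumes the simplest option: a weighted average of normalized Levenshtein similarity and Jaro-Winkler similarity. The weight parameter `w` and all function names are hypothetical; the package's actual formula and tuning parameters may differ.

```python
def levenshtein(a, b):
    """Classic edit distance (insert, delete, substitute), one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def jaro(a, b):
    """Jaro similarity in [0, 1]."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ca:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len(a)):
        if a_hit[i]:
            while not b_hit[k]:
                k += 1
            out_of_order += a[i] != b[k]
            k += 1
    t = out_of_order / 2
    return (matches / len(a) + matches / len(b) + (matches - t) / matches) / 3


def jaro_winkler(a, b, p=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(a, b)
    prefix = 0
    for ca, cb in zip(a[:4], b[:4]):
        if ca != cb:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def hybrid_similarity(a, b, w=0.5):
    """Assumed blend: w * Levenshtein similarity + (1 - w) * Jaro-Winkler."""
    longest = max(len(a), len(b))
    lev_sim = 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest
    return w * lev_sim + (1 - w) * jaro_winkler(a, b)
```

In a blend like this, raising `w` makes the score stricter about the total number of edits, while lowering it favors strings that share a prefix and differ mainly by transposed characters.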
Given an image and a catalog of text tags, Imagnr provides a library for cognitive search based on the text appearing in the image. It uses the Levenshtein distance algorithm over condensed OCR text to calculate the similarity of tags.
Instantly search and replace text you select in any app — no copy-pasting into a separate editor. Drag and drop Word, PDF, or plain-text files from File Explorer to search inside them without even opening them. Four powerful search modes let you find exactly what you need: • Literal & Regex — pinpoint exact words or complex patterns. • Spelling Similarity — catch typos and near-matches so nothing slips through the cracks. • Meaning Similarity — powered by a free and local AI model, finds semantically relevant results even when the exact words differ. Highlight, navigate, replace one-by-one or all at once. Fast, private, and fully offline.
Wrapper for the text similarity metric library SimMetricsCore and several nodes covering Fuzzy Search tasks.
Library (.NET Standard 1.0) to support text and person name matching. Currently contains Levenshtein and Damerau-Levenshtein (optimal string alignment version) edit distance and normalized similarity functions optimized for speed and reduced memory consumption. There are also versions of the functions that accept a maximum desired distance or minimum desired similarity, which can result in significantly faster speeds, particularly for long strings. This is one of the faster C# implementations available (possibly the fastest for non-trivial strings). See the associated GitHub project for more detail. MIT License.
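To illustrate why a maximum desired distance can speed things up, here is a minimal Python sketch of a Levenshtein computation that stops early once the bound can no longer be met. It is not the package's implementation; the function name `bounded_levenshtein` and the choice to return `None` on overflow are assumptions for illustration.

```python
def bounded_levenshtein(a, b, max_distance):
    """Return the edit distance between a and b if it is <= max_distance, else None."""
    # The length difference is a lower bound on the distance.
    if abs(len(a) - len(b)) > max_distance:
        return None
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution or match
        # Row minima never decrease, so once every cell exceeds the
        # bound the final distance must exceed it as well.
        if min(cur) > max_distance:
            return None
        prev = cur
    return prev[-1] if prev[-1] <= max_distance else None
```

The early exits skip the remaining rows of the table, which is where the savings on long, dissimilar strings come from. Optimized implementations typically also restrict each row to a narrow band around the diagonal, which this sketch omits.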
A .NET Framework 4.6.1 compatible implementation of Sentence Transformers in C#. Produces embeddings using a C# BERT tokenizer and the ONNX all-MiniLM-L6-v2 model. Includes ONNX Runtime initialization helpers for legacy .NET Framework environments. Suited to semantic search, text similarity, and embedding generation in .NET applications.