2 packages tagged with “DAWG”
DAWG (Directed Acyclic Word Graph) is a data structure for storing and searching large word lists with a small memory footprint and fast lookups. DawgSharp is an open-source C# implementation featuring a linear-time graph reduction algorithm and out-of-the-box persistence support. The Dawg class is nearly as fast as a HashSet for lookups and far more memory-efficient (factors of 30x to 40x are not uncommon). In a benchmark application it held two million words while consuming only 2 MB of RAM. That is only one byte per word, and it takes even less space on disk. The Dawg class can be thought of as a read-only Dictionary<string, Value>: it has a ["string"] indexer and implements IEnumerable<KeyValuePair<string, Value>>. One other very useful feature of Dawg (not found in Dictionary) is the ability to quickly find all words that start with a particular prefix: dawg.MatchPrefix("star") might yield "star", "starch", "start", "starting", etc. This package is provided under the terms of the GNU GPL v3. Source code and documentation are available on GitHub: https://github.com/bzaar/DawgSharp. Commercial licenses are also available at http://morpher.co.uk
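To make the "read-only dictionary plus prefix matching" idea concrete, here is a minimal sketch. It is a plain trie written for illustration, not DawgSharp's Dawg class: the names SketchTrie, TryGetValue, and Collect are hypothetical, and the sketch omits the suffix merging that gives a real DAWG its memory savings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only: a plain trie, NOT DawgSharp's Dawg class.
// A real DAWG also merges identical suffix subtrees, which this sketch omits.
public sealed class SketchTrie<TValue>
{
    private sealed class Node
    {
        // Sorted children so that enumeration comes out in ordinal character order.
        public readonly SortedDictionary<char, Node> Children = new SortedDictionary<char, Node>();
        public bool IsWord;
        public TValue Value;
    }

    private readonly Node _root = new Node();

    // All keys are supplied up front; there is no public method to add more later.
    public SketchTrie(IEnumerable<KeyValuePair<string, TValue>> items)
    {
        foreach (var item in items)
        {
            Node node = _root;
            foreach (char c in item.Key)
            {
                if (!node.Children.TryGetValue(c, out Node next))
                {
                    next = new Node();
                    node.Children.Add(c, next);
                }
                node = next;
            }
            node.IsWord = true;
            node.Value = item.Value;
        }
    }

    // Walk the key one character at a time; returns null if the path does not exist.
    private Node Find(string key)
    {
        Node node = _root;
        foreach (char c in key)
        {
            if (!node.Children.TryGetValue(c, out node)) return null;
        }
        return node;
    }

    // Exact lookup: the path must exist and must end on a stored word.
    public bool TryGetValue(string key, out TValue value)
    {
        Node node = Find(key);
        bool found = node != null && node.IsWord;
        value = found ? node.Value : default(TValue);
        return found;
    }

    // Prefix lookup: find the node for the prefix, then list every word below it.
    public IEnumerable<KeyValuePair<string, TValue>> MatchPrefix(string prefix)
    {
        Node start = Find(prefix);
        return start == null
            ? Enumerable.Empty<KeyValuePair<string, TValue>>()
            : Collect(start, prefix);
    }

    // Depth-first traversal that rebuilds each word from the characters on its path.
    private static IEnumerable<KeyValuePair<string, TValue>> Collect(Node node, string word)
    {
        if (node.IsWord) yield return new KeyValuePair<string, TValue>(word, node.Value);
        foreach (var child in node.Children)
            foreach (var hit in Collect(child.Value, word + child.Key))
                yield return hit;
    }
}
```

Built from the words "star", "starch", "start", "starting", and "stop", calling MatchPrefix("star") on this sketch yields the first four words in that order and skips "stop". A lookup costs one dictionary step per character of the key, regardless of how many words are stored.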
A .NET library providing a collection of high-performance, memory-efficient data structures, primarily focused on various trie implementations and advanced search capabilities.

Key Features:
- Mutable and Immutable Trie Dictionaries: Includes a mutable Double-Array trie and read-only, highly compressed DAWG and LOUDS tries.
- Advanced Search: Supports exact match, prefix, common prefix, longest prefix, and wildcard searches (LTR/RTL).
- Flexible Record Storage: Each key maps to a list of records, with support for both persistent (serializable) and transient (in-memory) storage per key.
- A rich set of underlying data stores, including HashMap, AVL Tree, AA Tree, Treap, and more.

Ideal for applications requiring efficient text processing, auto-completion, full-text search engines, morphological analysis, or management of large, static dictionaries. For more details, please visit the project repository on GitHub.
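The search modes listed above are easy to confuse, so here is a small sketch of two of them as the terms are commonly used for tries: a common prefix search returns every stored key that is a prefix of the input text, and a longest prefix search returns the longest such key. This is an assumption about the usual meaning, not this library's API. The class and method names (PrefixSearchSketch, CommonPrefixes, LongestPrefix) are hypothetical, and the library may define or expose these searches differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch, not this library's API: a minimal trie used to show
// what "common prefix" and "longest prefix" searches usually return.
public sealed class PrefixSearchSketch
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsKey;
    }

    private readonly Node _root = new Node();

    public PrefixSearchSketch(IEnumerable<string> keys)
    {
        foreach (string key in keys)
        {
            Node node = _root;
            foreach (char c in key)
            {
                if (!node.Children.TryGetValue(c, out Node next))
                {
                    next = new Node();
                    node.Children.Add(c, next);
                }
                node = next;
            }
            node.IsKey = true;
        }
    }

    // Common prefix search: every stored key that is a prefix of `text`,
    // shortest first, found in a single walk down the trie.
    // For simplicity this sketch never reports the empty string as a key.
    public List<string> CommonPrefixes(string text)
    {
        var hits = new List<string>();
        Node node = _root;
        for (int i = 0; i < text.Length; i++)
        {
            if (!node.Children.TryGetValue(text[i], out node)) break;
            if (node.IsKey) hits.Add(text.Substring(0, i + 1));
        }
        return hits;
    }

    // Longest prefix search: the last (longest) common prefix, or null if there is none.
    public string LongestPrefix(string text)
    {
        return CommonPrefixes(text).LastOrDefault();
    }
}
```

With the keys "a", "ab", "abc", and "b", CommonPrefixes("abcd") returns "a", "ab", "abc", and LongestPrefix("abcd") returns "abc". Both searches stop as soon as the input leaves the trie, so their cost is bounded by the length of the input text, not by the number of stored keys.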