Legend:
Library
Module
Module type
Parameter
Class
Class type
Universal Levenshtein Automata
The paper by Touzet details the Universal Levenshtein Automata.
Some nice computational properties of the (not-deterministic) automata include:
There are no epsilon transitions.
The automata are computable on demand, there is no need to store states and transitions in a data structure.
The states of the automata carry error counts.
There is a simple subsumption relation that prunes sets of states, so transitions produce small sets of states.
These allow for efficient implementation together with interesting uses.
We not only know if two strings are within a certain edit distance, but we also know what the edit distance is if they are within the edit distance limit.
If several strings are compared, we can rank them by their edit distance.
We can lazily feed characters and string fragments to the atomata and get the current error count.
This module offers a functor to build matchers for different representations of strings and characters. For a prebuilt matcher for OCaml strings and characters, see Strings.
String Abstraction
We abstract over strings and characters so that we do not rely on any specific encoding. We only need the following: