package radis
Install
Dune Dependency
Authors
Maintainers
Sources
sha256=d16b9fa66a922960fd25d8c80510ddb6264b11baab9ff966f41542111f80b93d
md5=1a8b674f1159f67acb8d03fa46c5d5e7
Description
Radis is a little library to provide an implementation of a Radix tree (specialized with a scalar key - like a string, bigarray or array). This project is a part of ocaml-git to have a fast access to an immutable store. In this way, remove operation should be the slowest operation and this data-structure is focused on lookup operation when the key is specifically a hash (see digestif).
The idea behind this library is to provide a fast access to an immutable store
(like git). So, in my mind, if we put a new object in this store, it can not be
deleted. It's why remove operation should be slow - of course, in a git store,
it's possible to delete an useless object (see git gc
), however this
computation does not appear in this way.
Published: 14 May 2018
README
Radis
Radis is a little library to provide an implementation of a Radix tree (specialized with a scalar key - like a string, bigarray or array). This project is a part of ocaml-git to have a fast access to an immutable store. In this way, remove operation should be the slowest operation and this data-structure is focused on lookup operation when the key is specifically a hash (see digestif).
The idea behind this library is to provide a fast access to an immutable store (like git). So, in my mind, if we put a new object in this store, it can not be deleted. It's why remove operation should be slow - of course, in a git store, it's possible to delete an useless object (see git gc
), however this computation does not appear in this way.
Examples
This library is close to provide a Map like what stdlib provide: Map.S
since ocaml 4.06.0.
Benchmarks & Performance
From the initial purpose, the lookup
operation must be fast. So we did a benchmark between Radis, Map (from stdlib) and Hashtbl. About the benchmark, we took a git repository (ocaml-git), do a git rev-list --objects --all
and took an arbitrary number of hashes binded with an empty string or a path.
Then, from this associative list, we make a Radis tree, a Map and a Hashtbl. Finally, we try to lookup all elements from the initial list in these data-structure and look how long we need to access (lookup
operation) to a value.
This benchmark was run on my computer (Thinkpad X1 Carbon - Intel i7-7500U CPU @ 2.70 Ghz - 2.90 Ghz). You can run benchmarks with ./macro -p radis
. If you catch an exception, it's because you don't have enough git objects on your current directory. Results are available on benchmarks.txt
.
We implemented a benchmark with Core_bench
(available by jbuilder build bench/micro.exe
). Results are the same than Benchmarks
(macro
):
Map
from stdlib is slower thanRadis
when we have more than 100 elementsHashtbl
is faster
This is expected goal and could be improve for next release.
Inspirations & License
This library, as I said on the header, is a mix between ocaml-stringset and ocaml-aliases. The license story and improvements on each library (and abondonment of certain) is not clear to decide which license we should use.
So, I continue to use LGPL 2.1 (as mentionned on ocaml-aliases) and find the path-inspiration (available on AUTHORS.md):
Jean-Christophe Filliatre (Patricia tree)
Ludo (@khigia) (extend to use a string as a key)
Hugo Heuzard (bind to a list of values)
Romain Calascibetta (computation of the critical bit - as the first improvement)