package art
Install
Dune Dependency
Authors
Maintainers
Sources
sha256=3b0690b13902d30c857bb7edf4491731c374c7504a999e6fabed9228504de9bb
sha512=b3f96eb07fb50ed014483dabe4b364c195a9e7bd2641225cb599d783c096c9d8be036966c33e5eb0c1d9c193328f4163292d460481c258aa1c5d512ccc713690
Description
Implementation of an Adaptive Radix Tree in OCaml. A fast hash-table like structure plus the order.
Published: 20 Jul 2022
README
Adaptive Radix Tree (ART) in OCaml
This is an implementation in OCaml of ART. Adaptive Radix Tree is like a simple Hashtbl
with order:
# let tree = Art.make () ;;
# Art.insert tree (Art.key "foo") 42 ;;
# Art.insert tree (Art.key "bar") 21 ;;
# Art.find tree (Art.key "foo")
- : int = 42
Operation like minimum
or maximum
are available (which don't exist for a simple Hashtbl.t
):
# let tree = Art.make () ;;
# Art.insert tree (Art.key "0") 0 ;;
# Art.insert tree (Art.key "1") 1 ;;
# Art.insert tree (Art.key "2") 2 ;;
# Art.minimum tree
- : int = 0
# Art.maximum tree
- : int = 2
If you want the order and the speed of Hashtbl.t
, Art is your library:
Read Optimised Write Exclusion (ROWEX) in OCaml
ROWEX is a second implementation of ART with atomic operations. It's a functor which expects an implementation of atomic operations such as load
or store
.
Parallelism, atomic operation & OCaml
The current version of OCaml has a global lock for the GC. By this way, it's not possible for us to execute ROWEX operations (find
/insert
) with true parallelism if we use the same OCaml runtime. Even if you use LWT or ASYNC, you execute jobs concurrently.
However, ROWEX wants to provide an implementation where find
/insert
can be executed in parallel without any problems (race condition or ABA problem). So ROWEX provides an implementation, persistent
, which implements atomic operations on a memory area. Then, we are able, as parmap
, to simulate true parallelism as long as each operations are executed into their own fork()
.
The goal of this library is provide:
the most easy way to switch the implementation to ocaml-multicore
a baby step to be able to manipulate a file by several processes (consumers/
find
, producers/insert
) in parallel
ROWEX follows two main papers:
Tools
The distribution comes with some tools to manipulate an index:
$ opam pin add -y https://github.com/dinosaure/art
$ opam install rowex
$ part.make index.idx
$ ls -lh
-rw-r--r-- 1 user user 8,0M ----- -- --:-- index.idx
prw------- 1 user user 0 ----- -- --:-- index.idx.socket
prw------- 1 user user 0 ----- -- --:-- index.idx-truncate.socket
$ part.insert index.idx foo 1
$ part.find index.idx foo
1
On the OCaml side, a Part
module exists which implements these functions:
type 'a t constraint 'a = [< `Rd | `Wr ]
val create : ?len:int -> string -> unit
val insert : [> `Rd | `Wr ] t -> string -> int -> unit
val lookup : [> `Rd ] t -> string -> int
part
is Unix dependent (and it need an Unix named pipe). It ensures with explained internal mechanisms to use multiple readers and one writer:
The writer can take the exclusive ownership on the index file and its named pipe
readers don't need to take the ownership but they must send a signal into the named pipe (to the writer) that they start to introspect the index
For readers, some functions exist to signal their existence to the write:
val append_reader : Ipc.t -> unit
val delete_reader : Ipc.t -> unit
val ipc : _ t -> Ipc.t
Status: experimental
This part of the distribution is experimental - even if the distribution comes with several tests to ensure that the implementation works, ROWEX is fragile! It still need a synchronization mechanism fsync()
which is added pervasively in some parts of the code according to outcomes of errors.
Used by (3)
Conflicts
None