package uuseg

Unicode text segmentation for OCaml

Sources

uuseg-10.0.0.tbz
sha256=8ce9985d005135c939946b87ae24d1ad0fe46b5909115d4f1fa79df8a4a40807
md5=834be4e6d55e5571352bb4a2bfee9905

Description

Uuseg is an OCaml library for segmenting Unicode text. It implements the locale independent Unicode text segmentation algorithms to detect grapheme cluster, word and sentence boundaries and the Unicode line breaking algorithm to detect line break opportunities.
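As an illustration of the segmentation described above, here is a minimal sketch that splits a UTF-8 encoded OCaml string into words and grapheme clusters. It assumes the optional Uuseg_string module (findlib package uuseg.string, which needs Uutf) is installed; the helper names words and graphemes are hypothetical and not part of the library.

```ocaml
(* Sketch only: assumes the optional Uuseg_string module is available,
   i.e. that Uuseg was installed together with Uutf. *)

(* Collect the segments of a UTF-8 string for a given kind of boundary. *)
let segments boundary s =
  List.rev (Uuseg_string.fold_utf_8 boundary (fun acc seg -> seg :: acc) [] s)

(* Hypothetical helpers, named here for illustration only. *)
let words s = segments `Word s
let graphemes s = segments `Grapheme_cluster s

let () =
  (* Print each word segment between brackets. *)
  List.iter (Printf.printf "[%s]") (words "Hello, world!");
  print_newline ();
  (* "e" followed by U+0301 COMBINING ACUTE ACCENT forms a single cluster. *)
  Printf.printf "%d\n" (List.length (graphemes "e\u{0301}a"))
```

Note that word segmentation also reports punctuation and white space as segments of their own, so the segments always concatenate back to the input string.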

The library is independent from any IO mechanism or Unicode text data structure and it can process text without a complete in-memory representation.
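To make the point about IO independence concrete, the following sketch drives a segmenter one scalar value at a time with Uuseg.create and Uuseg.add, so the input never has to be held in memory as a whole. The list of Uchar.t values stands in for any incremental source, and the function name segments_of_uchars is hypothetical. It assumes OCaml 4.06 or later for Buffer.add_utf_8_uchar; on older compilers Uutf.Buffer.add_utf_8 could be used instead.

```ocaml
(* Sketch only: feeds Unicode scalar values to a segmenter one at a time.
   The Uchar.t list is a stand-in for any source, e.g. a decoder on a channel. *)
let segments_of_uchars boundary (uchars : Uchar.t list) =
  let seg = Uuseg.create boundary in
  let buf = Buffer.create 16 in
  (* Move the current segment, if any, from the buffer to the accumulator. *)
  let flush acc =
    let s = Buffer.contents buf in
    Buffer.clear buf;
    if s = "" then acc else s :: acc
  in
  (* After each input, keep answering `Await until the segmenter asks for
     more input (`Await) or reports that it is done (`End). *)
  let rec add acc v = match Uuseg.add seg v with
    | `Uchar u -> Buffer.add_utf_8_uchar buf u; add acc `Await
    | `Boundary -> add (flush acc) `Await
    | `Await | `End -> acc
  in
  let acc = List.fold_left (fun acc u -> add acc (`Uchar u)) [] uchars in
  (* Signal the end of input, then flush the last segment. *)
  List.rev (flush (add acc `End))

let () =
  let input = List.map Uchar.of_char ['a'; 'b'; ' '; 'c'] in
  List.iter (Printf.printf "[%s]") (segments_of_uchars `Word input);
  print_newline ()
```

Segments are re-encoded as UTF-8 strings here only for convenience; a client could just as well count boundaries or emit offsets as they are reported.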

Uuseg depends on Uucp and optionally on Uutf, which is needed to segment UTF-X encoded OCaml strings. It is distributed under the ISC license.

README

Uuseg — Unicode text segmentation for OCaml

v10.0.0

Homepage: http://erratique.ch/software/uuseg
Contact: Daniel Bünzli <daniel.buenzl i@erratique.ch>

Installation

Uuseg can be installed with opam:

opam install uuseg
opam install uutf uuseg # for support on OCaml UTF-X encoded strings

If you don't use opam, consult the opam file for build instructions.

Documentation

The documentation and API reference are automatically generated by ocamldoc from the interfaces. They can be consulted online or via odig doc uuseg.

Sample programs

If you installed Uuseg with opam, sample programs are located in the directory reported by the command opam config var uuseg:doc.

In the distribution, sample programs are located in the test directory. They can be built with:

topkg build --tests true

  • test.native tests the library, nothing should fail.

  • usegtrip.native inputs Unicode text on stdin and rewrites segments on stdout. Invoke with --help for more information. Depends on Uutf and Cmdliner.

Dependencies (6)

  1. uucp >= "10.0.0" & < "11.0.0"
  2. uchar
  3. topkg (build)
  4. ocamlbuild (build)
  5. ocamlfind (build)
  6. ocaml >= "4.01.0"

Dev Dependencies

None

Used by (8)

  1. fuzzy_compare
  2. inquire = "0.2.1"
  3. notty < "0.2.3"
  4. ocamlformat >= "0.10" & < "0.25.1"
  5. ocamlformat-lib
  6. ocamlformat-rpc < "0.21.0"
  7. slug
  8. zed >= "3.2.0"

Conflicts (1)

  1. uutf < "1.0.0"