Language Features
odoc
works by taking module interfaces, processing them to make them more useful, and turning them into documentation. This processing is largely hiding, handling of canonical references, expansion and resolution. This document explains the features of these processes in detail.
Hiding
Some items are not intended to be used directly but are present as implementation detail, e.g., for testing, for implementing Dune's namespacing, or other reasons.
There are two mechanisms for explicitly hiding elements from the final output. The first is to use documentation stop comments, which can be used to hide any items from the output. The second mechanism is only used for modules and is a naming convention. If any module has a double underscore in its name, it’s considered to be hidden.
Canonical Items
Occasionally it’s useful to declare an item in one place and expose it in the public interface in another. In order to prevent unwanted occurrences of the actual definition site, odoc
has a feature whereby the 'canonical' location of a module, module type or type can be specified.
The biggest user of module aliases is Dune's wrapped libraries. This feature allows Dune to produce a library whose interface is exposed entirely though a single top-level module. It does this by mangling the names of the real implementation modules and generating the single top-level module that simply contains aliases to the these implementation modules. With odoc
's canonical modules feature all references to the implementation modules are rewritten to point at the aliases in the top-level module instead. Please see the section on Dune's library wrapping for more detail.
In a similar fashion, it's sometimes useful to have this feature on module types and types. For example, odoc
itself uses this for its 'paths types'. Since they are all mutually recursive, they all have to be declared at the same time in the module Paths_types
, but odoc
exposes them in separate modules. By annotating the definition point with @canonical
tags pointing to the aliases, we ensure that all their references point at the separate modules as intended.
Expansion
There are many instances of items in signatures where what's written is the best thing in terms of semantics, but it’s not necessarily useful in terms of documentation. For example:
module StringSet : Stdlib.Set.S with type t = string
odoc
will expand these items, augmenting the declaration with the signature along with all the documentation comments that can be found, e.g., those from Stdlib.Set.S
in the above example. While the compiler also does a very similar procedure and determines the signature of the module, odoc
tries quite hard to preserve as much as possible from the original items’ context.
The declaration can be seen rendered here, and the full expansion can be found by clicking on the name of the module (StringSet
in this case). The direct link is here.
These expansions have to be done carefully. A number of cases follow in which odoc
has to treat specially.
Aliases
In general odoc
doesn’t expand module aliases unless they are an alias to a hidden module. If this is the case, the right-hand side of the declaration is dropped and replaced with sig ... end
, and the expansion is created.
For example, given the following source,
module Hidden__module : sig
type t
val f : t -> t
end
module Alias = Hidden__module
the Hidden__module
module won’t be present in the output, and the Alias
module will be rendered as if it were a simple signature. This can be seen in the example rendering here.
As well as expanding aliases to hidden modules, modules are also expanded if the module alias is "self canonical." That is, if module A
is an alias to module B
, that declares A
to be the canonical path to the module (i.e., it has the tag @canonical A
in an associated comment).
Module Type Aliases
Module types don’t have aliases in the same way that modules do, but it is possible and common to declare an effective alias to another by simply creating a new module type that’s equal to a previous one. For example:
module type A = sig
type t
end
module type B = A
For these simple module type declarations, where the right-hand side is just a path, odoc
treats them as module aliases and doesn’t produce an expansion. This example is rendered here.
When strengthening, OCaml turns modules into aliases to the original module, but nothing is done to module types. In contrast, odoc
replaces module types with 'aliases' to the originals, too. These are not expanded, hence this is important for reducing the size of the output.
The following examples use module type of struct include ... end
to obtain the strengthened signature of A
(see the Module Type Of
section for more details on this).
module A : sig
module type A = sig type t end
module X : A
end
module B : module type of struct include A end
OCaml evaluates the following signature for B
:
module B : sig module type A = sig type t end module X = A.X end
whereas odoc
internally evaluates this as:
module B : sig module type A = A.A end module X = A.X end
This example is rendered here
Functors
When odoc
encounters a functor, it is also expanded. The parameters are expanded in the body of the functor expansion, above the signature representing the functor’s result.
For example, given the following,
module type Argument = sig
(** This type [a] is declared in the Argument module type *)
type a
end
module type Result = sig
(** This type [r] is declared in the Result module type *)
type r
end
module Functor : functor (X : Argument) (Y : Argument) -> Result
an expansion will be created for Functor
, containing expansions for X
and Y
within it and followed by the Result
’s signature. The above functor can be seen rendered here.
Includes
If part of your module signature comes from an include of another module or module type, odoc
keeps track of this and can render the included items in a clearly delimited and collapsible way. For example, given the following:
module type ToBeIncluded = sig
type t
val f : t -> t
(** The description of [f] *)
end
module A : sig
include ToBeIncluded
val g : t -> t
end
The expansion of module A
will contain a clearly demarcated section showing the included items.
If this behaviour is not desired, the include may be inlined with the tag @inline
as follows:
module B : sig
include ToBeIncluded
(** @inline *)
val g : t -> t
end
The expansion of module B
does not contain an indication that the elements t
and f
came from an include
directive.
Shadowing
OCaml ordinarily does not allow two items of the same type with the same name. For example, the following is illegal:
type t = int
type t = string
However, if the item comes in via an include, then OCaml allows it. For example:
module type A = sig
type t = int
val f : t
end
module type B = sig
include A
type t = string
val g : t
end
Since odoc
is required to do its own expansions, it must take account of this behaviour. The previous example is rendered here.
Deep Equations
The module type system allows for adding equations to abstract types (as seen above in the StringSet
declaration). These equations may be 'deep' in the sense that they operate on a nested module rather than the outer one. For example:
module type SIG = sig
type t
end
module type MODTYPE = sig
module X : SIG
module Y : SIG
end
type foo
module M : MODTYPE with type X.t = foo
Here we've got a module type SIG
that contains an abstract type t
and a module type MODTYPE
that contains two modules, X
and Y
, that have signature SIG
. Lastly, we declare a module M
that has signature MODTYPE
with an additional type equality X.t = foo
. When the compiler evaluates the signature of module M
here, the definition of X
within it is simply replaced with a signature:
module M : sig
module X : sig type t = foo end
module Y : SIG
end
We lose both the fact that it came from MODTYPE
and also that within it, X
originally had signature SIG
. odoc
tries to be more careful. Instead, it keeps both the MODTYPE
on M
with the type equality X.t = foo
and the SIG
on X
with the type equality t = foo
. The expansion of of module M
in this example can be seen here.
Note that if X
was a simple signature before the type equality was added, that does not get preserved. In the following example,
module type MODTYPE = sig
module X : sig type t end
module Y : sig type t end
end
type foo
module M : MODTYPE with type X.t = foo
the expansion of M does not contain any with type
equations.
Substitution
Similar to the addition of equations in the previous section, OCaml allows for types and modules to be destructively substituted, so the type or module is entirely removed from the resulting signature.
As with the addition of equations above, these substitutions may be on deeply nested modules, and care needs to be taken to ensure that there are no references to the removed module or type left. For example:
module type S = sig
module M: sig type t end
type t = M.t
end
module type T = S with type M.t := int
The expansion of T
internally is different from what is rendered. Internally, it becomes:
module M: sig type t end with type t := int
type t = M.t
From this expansion it is still clear how to resolve the right-hand side of type t = M.t
, and the next phase of odoc
's transformation turns the right-hand side of M.t
into int
.
In the output documentation, the declaration of module M
is rendered simply as
module M : sig ... end
with the type substitution dropped. This is because the type substitition on the simple signature isn't useful for the reader; the link t
would have no destination. This example is rendered here.
module type of
The OCaml construct module type of
allows the type of a module to be recovered. As usual, when OCaml performs this operation, it only retains the simplified signature, stripped of comments, includes, and more complex module type expressions. As with the previous sections, odoc
tries a little harder to keep track of these things and also of the fact that the signature came from a module type of
expression.
For example, consider the following:
module A : sig
(** This comment for [type t] is written in module [A] *)
type t
end
module M : module type of A
the type t
in module M
has the comment from the original module. There is also logic in odoc
to manage the similar construct module type of struct include ... end
, which is used where the types and modules are required to be strengthened. That is, the types in the signature are equal to those in the original module, and any modules in the new signature are aliases of those in the original. For example,
module M' : module type of struct include A end
In M’, type t
is equal to A.t
, whereas in M there is no equation.
Complications of module type of
Doing the expansion like this comes with some complications, particularly when the result is further modified. For example, consider this example:
module type S = sig
module X : sig
type t
end
module type Y = module type of X
module type Z = module type of struct include X end
end
When OCaml operates on this, it calculates the signature of S
immediately, resulting in the module type:
module type S =
sig
module X : sig type t end
module type Y = sig type t end
module type Z = sig type t = X.t end
end
whereas odoc
preserves the fact that Y
and Z
are calculated from X
. If the module X
is subsequently replaced using a destructive substitution on S
, the results would be different. Given the following,
module X1 : sig
type t
type u
end
module type T = S with module X := X1
then the signature of T
as calculated by OCaml will be
sig
module type Y = sig type t end
module type Z = sig type t = X1.t end
end
However, it's clear that if the module type of
operations were evaluated after the substitution, both Y
and Z
would contain type u
.
There is logic in odoc
to handle this correctly, but since there is currently no syntax for representing transparent ascription, the consequence is that we lose the fact that Y
and Z
originally came from module type of
expressions.
This example is rendered here, and in the expansion of T, it can be seen that Y
and Z
are simple signatures only containing type t
.
Resolution
There are several different but related constructs for referring to elements in odoc
. The following example demonstrates each:
(** The module {!module-M} and type {!module-M.module-X.type-t} *)
module M : A.B.C with type X.t = int
Here, M
is an identifier that uniquely identifies the module M
. A.B.C
is a path used to locate a particular identifier, X.t
is a fragment that refers to an element within a module type, and module-M
and module-M.module-X.type-t
are references that are similar to paths in that they are used to locate particular identifiers; however, unlike paths, they are not checked by the compiler and are entirely resolved by odoc
.
In most of the output formats, odoc
supports paths; references and fragments will be turned into links that take the reader to the referred identifier. These links need to consider some of the expansions’ features, as outlined above. In order to decide where the links should point to and how to turn them into text, a process called 'Resolution' is required.
Aliases
Since aliases are not usually expanded, a path or reference to an item contained in an aliased module must link directly to the item inside the aliased module. For example:
module A : sig
type t
end
module B = A
type t = B.t
The right-hand side of type t
should render as B.t
, but it should link to the definition of t
in module A
. This example is demonstrated here.
Aliases of hidden modules are expanded, so the following example demonstrates this alteration:
(**/**)
module A : sig
type t
end
(**/**)
module B = A
type t = B.t
Here we've hidden A
via the documentation stop comment mechanism. This example is demonstrated here.
Canonical Paths
When encountering a module, module type, or a type that has been marked with a @canonical
tag, odoc
first has to check that the specified canonical path actually resolves. If this is the case, in a similar way to the alias above, the target of the path will be rewritten to point to the canonical path. However, in contrast to the alias behaviour, the text of the path will also be rewritten, so it will be as if the canonical path had been written instead of whatever path was actually there.
For example:
module A : sig
(** @canonical Odoc_examples.Resolution.Canonical.B *)
type t
end
module B = A
type t = A.t
Note that in this example the @canonical
tag has been given the path Odoc_examples.Resolution.Canonical.B
. This must be the fully qualified path to the canonical item.
The right-hand side of type t
will be rewritten such that it will be as if B.t
had been written instead.
Note that canonical tags are only used when resolving paths, not fragments (which are relative anyway) nor references. Since they are written by the author, they’re assumed to point to the correct destination.
Fragment Resolution
Fragments are relative paths that appear in module type expressions when adding equations or substituting types or modules. For example:
module type A = sig
module B : sig
type t
val f : t -> t
end
end
module C : A with type B.t = int
module D : module type of C.B with type t := int
In this expression, the fragment B.t
should link to the definition of type t
inside module B
inside module type A
. The fragment t
in the definition of module D
should link to the definition of type t
inside module B
inside module C
. Note that it can't link to type t
in D
since that type has been destroyed!
This example is rendered here.
Hidden Elements
If there are paths that refer to hidden elements, these are removed from the interface unless there is an equal non-hidden type that can replace it. For example, in the following type definitions,
(**/**)
type t = int
type u
(**/**)
type v = T of t
type w = U of u
type v
will have a right-hand side of T of int
, as the hidden type t
is equal to int
; whereas, there is no non-hidden type equivalent to u
, so the right-hand side of type w
is omitted from the output.
Reference Resolution
References are hand-written in comments and not evaluated in any way by the compiler.
module type A = sig
type t
(** type [t] in module type [A] *)
end
module A : sig
type t
(** type [t] in module [A] *)
module B : sig type t end
module type B = sig type t end
end
(** We can refer unambiguously to {!module-type-A.t} in module type [A]
or {!module-A.t} in module [A], and also where there are name clashes
within the path: {!A.module-B.t} or {!A.module-type-B.t} *)
This demonstrates that it’s possible to refer to elements even when there’s ambiguity, if just the names were used. If odoc
detects any ambiguity, it will emit a warning.
Module Type Challenges
In some cases, resolution can be more challenging than others. Consider this example:
module type A = sig
module M : sig module type S end
module N : M.S
end
module B : sig module type S = sig type t end end
module C : A with module M = B with type N.t = int
type t = C.N.t
In the expansion of module type A
, module N
has no expansion because module type S
is abstract. Therefore, in the definition of module C
, the fragment N.t
cannot link to module N
in module type A
, but instead it must link to module type S
in module B
.
This example is rendered here.
Now for a very complicated example:
module type Type = sig module type T end
module App : functor (T : Type) (F : Type -> Type) (M : F(T).T) -> F(T).T
module Bar : sig module type T = sig type bar end end
module Foo :
functor (T : Type) -> sig module type T = sig module Foo : T.T end end
module FooBarInt : sig module Foo : sig type bar = int end end
type t = App(Bar)(Foo)(FooBarInt).Foo.bar
This one is left as an exercise to the reader! It can be seen rendered by odoc
here.