The obvious use of dependent typing is to describe invariants of your program, for example, “a non-NULL pointer”, “an array of length 5”, or “a non-empty list”. While full-blown dependent typing requires a full-blown theorem prover (such as Coq or ATS/LF), many mainstream languages support basic dependent types. Java can infer whether a value is not null, OCaml can talk about types which are made up of only certain constructors, Mozilla’s new language Rust can attach arbitrary predicates to types, and I would be remiss not to mention Mercury.
Mercury has limited but useful support for dependent typing via its mode system. The mode system appears complicated, so let’s start with a brief overview.
The mode system
Each predicate (or function) in Mercury does something to its arguments. The most common possibilities are to look at an argument and leave it be (this would be an input argument), or to attempt to bind it to a value (this would be an output argument). Another possibility is that it destroys its argument after looking at it (this happens with the IO state). These possible actions are called
Modes are described simply by the expected state of the argument before the call, and the guaranteed state of the argument after the call using the syntax
Before >> After
. These states (called free
(the argument is unbound), ground
or bound
(the argument is bound to a value), clobbered
(the argument is destroyed or overwritten), and unique
(the argument is bound to a value which is not referenced anywhere else). The latter two are used mainly for IO state and arrays, so we will focus on free
and bound
.The modes we are all familiar with,
in
and out
, are defined in the module builtin
as ground >> ground
and free >> ground
, respectively. In other words, an in
argument is expected to be bound to a value before the call and is left that way after, and an out
argument may be unbound before the call, but will be bound after.The magic is that we may tell Mercury that we expect or guarantee an argument to be bound only to certain constructors! To do this, we must first define the instantiatedness in which we are interested, say non-empty lists:
:- inst non_empty_list == bound([ground | ground]).This defines the
non_empty_list
inst as a variable bound to a list cons cell, the head and tail of which are both bound. We could similarly define the inst of an empty list as::- inst empty_list == bound([]).In other words, an
empty_list
is a variable which is bound to the empty list constructor.Note that there is also a shorthand which allows the above two insts to be written similarly to type definitions:
:- inst non_empty_list ---> [ground | ground]. :- inst empty_list ---> [].We could then define the mode of an input argument which is expected to be a non-empty list as:
:- mode in_non_empty_list == non_empty_list >> non_empty_list.In other words, an argument with this mode would be expected to be bound to a non-empty list, and would remain bound that way afterward. Similarly:
:- mode out_non_empty_list == free >> non_empty_list.An argument with this mode may be unbound before the call, but is guaranteed to be bound to a non-empty list after. Mercury actually has the mode-functions
in()
and out()
already defined, so we may write in(non_empty_list)
or out(non_empty_list)
to describe these same modes.Example
Note: due to bug 191, the following example won’t trigger the proper error messages from the compiler as of this post.
out
modes with out(list(ground))
, which won’t change the meaning but will avoid triggering the bug. UPDATE: parts of the standard library won’t compile without the bug, so don’t go applying that patch just yet.On a recent thread at Hacker News, a question was brought up about proving that
[X | Xs] = reverse([Y | Ys])
was total (or in Mercury terms, det
). Let’s see if we can do this in Mercury using our new tools. First let’s define reverse2
(named so not to conflict with built-in list.reverse
) using a recursive auxiliary function with an accumulator::- import_module list. :- func reverse2(list(A)) = list(A). :- mode reverse2(in) = out. reverse2(A) = reverse2_aux(A, []). :- func reverse2_aux(list(A), list(A)) = list(A). :- mode reverse2_aux(in, in) = out. reverse2_aux([], Acc) = Acc. reverse2_aux([A | As], Acc) = reverse2_aux(As, [A | Acc]).(I have included the mode declarations as we will be adding to them later.) Testing this function in our
main
predicate is simple:main(!IO) :- Y = 1, Ys = [2,3,4,5], [X | Xs] = reverse2([Y | Ys]), print({X, Xs}, !IO), nl(!IO).A sharp eye will spot the problem before we even compile: the deconstruction
[X | Xs] =
can fail, which would cause main
to be semidet
instead of det
(that is, a partial function). The compiler spots this as well:In `main'(di, uo): error: determinism declaration not satisfied. Declared `det', inferred `semidet'. Unification with `list.[X | Xs]' can fail.Of course this isn’t true. What the compiler would love to see is a mode declaration stating that if
reverse2
is passed a non-empty list, then it will return a non-empty list. Let’s write that::- mode reverse2(in(non_empty_list)) = out(non_empty_list).Now we get a different error:
In clause for `reverse2(in((list.non_empty_list))) = out((list.non_empty_list))': mode error: argument 2 did not get sufficiently instantiated. Final instantiatedness of `HeadVar__2' was `ground', expected final instantiatedness was `bound(list.'[|]'(ground, ground))'.(Note: if you didn’t/can’t apply the above compiler patch, as a workaround, replace
out
in the original mode declarations with out(list(ground))
.)Here,
HeadVar__2
refers to the function result (what would be the second variable in the head were reverse2
a predicate). The compiler expected the function result to be non-empty (bound to list.[|]
) but it was bound to an unknown value of type list. Of course this is because reverse2_aux
doesn't guarantee a non-empty result on non-empty input, so let’s add a declaration stating so as well::- mode reverse2_aux(in(non_empty_list), in) = out(non_empty_list).(Note that we don’t say anything about the accumulator in this declaration.) Unfortunately recompiling gives us a similar error, this time complaining about the second (the recursive) clause of
reverse2_aux
. Of course – if our initial list had only one item, the recursive call would pass reverse2_aux
an empty list, triggering the first clause, and we haven’t made (and can’t make) any special guarantees about that case. Let's take a closer look at that clause:reverse2_aux([], Acc) = Acc.The only way that this clause will return a non-empty list is if the accumulator,
Acc
, is non-empty. Fortunately, it is called recursively with a non-empty accumulator (since elements are prepended on every recurrence), so this should be enough to convince the compiler. Let’s add a mode declaration stating this fact::- mode reverse2_aux(in, in(non_empty_list)) = out(non_empty_list).Success! Compilation is successful, meaning
main
has been proved to be det
(a total function), and running the program gives the expected output:{5, [4, 3, 2, 1]}Using Mercury’s mode system, we were able to prove that reversing a non-empty list results in a non-empty list, by adding descriptive mode declarations in strategic locations.
As an exercise, add an inst describing a list of at least two elements and add modes to
reverse2
and reverse2_aux
which make guarantees about lists of at least two elements.Download reverse.m