Adventures in Mercury, by Chris Pacejo<br />
<br />
<h2>Checkers AI, part 2</h2>2011-12-05<br />
<br />
Today I’d like to wrap up the game logic of our checkers-playing AI. <a href="http://adventuresinmercury.blogspot.com/2011/11/checkers-ai-part-1.html">Last time</a> we defined the players, pieces, and the board, and operations on them. Today let’s write the game-playing code.<br />
<br />
<h3><code>moves</code> module</h3>The <code>moves</code> module simply links the <code>position</code> module to the <code>board</code> module, moving pieces around a <code>board</code> in accordance with the patterns defined in <code>position</code>.<br />
<br />
Just as <code>position</code> defines two movement patterns (<code>move</code> and <code>jump</code>), we will define two predicates to update a <code>board</code> based on these patterns: <code>do_move</code> and <code>do_jump</code>.<br />
<pre><code>:- pred do_move(color, position, position, board, board).
:- mode do_move(in, in, in, in, out) is semidet.
:- mode do_move(in, in, out, in, out) is nondet.
:- mode do_move(in, out, out, in, out) is nondet.
:- pred do_jump(color, position, position, board, board).
:- mode do_jump(in, in, in, in, out) is semidet.
:- mode do_jump(in, in, out, in, out) is nondet.
:- mode do_jump(in, out, out, in, out) is nondet.</code></pre>The three modes defined for each predicate allow us to check if a given move or jump is valid for the given player, to enumerate the moves or jumps available to a piece at a given position, and to enumerate all moves and jumps available on the board, respectively. Implementation is straightforward:<br />
<pre><code>:- import_module piece.
do_move(Color, From, To, !Board) :-
move(Piece, From, To),
move_piece(From, To, Piece, !Board),
color(Piece) = Color.
do_jump(Color, From, To, !Board) :-
jump(Piece, From, Jumped, To),
move_piece(From, To, Piece, !Board),
color(Piece) = Color,
remove_piece(Jumped, Other, !Board),
opponent(color(Piece), color(Other)).</code></pre>Since in checkers a piece that can jump must jump, we will later find it useful to check whether a piece can jump from a given position without the overhead of actually updating the board. Checking whether a piece can move will also be useful for detecting the end of the game, so let’s define both <code>can_move</code> and <code>can_jump</code>:<br />
<pre><code>:- pred can_move(color, position, board).
:- mode can_move(in, in, in) is semidet.
:- mode can_move(in, out, in) is nondet.
can_move(Color, From, Board) :-
move(Piece, From, To),
piece_at(Board, From, Piece),
no_piece_at(Board, To),
color(Piece) = Color.
:- pred can_jump(color, position, board).
:- mode can_jump(in, in, in) is semidet.
:- mode can_jump(in, out, in) is nondet.
can_jump(Color, From, Board) :-
jump(Piece, From, Jumped, To),
piece_at(Board, From, Piece),
no_piece_at(Board, To),
color(Piece) = Color,
piece_at(Board, Jumped, Other),
opponent(color(Piece), color(Other)).</code></pre>Note that we could simply have defined <code>can_move(Color, From, Board) :- do_move(Color, From, _, Board, _).</code> (and similarly for <code>can_jump</code>), but, since Mercury neither is lazy nor tracks unused arguments, this would entail the unnecessary computational overhead of updating the board.<br />
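For illustration (a hypothetical snippet, not part of the game’s modules; it assumes some <code>board</code> value <code><var>Board</var></code>), the nondet mode of <code>do_jump</code> combines nicely with the standard <code>solutions</code> module to enumerate every jump available to a player:<br />
<pre><code>:- import_module list, solutions.
    % Collect every (From, To) pair describing a jump red can make.
:- pred red_jumps(board::in, list({position, position})::out) is det.
red_jumps(Board, Jumps) :-
    solutions(
        (pred({From, To}::out) is nondet :-
            do_jump(red, From, To, Board, _)),
        Jumps).</code></pre>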
<br />
<h3><code>game</code> module</h3>The <code>game</code> module will house the state of the game, and logic for generating possible successor states based on the rules of the game. The state of the game is given by whose turn it is, whether they are in the middle of a jump or not (in checkers, after jumping a piece, you may immediately jump another with the same piece), the history of moves (to check for draws) and the state of the board:<br />
<pre><code>:- import_module board, color, position.
:- type game --->
game(turn::color,
jumping_state::jumping,
history::history,
board::board).
:- type jumping ---> not_jumping; jumping(position).</code></pre>In <code>history</code>, we need only remember past board states for each player. (This is because, as we’ll see later, we don’t need to track history of jumps.)<br />
<pre><code>:- import_module set.
:- type history == set({color, board}).</code></pre><br />
<h4>Initialization of the game</h4>Initializing the game state is easy. Red goes first and is not in the middle of a jump:<br />
<pre><code>:- func init(int, int) = game.
init(Width, Height) = game(red, not_jumping, init, init(Width, Height)).</code></pre>Checkers is normally played on an 8×8 board, so let’s provide initializers for this common case as well:<br />
<pre><code>:- func init = game.
init = init(8, 8).
:- pred init(game::out) is det.
init(init).</code></pre>(The <code>init</code> predicate is redundant, but some programmers prefer that style over zero-argument functions, so we’ll provide it.)<br />
<br />
<h4>Rules of the game</h4>The change from one game state to the next is uniquely identified by the starting and ending locations of some piece on the board. We can declare a predicate which updates the game state based on this information, or generates possible new game states, as follows:<br />
<pre><code>:- pred step(position, position, game, game).
:- mode step(in, in, in, out) is semidet.
:- mode step(in, out, in, out) is nondet.
:- mode step(out, out, in, out) is nondet.</code></pre>The definition of <code>step</code> itself is slightly hairy, as it must follow the rules of checkers. I’ve broken it into two cases, corresponding to when a player begins their move (<code>jumping_state</code> is <code>not_jumping</code>), and when a player is forced to continue a move because a further jump is available (<code>jumping_state</code> is <code>jumping</code>):<br />
<pre><code>step(From, To, game(Color0, not_jumping, !.History, !.Board) @ Game0,
game(Color, Jumping, !:History, !:Board)) :-
% The game can only proceed if it is not a draw.
not is_draw(Game0),
% We must jump if it is possible.
(can_jump(Color0, _, !.Board) ->
% Since we are removing a piece, history doesn't matter any more.
init(!:History),
% Update the board.
do_jump(Color0, From, To, !Board),
% If more jumps are possible, remember this fact.
(can_jump(Color0, To, !.Board) ->
Color = Color0,
Jumping = jumping(To)
;
opponent(Color, Color0),
Jumping = not_jumping
)
;
% Add the last position to our history.
insert({Color0, !.Board}, !History),
% Update the board.
do_move(Color0, From, To, !Board),
% It is now our opponent’s turn.
opponent(Color0, Color),
Jumping = not_jumping
),
% If our move or jump ended in the king’s row, king it.
(in_kings_row(!.Board, Color0, To), king_piece(To, !Board) -> true; true).
step(From, To, game(Color0, jumping(Pos), _, !.Board),
game(Color, Jumping, init, !:Board)) :-
% History is ignored while we are jumping.
% We can only continue a jump from where we left off.
From = Pos,
% Update the board.
do_jump(Color0, From, To, !Board),
% If more jumps are possible, remember this fact.
(can_jump(Color0, To, !.Board) ->
Color = Color0,
Jumping = jumping(To)
;
opponent(Color, Color0),
Jumping = not_jumping
),
% If our jump ended in the king’s row, king it.
(in_kings_row(!.Board, Color0, To), king_piece(To, !Board) -> true; true).</code></pre>(<code>is_draw</code> is defined below.) Phew! That, fortunately, is the single most complex chunk of code in the entire implementation.<br />
<br />
<h4>Checking the state of the game</h4>It is important to provide predicates to determine whether a game has finished, and if so, who (if anyone) has won. First, we define <code>can_step</code> as a proxy for <code>step</code> which does not actually create a new game state:<br />
<pre><code>:- pred can_step(game::in) is semidet.
can_step(game(Color, not_jumping, _, Board) @ Game) :-
not is_draw(Game),
(can_jump(Color, _, Board); can_move(Color, _, Board)).
can_step(game(_, jumping(_), _, _)).</code></pre>(Again, as with <code>can_move</code> and <code>can_jump</code>, we could have defined this in terms of <code>step</code>, but we would perform unnecessary computation.)<br />
<br />
We can then define <code>is_draw</code> to determine whether the current game state is a draw or not (i.e. whether the current state exists in previous history):<br />
<pre><code>:- pred is_draw(game::in) is semidet.
is_draw(game(Color, not_jumping, History, Board)) :-
member({Color, Board}, History).</code></pre>Finally, we can determine whether a game is over, and if so, who the winner is (if any) as follows:<br />
<pre><code>:- pred game_over(game::in, maybe(color)::out) is semidet.
game_over(Game, Winner) :-
    ( if is_draw(Game) then
        Winner = no
    else
        not can_step(Game),
        Winner = yes(opponent(Game^turn))
    ).</code></pre><br />
<h3><code>game_io</code> module</h3>We’re almost ready to test our game logic with human players. To do this, we’ll first need to declare predicates to print the state of the game, and to allow the human player(s) to choose a move:<br />
<pre><code>:- import_module board, color, game, piece.
:- import_module io, list, maybe, string.
:- pred print_color(color::in, io::di, io::uo) is det.
:- pred print_piece(piece::in, io::di, io::uo) is det.
:- pred print_board(board::in, io::di, io::uo) is det.
:- pred print_game(game::in, io::di, io::uo) is det.
:- pred print_winner(maybe(color)::in, io::di, io::uo) is det.
:- pred choose_move(list(game)::in(non_empty_list), result(game)::out,
io::di, io::uo) is det.</code></pre>These predicates are all fairly straightforward, so I’ll elide most of them here to save space. Of note is <code>choose_move</code>, which presents each game state in the list in turn to the player, asking if the player wishes to choose that move. If the player chooses none, the sequence is repeated. Note that the mode of the list argument is given as <code>in(non_empty_list)</code> to ensure that an empty list of moves is not passed, which would result in an infinite loop. Here is the code, which is mostly implemented by <code>choose_move_aux</code>:<br />
<pre><code>choose_move(Games, Chosen, !IO) :- choose_move_aux(Games, Games, Chosen, !IO).
:- pred choose_move_aux(list(game)::in(non_empty_list), list(game)::in,
result(game)::out, io::di, io::uo) is det.
choose_move_aux(AllGames, [], Chosen, !IO) :-
write_string("Repeating...\n\n", !IO),
choose_move_aux(AllGames, AllGames, Chosen, !IO).
choose_move_aux(AllGames, [Game | Games], Chosen, !IO) :-
print_board(Game^board, !IO), nl(!IO),
write_string("Ok? ", !IO),
read_line_as_string(Res, !IO), nl(!IO),
(
Res = ok(String),
(prefix(String, "y") -> Chosen = ok(Game);
choose_move_aux(AllGames, Games, Chosen, !IO))
;
Res = eof, Chosen = eof
;
Res = error(Err), Chosen = error(Err)
).</code></pre><br />
<h3><code>checkers</code> module</h3>Finally we can wrap everything together with our main module, <code>checkers</code>.<br />
<br />
<h4>Players</h4>To make our code modular, let’s define a typeclass <code>player</code>:<br />
<pre><code>:- import_module game, io.
:- typeclass player(Player) where [
pred make_move(Player::in, game::in, game::out, io::di, io::uo)
is cc_multi
].</code></pre>This means that a <code>player</code> is any type <code><var>Player</var></code> which has defined for it a <code>make_move</code> predicate with the given signature. This allows us to implement various “players”, each with a different <code>make_move</code> predicate.<br />
<br />
By including the IO state and declaring the determinism of <code>make_move</code> as <code>cc_multi</code>, we allow both for players which must interact with IO (e.g., humans), and for players which may choose a move arbitrarily (e.g., search algorithms which find several optimal choices).<br />
<br />
Now we can implement various human and AI players which conform to this interface. Since we’re just testing for now, we’ll define a human player like so:<br />
<pre><code>:- import_module game_io, list, require, solutions.
:- type human_player ---> human_player.
:- instance player(human_player) where [
    (make_move(human_player, Game0, Game1, !IO) :-
        unsorted_solutions(
            (pred(Game::out) is nondet :- step(_, _, Game0, Game)),
            Moves),
        (
            Moves = [_ | _],
            choose_move(Moves, Res, !IO),
            (
                Res = ok(Game1)
            ;
                Res = eof, error("End of file")
            ;
                Res = error(Err), error(error_message(Err))
            )
        ;
            Moves = [],
            unexpected($module, $pred, "Game is not over but no more moves!")
        )
    )
].</code></pre>(The dummy type <code>human_player</code> is needed so we can declare it as an instance of the typeclass <code>player</code>.) <code>unsorted_solutions</code> returns an arbitrarily ordered list of all the possible moves from a given position. We assert that this list should not be empty (as the player should not have been asked to make a move when there are no moves available, as this means that the game is over!), and forward the rest of the work to our previously-defined <code>game_io.choose_move</code>.<br />
<br />
We can also define a dummy AI that chooses a move arbitrarily like so:<br />
<pre><code>:- type dummy_player ---> dummy_player.
:- instance player(dummy_player) where [
(make_move(dummy_player, Game0, Game1, !IO) :-
        unsorted_solutions(
            (pred(Game::out) is nondet :- step(_, _, Game0, Game)),
            Moves),
Game1 = det_head(Moves))
].</code></pre><br />
<h5>Why typeclasses?</h5>You may rightly ask, “why not simply declare a type <code>player ---> human_player; dummy_player</code> and forget the whole typeclass business?” By using typeclasses, we can provide new player implementations in other modules, rather than stuffing them all into one possibly large module.<br />
<br />
Of course, another way of doing this would be using higher-order predicates: we could declare players as a predicate type. The problem with this approach is that we may at some point determine that a player should have more predicates associated with it (such as one defining their name or difficulty), at which point we will need to pass around multiple predicates having messy types and instantiations. Using typeclasses allows us to redefine what a <code>player</code> is more easily.<br />
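For contrast, a higher-order sketch (hypothetical; not code from this project) might look like the following, and already shows the awkwardness once a second attribute is needed:<br />
<pre><code>    % A player as a bare higher-order predicate:
:- type player_pred == pred(game, game, io, io).
:- inst player_pred == (pred(in, out, di, uo) is cc_multi).

    % Wanting a name later forces a wrapper type, and every caller
    % must now carry the player_pred inst in its mode declarations:
:- type named_player ---> named_player(string, player_pred).</code></pre>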
<br />
<h4>The main loop</h4>Last but not least, we will define a loop <code>play_game</code>, which takes a game state and two values whose types are instances of the <code>player</code> typeclass to represent the two players. The implementation is straightforward:<br />
<pre><code>:- pred play_game(game::in, Red::in, White::in, io::di, io::uo) is cc_multi
<= (player(Red), player(White)).
play_game(Game0, Red, White, !IO) :-
print_game(Game0, !IO), nl(!IO),
(game_over(Game0, _) -> true;
(
Game0^turn = red,
make_move(Red, Game0, Game1, !IO)
;
Game0^turn = white,
make_move(White, Game0, Game1, !IO)
),
play_game(Game1, Red, White, !IO)
).</code></pre>In this predicate, <code>make_move</code> will call the <code>make_move</code> predicate associated with the type of <code><var>Red</var></code> or <code><var>White</var></code> as appropriate.<br />
<br />
Finally, the <code>main</code> predicate will start a game between two players (in this case, a <code>human_player</code> and a <code>dummy_player</code>):<br />
<pre><code>:- pred main(io::di, io::uo) is cc_multi.
main(!IO) :- play_game(init, human_player, dummy_player, !IO).</code></pre><br />
<h3>Trying it out</h3>Thanks to Mercury’s built-in build system, we can compile the whole shebang with simply <code>mmc --make checkers</code>. Running it lets us play a game versus the dummy AI:<br />
<pre><samp> o o o o
o o o o
o o o o
· · · ·
· · · ·
x x x x
x x x x
x x x x
X's move.</samp></pre>It’s our move first. Let’s cycle through the options:<br />
<pre><samp> o o o o
o o o o
o o o o
· · · ·
· · · x
x x x ·
x x x x
x x x x
Ok? <kbd>n</kbd>
o o o o
o o o o
o o o o
· · · ·
· · x ·
x x x ·
x x x x
x x x x
Ok? <kbd>n</kbd>
o o o o
o o o o
o o o o
· · · ·
· · x ·
x x · x
x x x x
x x x x
Ok? <kbd>n</kbd>
o o o o
o o o o
o o o o
· · · ·
· x · ·
x x · x
x x x x
x x x x
Ok? <kbd>n</kbd>
o o o o
o o o o
o o o o
· · · ·
· x · ·
x · x x
x x x x
x x x x
Ok? <kbd>n</kbd>
o o o o
o o o o
o o o o
· · · ·
x · · ·
x · x x
x x x x
x x x x
Ok? <kbd>n</kbd>
o o o o
o o o o
o o o o
· · · ·
x · · ·
· x x x
x x x x
x x x x
Ok? <kbd>n</kbd>
Repeating...</samp></pre>OK, we’ve seen all of them. Let’s choose the first:<br />
<pre><samp> o o o o
o o o o
o o o o
· · · ·
· · · x
x x x ·
x x x x
x x x x
Ok? <kbd>y</kbd>
o o o o
o o o o
o o o o
· · · ·
· · · x
x x x ·
x x x x
x x x x </samp></pre>Let’s make sure the computer picks a valid move.<br />
<pre><samp>O's move.
o o o o
o o o o
o o o ·
· · · o
· · · x
x x x ·
x x x x
x x x x </samp></pre>Hooray! Of course, now it’s our move again. This could go on for a while, but the test shows that everything works!<br />
<br />
<h3>Conclusion</h3>Today we covered building the game logic in the <code>moves</code> and <code>game</code> modules, and I/O and the main loop in the <code>game_io</code> and <code>checkers</code> modules. Next week we’ll finish up by adding an AI based on the minimax algorithm.<br />
<br />
<h2>Checkers AI, part 1</h2>2011-11-29<br />
<br />
Logic languages such as Mercury love search problems. Game artificial intelligence is a great application of search. AIs for many zero-sum two-player board games, such as checkers and chess, can be implemented as a <a rel="external" href="http://en.wikipedia.org/wiki/Minimax">minimax</a> search; a search which assumes that each player makes the move which maximally benefits himself. In today’s post, I’d like to walk through modeling <a rel="external" href="http://en.wikipedia.org/wiki/English_draughts">checkers</a> in Mercury.<br />
<br />
<h3>Architecture</h3>The overall architecture of our checkers game will consist of the following modules:<br />
<dl><dt><code>color</code></dt>
<dd>The two colors (players) in the game and their relation (opponents).</dd>
<dt><code>piece</code></dt>
<dd>The four kinds of pieces in the game and functions over them (namely, kinging).</dd>
<dt><code>position</code></dt>
<dd>A representation of the position of a piece, and predicates defining useful relationships between positions (namely, moving and jumping).</dd>
<dt><code>board</code></dt>
<dd>A representation of the checkers board, predicates about positions on it, predicates about pieces on it, and predicates to modify a board in useful ways (namely, moving, removing, and kinging pieces).</dd>
<dt><code>moves</code></dt>
<dd>Predicates to update a board via the patterns described in <code>position</code>.</dd>
<dt><code>game</code></dt>
<dd>An encoding of the state of a game, a predicate to modify it (by making a move according to the rules of checkers), and predicates to query the state of a game (namely, whether the game has ended, and the result if so).</dd>
<dt><code>util</code></dt>
<dd>Some utility functions for dealing with integers and lists; used primarily by <code>minimax</code>.</dd>
<dt><code>minimax</code></dt>
<dd>The minimax algorithm and related typeclasses.</dd>
<dt><code>heuristic</code></dt>
<dd>A declaration of the game as a minimax problem, with associated heuristics.</dd>
<dt><code>game_io</code></dt>
<dd>Predicates for printing various game structures in a human-readable format, and obtaining user input.</dd>
<dt><code>checkers</code></dt>
<dd>The main module, allowing configuration of the game and defining the overall UI.</dd> </dl>This architecture ensures that modules are loosely coupled and small in size (&lt; 120 LOC each, ~ 60 LOC average), making modifications and debugging easy. A similar overall architecture should prove useful for other game AI implementations.<br />
Let’s go into each module in detail. In today’s post, we’ll focus on <code>color</code>, <code>piece</code>, <code>position</code>, and <code>board</code>.<br />
<br />
<h3><code>color</code> module</h3>There are two colors in checkers, corresponding to the two players:<br />
<pre><code>:- type color ---> red; white.</code></pre>They are related as opponents:<br />
<pre><code>:- pred opponent(color, color).
:- mode opponent(in, out) is det.
:- mode opponent(out, in) is det.
opponent(red, white).
opponent(white, red).</code></pre>We can also define an <code>opponent</code> function for convenience:<br />
<pre><code>:- func opponent(color) = color.
opponent(Color) = Opponent :- opponent(Color, Opponent).</code></pre>That’s it! Short and to-the-point modules are good. We’ll see that most modules in this game are similarly sized.<br />
<br />
<h3><code>piece</code> module</h3>Pieces come in two colors, defined in <code>color</code>, and can be either men or kings:<br />
<pre><code>:- import_module color.
:- type kind ---> man; king.
:- type piece ---> piece(color::color, kind::kind).</code></pre>It is also useful to talk about “kinging” a piece; changing a man into a king:<br />
<pre><code>:- func king(piece) = piece is semidet.
king(piece(Color, man)) = piece(Color, king).</code></pre>This module is also short and sweet.<br />
<br />
<h3><code>position</code> module</h3>A position is simply an X, Y tuple:<br />
<pre><code>:- type position ---> position(int, int).</code></pre>It is useful to talk about pieces changing position by moving and jumping. Let’s declare two predicates, <code>move</code> and <code>jump</code>, which will define the relationship between the starting and ending locations of a piece which moves or jumps, and in the case of a jump, the position which was jumped over.<br />
<pre><code>:- pred move(piece, position, position).
:- pred jump(piece, position, position, position).</code></pre>It will be most useful to know whether a particular change in position describes a move or jump, and given a starting position, what ending positions would result from a move or jump. We will declare two modes, corresponding to these use cases, for each predicate:<br />
<pre><code>:- mode move(in, in, in) is semidet.
:- mode move(in, in, out) is multi.
:- mode jump(in, in, out, in) is semidet.
:- mode jump(in, in, out, out) is multi.</code></pre>Because Mercury provides bidirectional arithmetic (i.e. <code>+</code> and <code>-</code> can deduce the values of one operand if the result is known), we can define these predicates with only one clause each:<br />
<pre><code>:- import_module color.
:- import_module int.
move(Piece, position(X, Y), position(X + XD, Y + YD)) :-
(XD = 1; XD = -1),
(Piece = piece(red, man), YD = 1;
Piece = piece(white, man), YD = -1;
Piece = piece(_, king), YD = 1;
Piece = piece(_, king), YD = -1).
jump(Piece, position(X, Y), position(X + XD, Y + YD),
position(X + XD2, Y + YD2)) :-
(XD = 1, XD2 = 2; XD = -1, XD2 = -2),
(Piece = piece(red, man), YD = 1, YD2 = 2;
Piece = piece(white, man), YD = -1, YD2 = -2;
Piece = piece(_, king), YD = 1, YD2 = 2;
Piece = piece(_, king), YD = -1, YD2 = -2).</code></pre>Unfortunately any attempt to compact this code results in mode errors. This is because Mercury’s switch detection code is not very robust in the face of complicated switches, and because Mercury seems not to reorder conjunctions to obtain mode-correctness.<br />
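The bidirectional arithmetic used above can be seen in isolation in this hypothetical snippet (not part of the game’s modules); the reverse modes of <code>int.+</code> let Mercury solve for an operand:<br />
<pre><code>:- import_module int.
    % Deduce the column a piece came from, given where it landed
    % after moving one column to the right.  The reverse mode of
    % int.+ solves FromX + 1 = ToX for FromX.
:- pred from_column(int::in, int::out) is det.
from_column(ToX, FromX) :- FromX + 1 = ToX.</code></pre>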
<br />
<h3><code>board</code> module</h3>First we define the type of a board, which should contain its width, height, a mapping from position to piece, and (for efficiency) a count (using the handy <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_library/bag.html"><code>bag</code></a> module) of how many of each type of piece are on the board:<br />
<pre><code>:- import_module color, piece, position.
:- import_module bag, int, map.
:- type board --->
board(
width::int,
height::int,
positions::map(position, piece),
counts::bag(piece)
).</code></pre>Of course we could fix the width and height to 8, but it is easy and interesting to allow for other size boards. Since Mercury <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/Field-selection.html">automatically generates</a> “getter” functions for <code>width</code> and <code>height</code>, we need only declare them in our interface, even if we choose to hide the type of <code>board</code>:<br />
<pre><code>:- func width(board) = int.
:- func height(board) = int.</code></pre>Another trivial but important state accessor is <code>count_pieces</code>, which counts the number of a given kind of piece on the board:<br />
<pre><code>:- func count_pieces(board, piece) = int.
count_pieces(Board, Piece) = count_value(Board^counts, Piece).</code></pre>This function is implemented using the bag we added to the <code>board</code> type so that it may be quickly executed by our heuristic function (defined later).<br />
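As a preview of how this will be used (a hypothetical sketch; the real heuristic module appears in a later post, and the weights here are illustrative only), a material-count evaluation could be built directly on <code>count_pieces</code>:<br />
<pre><code>:- import_module color, piece.
:- import_module int.
    % Naive material balance for Color: men count 1, kings count 3.
:- func material(board, color) = int.
material(Board, Color) =
    count_pieces(Board, piece(Color, man))
    + 3 * count_pieces(Board, piece(Color, king))
    - count_pieces(Board, piece(opponent(Color), man))
    - 3 * count_pieces(Board, piece(opponent(Color), king)).</code></pre>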
<br />
<h4>Initializing the board</h4>Next we define how to initialize a board of a given size:<br />
<pre><code>:- func init(int, int) = board.
init(Width, Height) = !:Board :-
!:Board = board(Width, Height, init, init),
fold_up(init_row(piece(red, man)), 0, Height / 2 - 2, !Board),
fold_up(init_row(piece(white, man)), (Height + 1) / 2 + 1, Height - 1, !Board).</code></pre>The <code>int.fold_up</code> predicate used here is Mercury’s idea of a for loop. <code>init_row</code> uses <code>solutions.aggregate</code> and <code>legal_position</code> (defined later) to iterate over the valid positions in a row:<br />
<pre><code>init_row(Piece, Y, !Board) :-
aggregate(
(pred(Pos::out) is nondet :-
legal_position(!.Board, position(_, Y) @ Pos)),
(pred(Pos::in, !.Board0::in, !:Board0::out) is det :-
!Board0^positions :=
det_insert(!.Board0^positions, Pos, Piece),
!Board0^counts :=
insert(!.Board0^counts, Piece)),
!Board).</code></pre>Note that we call <code>legal_position</code> with a partially instantiated position, <code>position(_, Y)</code>, which is also bound to <code><var>Pos</var></code>. This means that we need a mode of <code>legal_position</code> which can fill in valid X coordinates of a given row.<br />
<br />
<h4>Legal positions on the board</h4>Here is a suitable declaration of <code>legal_position</code>:<br />
<pre><code>:- pred legal_position(board, position).
:- mode legal_position(in, in) is semidet.
:- mode legal_position(in, out) is nondet.
:- mode legal_position(in, bound(position(ground, free)) >> ground) is nondet.
:- mode legal_position(in, bound(position(free, ground)) >> ground) is nondet.</code></pre>All those modes! Besides the modes where the <code>position</code> argument is <code>in</code> and <code>out</code> (corresponding to checking a legal position and generating all legal positions), the other two modes work with partially instantiated positions, where we wish to know the legal positions in a given column or row (respectively). Thankfully, all four modes can be implemented with one body:<br />
<pre><code>legal_position(Board, position(X, Y)) :-
int_in_range(0, Board^height - 1, Y),
int_in_range(0, Board^width - 1, X),
(even(X), even(Y); odd(X), odd(Y)).</code></pre>Unfortunately the predicate <code>nondet_int_in_range</code> in Mercury’s <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-latest/mercury_library/int.html"><code>int</code></a> module does not provide a <code>semidet</code> mode and is thus unsuitable for bidirectional programming. Instead, we declare a similar predicate, <code>int_in_range</code>, in our <code>util</code> module:<br />
<pre><code>:- pred int_in_range(int, int, int).
:- mode int_in_range(in, in, in) is semidet.
:- mode int_in_range(in, in, out) is nondet.</code></pre>To define this predicate, we can make use of Mercury’s support for defining <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/Different-clauses-for-different-modes.html">different clauses for different modes</a>:<br />
<pre><code>:- pragma promise_equivalent_clauses(int_in_range/3).
int_in_range(Lo::in, Hi::in, X::out) :- nondet_int_in_range(Lo, Hi, X).
int_in_range(Lo::in, Hi::in, X::in) :- Lo =< X, X =< Hi.</code></pre>Because we know that these two clauses are logically equivalent, we promise such to Mercury using <code>promise_equivalent_clauses</code>.<br />
<br />
<h4>Locating pieces</h4>To query which pieces are at which positions, we declare <code>piece_at</code>. <code>piece_at</code> has two modes, one for checking which piece (if any) is at a given position, and one for iterating over all pieces currently on the board:<br />
<pre><code>:- pred piece_at(board, position, piece).
:- mode piece_at(in, in, out) is semidet.
:- mode piece_at(in, out, out) is nondet.</code></pre>Our implementation of <code>piece_at</code> should be a thin layer over the <code>positions</code> map, but Mercury’s <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_library/map.html"><code>map</code></a> module does not provide one predicate which serves both purposes. Again, we can remedy this by using different clauses for each mode:<br />
<pre><code>:- pragma promise_equivalent_clauses(piece_at/3).
piece_at(Board::in, Position::in, Piece::out) :-
search(Board^positions, Position, Piece).
piece_at(Board::in, Position::out, Piece::out) :-
member(Board^positions, Position, Piece).</code></pre>We can also define <code>no_piece_at</code>, whose signature mirrors that of <code>piece_at</code> minus the <code>piece</code> argument, as:<br />
<pre><code>no_piece_at(Board, Position) :-
legal_position(Board, Position),
not piece_at(Board, Position, _).</code></pre>Note that <code>no_piece_at(…)</code> is not simply <code>not piece_at(…)</code>! By additionally requiring that <code><var>Position</var></code> be a legal position, we restrict the domain of <code>no_piece_at</code> and thus allow it to output the finite number of positions where no piece exists (as opposed to the infinite number of positions which either have no piece or are not legal).<br />
<br />
<h4>Modifying the board</h4>There are three predicates in <code>board</code> to modify the board: <code>move_piece</code>, <code>remove_piece</code>, and <code>king_piece</code>. <code>move_piece</code> and <code>remove_piece</code> have modes which allow both for modifying given locations and for generating all possible such modifications of the board. <code>king_piece</code> has only one mode to modify a given location. All three have similar code utilizing <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/State-variables.html">state variable</a> <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/Record-syntax.html">field update syntax</a>. Here is the implementation of <code>king_piece</code> as an example:<br />
<pre><code>king_piece(Pos, !Board) :-
piece_at(!.Board, Pos, Piece),
King = king(Piece),
!Board^positions := update(!.Board^positions, Pos, King),
!Board^counts := det_remove(!.Board^counts, Piece),
!Board^counts := insert(!.Board^counts, King).</code></pre>It is my opinion that Mercury would benefit from an extension to state variable update syntax which allows state variable record syntax for data terms. Such a syntax would, in concert with the standard library argument order modifications introduced in Mercury 11.07, allow the above predicate to be written more concisely as:<br />
<pre><code>king_piece(Pos, !Board) :-
piece_at(!.Board, Pos, Piece),
King = king(Piece),
update(Pos, King, !Board^positions),
det_remove(Piece, !Board^counts),
insert(King, !Board^counts).</code></pre>(Of course, the update functions allowed to be used with such a syntax must be restricted to those which commute with each other.)<br />
<br />
<h3>Review</h3>That’s all for now; this post is long enough! Today we saw the overall structure of our checkers AI – eleven small modules – and the implementations of <code>color</code>, <code>piece</code>, <code>position</code>, and <code>board</code>, the modules defining the “parts” of the game. Next week we’ll see the implementations of the “rules” and “brains” of the game.<br />
<br />
<h2>Making the grade</h2>2011-11-09<br />
<br />
I know it’s been a while since my last post, but I have something fantastic in the works, I promise. Today though I’d like to talk about compiler grades.<br />
<br />
<h3>Common grades</h3>Compiler <defn>grades</defn> correspond roughly to combinations of different backends (e.g. C or Java) and code models (e.g. parallel or debugging). Those of you new to the language may be confounded by the immense number of these grades, and when initially compiling Mercury it can be difficult to decide which grades to enable. Here’s a list of the ones I use most. In parentheses next to the grade names are the options needed to select the grade, assuming that asm_fast.gc is the default (it usually is).<br />
<dl><dt>asm_fast.gc (default)</dt>
<dd>This is my default go-to grade. It compiles to low-level C code using the Boehm garbage collector and supports concurrency (though not parallelism). It is therefore very fast, can interface seamlessly with C libraries, and can run code that uses threads.</dd>
<dt>asm_fast.par.gc.stseg (<code>--parallel --stack-segments</code>)</dt>
<dd>This grade adds parallelism and stack segmentation to the asm_fast.gc grade. Parallelism allows programs to make use of multiple cores (though note that this isn’t necessary if all you need is concurrent execution). Stack segmentation allows pieces of the stack to be allocated on the heap. Besides allowing programs to grow beyond a fixed stack size, this option (only available for the asm_fast grades) <em>greatly</em> reduces the amount of memory allocated by thread spawns, which can otherwise measure in the gigabytes. Both these features come at a small runtime cost.</dd>
<dt>hlc.gc (<code>-H</code>)</dt>
<dd>This is the “high-level C” grade. It compiles to C code which is much closer to what a programmer might write than that generated by the asm_fast grade. Curiously this grade is sometimes faster than the low-level grade (likely due to an increased number of optimization opportunities available to GCC), so if you need speed, it is worth testing your application in this grade as well. Unlike asm_fast however, this grade does not support concurrency.</dd>
<dt>hlc.par.gc (<code>-H --parallel</code>)</dt>
<dd>This grade adds to hlc.gc support for concurrency and parallelism via native system threads. If your application requires a 1:1 mapping of Mercury threads to system threads, then this grade is a must.</dd>
<dt>java (<code>--java</code>)</dt>
<dd>As the name implies, this grade compiles to Java code. It is the only option if you need to interface with Java libraries. Due to the JVM’s speed the code it generates is often as fast as its C counterparts. This grade supports concurrency and parallelism via Java’s built-in threading mechanisms (<code>--parallel</code> is not needed). (Note that unlike in the C grades, a program compiled under the java grade will not wait for all spawned threads to terminate before exiting.) Unfortunately some standard library procedures are not yet implemented.</dd>
<dt>asm_fast.gc.prof (<code>-p</code>); hlc.gc.prof (<code>-H -p</code>)</dt>
<dd>These grades allow you to profile your code using <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_user_guide/Profiling.html">mprof</a>, of course at the expense of execution speed. They are handy to have around.</dd>
<dt>asm_fast.gc.debug (<code>--debug</code>)</dt>
<dd>This grade allows you to debug your code using <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_user_guide/Debugging.html">mdb</a>. Note that if you’re not debugging the standard library, then instead of compiling this grade, you can pass the <code>--trace deep</code> option to the compiler to enable debugging of just your modules.</dd> </dl><br />
<h3>Compiling extra grades</h3>We all know that the Mercury build process takes <em>forever</em>. This is because it must compile the standard library in each of the required grades. Fortunately there’s a shortcut around this.<br />
<br />
The first time you install the system, instead of <code>make install</code>, run <code>make install LIBGRADES=asm_fast.gc</code>. This will install the compiler, but only the default library grade. You can pass <code>LIBGRADES=</code> if you wish instead to install no library grades.<br />
<br />
Later, when you wish to install additional grades, run <code>make install_grades LIBGRADES='asm_fast.par.gc hlc.gc'</code> (for example). This will compile only the given library grades. To complete the installation, you should edit /path/to/mercury/lib/mercury/conf/Mercury.config to add the appropriate <code>--libgrade</code> flags to the end of <code>DEFAULT_MCFLAGS</code> (near the end of the file), and /path/to/mercury/lib/mercury/mmake/Mmake.vars to add the grade names to the definition of <code>LIBGRADES</code>. (These steps are only necessary if you plan to build and install libraries.)<br />
<br />
Don’t forget also that you can speed up compilation by passing <code>PARALLEL=-j5</code> (where 5 is one plus the number of parallel cores in your system) to <code>make</code> (or simply <code>-j5</code> to <code>mmake</code>).<br />
<br />
<h3>Working with multiple grades</h3>Finally, when testing with multiple grades, you may wish to use the <code>--use-grade-subdirs</code> option in conjunction with <code>mmc --make</code>. This will cause <code>mmc --make</code> to use a different subdirectory for each grade in which you build your project, avoiding errors caused by linking incompatible grades and saving you from rebuilding the entire project each time you switch grades.Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com0tag:blogger.com,1999:blog-1560635160713980237.post-43994403339371298972011-10-17T21:56:00.000-07:002011-12-19T06:18:20.069-08:00Parsing Expression Grammars part 3 (final)So, <a href="http://adventuresinmercury.blogspot.com/2011/10/parsing-expression-grammars-part-2.html">last week</a>, we built a C99 parser. A <em>slow</em> C99 parser. At the end we added one line of code that sped things up a bunch. Let’s look at that.<br />
<br />
<h3>Why recursive descent parsing is slow</h3>Right now our PEG parser is what’s known as a <a rel="external" href="http://en.wikipedia.org/wiki/Recursive_descent_parser">recursive descent parser</a> – it recursively descends the parse tree looking for matches. Unfortunately, recursive descent is very slow for two reasons:<br />
<ol><li>One grammar rule may try to parse a given span of input the same way multiple times.</li>
<li>Multiple grammar rules may try to parse a given span of input the same way multiple times.</li>
</ol>An example of the former is in the <code>selection_statement</code> production (<code>Typedefs</code> omitted for clarity):<br />
<pre><code>selection_statement -->
(kw("if"), p("("), expression, p(")"), statement, kw("else"), statement) -> [];
(kw("if"), p("("), expression, p(")"), statement) -> [];
(kw("switch"), p("("), expression, p(")"), statement).</code></pre>The string “<code>if (x > 0) x--; return;</code>” will be parsed once by the first branch, which fails when it finds “<code>return</code>” instead of “<code>else</code>”, and will then be <em>reparsed</em> by the second branch, which will succeed. This sort of prefix-repetition can be eliminated by factoring out the repeated portion:<br />
<pre><code>selection_statement -->
(kw("if"), p("("), expression, p(")"), statement,
((kw("else"), statement) -> []; [])) -> [];
(kw("switch"), p("("), expression, p(")"), statement).</code></pre>Of course, this means that our code is no longer structurally identical to the grammar we are parsing, but this may be a welcome tradeoff.<br />
<br />
An example of the latter, which is more pernicious, is given by, well, <em>almost every production</em>. Why? Consider <code>statement</code>:<br />
<pre><code>statement -->
labeled_statement -> [];
compound_statement -> [];
expression_statement -> [];
selection_statement -> [];
iteration_statement -> [];
jump_statement.</code></pre>Pretty innocuous, right? Take a look at each of the productions it invokes. Almost all of them start with a call to <code>kw</code>, which in turn calls <code>token</code>. In fact, <em>all</em> of them start with a call, directly or indirectly, to <code>token</code>. In fact, this is true of <em>every production in the parser</em>. <code>token</code> is not the only culprit but it is the most pervasive.<br />
<br />
This problem can’t be solved through a simple restructuring of the grammar. For this we need a more powerful tool: memoization.<br />
<br />
<h3>Memoization</h3><a rel="external" href="http://en.wikipedia.org/wiki/Memoization">Memoization</a> is one of my favorite features. Simply put, <defn>memoization</defn> of a procedure means maintaining a table of its past input and output values, and looking up output values in this table when the procedure is called in the future with the same input values. Of course, this only works if the procedure has no desirable side effects; fortunately this is true of most Mercury code. Mercury provides built-in <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/Tabled-evaluation.html">support</a> for memoization via the <code>memo</code> pragma. Basic usage is as follows:<br />
<pre><code>:- pragma memo(<var>Name</var>/<var>Arity</var>).</code></pre>By default, Mercury memoizes inputs by value. This will catch all repeated calls to a procedure, but can be slow for large structures. We can specify that we would like Mercury to memoize inputs by address instead:<br />
<pre><code>:- pragma memo(<var>Name</var>/<var>Arity</var>, [fast_loose]).</code></pre>Memoizing by address is much faster but may miss calls with identical values at different addresses. However, because our primary data structure is a list of characters which is very large (meaning memoize-by-value will be slow) and which we are only destructing (meaning identical sublists of this list must be physically the same), we will memoize by address.<br />
<br />
<h3>Profiling</h3>Now, although we could memoize every procedure in our lexer and parser, memoization <em>does</em> have overhead, and so we should not use it on procedures which are not called often or with repeated inputs. We should limit our scope to grammar rules which are invoked by many different rules, or multiple times by one rule, since these are likely to be invoking the rule repeatedly on the same input.<br />
<br />
To find such candidates for memoization, we can use runtime code profiling. To generate profiling data, we can simply compile with the <code>-p</code> switch and then run our program with our test from last week (minus the memoization of <code>token</code> we added at the end):<br />
<pre>$ <kbd>mmc -p --infer-all --make c_parser_direct</kbd>
$ <kbd>echo 'int main(void) {}' | ./c_parser_direct</kbd></pre>Let that churn for a few seconds, and then kill it with <kbd>Ctrl+C</kbd>. Now run <code>mprof</code> to find out what’s been taking so long:<br />
<div style="overflow:scroll;overflow-y:hidden"><pre style="width:160em"><samp> % cumulative self self total
time seconds seconds calls ms/call ms/call name
68.5 8.70 8.70 17602295 0.00 0.00 <function `string.to_char_list/1' mode 0> [1]
7.1 9.60 0.90 17602483 0.00 0.00 <predicate `builtin.unify/2' mode 0> [2]
6.3 10.40 0.80 202300 0.00 0.00 <predicate `c_parser_aim.lexer.punctuator_rec/4' mode 0> [3]
5.9 11.15 0.75 202388 0.00 0.00 <predicate `c_parser_aim.lexer.keyword_rec/4' mode 0> [4]
4.7 11.75 0.60 10114349 0.00 0.00 <predicate `c_parser_aim.lexer.punctuator_aux/3' mode 0> [5]
4.3 12.30 0.55 7487945 0.00 0.00 <predicate `c_parser_aim.lexer.keyword_aux/3' mode 0> [6]</samp></pre></div>Over 95% of that run was spent just checking for keywords and punctuators! In particular, the biggest culprit, <code>string.to_char_list</code>, is highly suspect. To find out who’s calling it, we can run <code>mprof -c</code> [note: sometimes I have had to run <code>mprof -cd</code>; I can’t reproduce this now so I’m not sure why] and search for the callers of <code>string.to_char_list</code>:<br />
<div style="overflow:scroll;overflow-y:hidden"><pre style="width:160em"><samp> 3.70 3.70 7487945/17602295 <predicate `c_parser_aim.lexer.keyword_rec/4' mode 0> [24]
5.00 5.00 10114350/17602295 <predicate `c_parser_aim.lexer.punctuator_rec/4' mode 0> [22]
[21] 68.5 8.70 8.70 17602295 <function `string.to_char_list/1' mode 0> [21]</samp></pre></div>The <samp>[21]</samp> in the left-most column indicates that this section is dedicated to procedure #21, <code>string.to_char_list</code>, which was called 17,602,295 times. The two lines above this line indicate that both <code>lexer.keyword_rec</code> and <code>lexer.punctuator_rec</code> call <code>string.to_char_list</code>, 7,487,945 and 10,114,350 times respectively. Of course; we use <code>string.to_char_list</code> to convert our keywords and punctuators to character lists for these two procedures to match against the input data. But clearly we don't have 7,487,945 keywords or 10,114,350 punctuators, so <code>string.to_char_list</code> must be getting called with the same arguments repeatedly. Let’s fix that:<br />
<pre><code>:- func to_char_list_memo(string) = list(char).
:- pragma memo(to_char_list_memo/1, [fast_loose]).
to_char_list_memo(S) = to_char_list(S).</code></pre>Unfortunately we can’t instruct Mercury to memoize procedures in other modules, so we must define <code>to_char_list_memo</code> as above. We can then replace the two calls to <code>to_char_list</code> in <code>lexer.keyword_rec</code> and <code>lexer.punctuator_rec</code> with this memoized version. Compiling and reprofiling tells us:<br />
<div style="overflow:scroll;overflow-y:hidden"><pre style="width:160em"><samp> % cumulative self self total
time seconds seconds calls ms/call ms/call name
45.2 1.65 1.65 10400248 0.00 0.00 <function `c_parser_aim.to_char_list_memo/1' mode 0> [22]
12.3 2.10 0.45 10400437 0.00 0.00 <predicate `builtin.unify/2' mode 0> [29]
12.3 2.55 0.45 5975244 0.00 0.00 <predicate `c_parser_aim.lexer.punctuator_aux/3' mode 0> [27]
12.3 3.00 0.45 119606 0.00 0.01 <predicate `c_parser_aim.lexer.keyword_rec/4' mode 0> [24]
8.2 3.30 0.30 4425004 0.00 0.00 <predicate `c_parser_aim.lexer.keyword_aux/3' mode 0> [28]
4.1 3.45 0.15 119517 0.00 0.02 <predicate `c_parser_aim.lexer.punctuator_rec/4' mode 0> [21]</samp></pre></div><br />
<h3>Tracking down <code>token</code></h3>We’ve reduced the percentage of total runtime consumed by <code>to_char_list</code>, but it’s still getting called an awful lot. We know <code>keyword_rec</code> and <code>punctuator_rec</code> don’t call it any more often than they need to, so we need to find out who’s calling <em>them</em>. Let’s follow <code>keyword_rec</code>:<br />
<div style="overflow:scroll;overflow-y:hidden"><pre style="width:160em"><samp> 0.45 1.64 119606/119606 <predicate `c_parser_aim.lexer.keyword/3' mode 0> [23]
[24] 45.0 0.45 1.64 119606 <predicate `c_parser_aim.lexer.keyword_rec/4' mode 0> [24]
0.70 0.70 4425004/10400248 <function `c_parser_aim.to_char_list_memo/1' mode 0> [22]
0.30 0.49 4425004/4425004 <predicate `c_parser_aim.lexer.keyword_aux/3' mode 0> [28]</samp></pre></div><code>keyword</code> is the only caller, no surprise there. Let’s go up one more level:<br />
<div style="overflow:scroll;overflow-y:hidden"><pre style="width:160em"><samp> 0.00 1.64 119606/119606 <predicate `c_parser_aim.lexer.token/3' mode 0> [1]
[23] 45.0 0.00 1.64 119606 <predicate `c_parser_aim.lexer.keyword/3' mode 0> [23]
0.00 0.00 57/239246 <predicate `c_parser_aim.lexer.identifier_nondigit/3' mode 0> [67]
0.45 1.64 119606/119606 <predicate `c_parser_aim.lexer.keyword_rec/4' mode 0> [24]
0.00 0.00 57/836707 <predicate `char.is_digit/1' mode 0> [59]</samp></pre></div>Again, no surprise – <code>keyword</code> is called only by <code>token</code>.<br />
<div style="overflow:scroll;overflow-y:hidden"><pre style="width:160em"><samp> 0.00 0.21 7009/119606 <predicate `c_parser_aim.cast_expression/3' mode 0> [8]
0.00 0.00 2/119606 <predicate `c_parser_aim.compound_statement/3' mode 0> <cycle 3> [55]
0.00 0.00 117/119606 <predicate `c_parser_aim.declaration_specifiers/4' mode 0> <cycle 4> [35]
0.00 0.00 4/119606 <predicate `c_parser_aim.direct_abstract_declarator/3' mode 0> [52]
0.00 0.00 9/119606 <predicate `c_parser_aim.direct_declarator/4' mode 0> <cycle 2> [50]
0.00 0.00 4/119606 <predicate `c_parser_aim.direct_declarator_rec/3' mode 0> <cycle 2> [47]
0.00 0.00 36/119606 <predicate `c_parser_aim.enum_specifier/3' mode 0> [42]
0.01 0.43 14042/119606 <predicate `c_parser_aim.kw/3' mode 0> [30]
0.00 0.00 3/119606 <predicate `c_parser_aim.labeled_statement/3' mode 0> [53]
0.00 0.00 8/119606 <predicate `c_parser_aim.p/3' mode 0> [51]
0.00 0.00 2/119606 <predicate `c_parser_aim.parameter_list/3' mode 0> <cycle 2> [54]
0.00 0.00 1/119606 <predicate `c_parser_aim.parameter_type_list/3' mode 0> <cycle 2> [56]
0.00 0.00 26/119606 <predicate `c_parser_aim.pointer/2' mode 0> [45]
0.01 0.43 14019/119606 <predicate `c_parser_aim.postfix_expression/3' mode 0> [25]
0.01 0.86 28036/119606 <predicate `c_parser_aim.primary_expression/3' mode 0> [26]
0.00 0.00 48/119606 <predicate `c_parser_aim.struct_or_union/2' mode 0> [41]
0.00 0.00 36/119606 <predicate `c_parser_aim.type_qualifier/2' mode 0> [43]
0.00 0.00 120/119606 <predicate `c_parser_aim.type_specifier/3' mode 0> [37]
0.00 0.00 12/119606 <predicate `c_parser_aim.typedef_name/3' mode 0> [48]
0.02 1.71 56072/119606 <predicate `c_parser_aim.unary_expression/3' mode 0> [20]
[1] 100.0 0.05 3.65 119606 <predicate `c_parser_aim.lexer.token/3' mode 0> [1]
0.00 0.00 119517/119517 <function `c_parser_aim.lexer.punctuator_list/0' mode 0> [66]
0.00 0.00 31/31 <function `string.from_char_list/1' mode 0> [60]
0.00 0.10 119517/119517 <predicate `c_parser_aim.lexer.constant/2' mode 0> [32]
0.00 0.00 119548/239246 <predicate `c_parser_aim.lexer.identifier_nondigit/3' mode 0> [67]
0.00 0.00 31/31 <predicate `c_parser_aim.lexer.identifier_rec/3' mode 0> [71]
0.00 1.64 119606/119606 <predicate `c_parser_aim.lexer.keyword/3' mode 0> [23]
0.15 1.81 119517/119517 <predicate `c_parser_aim.lexer.punctuator_rec/4' mode 0> [21]
0.00 0.05 119606/119606 <predicate `c_parser_aim.lexer.whitespace/2' mode 0> [34]</samp></pre></div>And there’s the culprit. <code>token</code> is called by <em>many</em> other procedures, and as discussed above, almost certainly with repeated inputs. Let’s memoize <code>token</code>.<br />
<pre><code>:- pragma memo(token/3, [fast_loose]).</code></pre>Now (as we found out last week), parsing our simple test completes in a reasonable amount of time. Let’s give our parser something bigger to chew on.<br />
<br />
<h3>Factoring out multiple calls from <code>multiplicative_expression</code></h3>For stress testing, I used a 6kb program from Glen McCluskey’s <a rel="external" href="http://www.glenmccl.com/c9suite.htm">C99 test suite</a>, <code>tdes_047.c</code>. (It is available in the set of free samples if you wish to try it yourself.) We can use this (or any C99 code) as follows:<br />
<pre>$ <kbd>cpp -P c9_samp/tdes_047.c | ./c_parser_direct</kbd></pre>And of course, we are met with no end to the parsing. <code>mprof</code> still tells us that <code>token</code> is responsible for the most CPU usage, but since we’ve already memoized it, let’s work our way up the chain some more. Who is responsible for most of the calls to <code>token</code>?<br />
<div style="overflow:scroll;overflow-y:hidden"><pre style="width:160em"><samp> 0.23 0.23 1358257/95678933 <predicate `c_parser_aim.additive_expression/3' mode 0> <cycle 2> [19]
0.00 0.00 7546/95678933 <predicate `c_parser_aim.and_expression/3' mode 0> <cycle 2> [24]
0.00 0.00 2370/95678933 <predicate `c_parser_aim.assignment_operator/2' mode 0> [29]
1.36 1.36 8218754/95678933 <predicate `c_parser_aim.cast_expression/3' mode 0> <cycle 2> [12]
0.00 0.00 7/95678933 <predicate `c_parser_aim.compound_statement/3' mode 0> [14]
0.00 0.00 236/95678933 <predicate `c_parser_aim.conditional_expression/3' mode 0> <cycle 2> [50]
0.00 0.00 14/95678933 <predicate `c_parser_aim.declaration/4' mode 0> [5]
0.00 0.00 457/95678933 <predicate `c_parser_aim.declaration_specifiers/4' mode 0> <cycle 2> [43]
0.00 0.00 392/95678933 <predicate `c_parser_aim.designator/3' mode 0> [48]
0.00 0.00 95/95678933 <predicate `c_parser_aim.direct_abstract_declarator/3' mode 0> [53]
0.00 0.00 368/95678933 <predicate `c_parser_aim.direct_declarator/4' mode 0> <cycle 2> [49]
0.00 0.00 890/95678933 <predicate `c_parser_aim.direct_declarator_rec/3' mode 0> <cycle 2> [27]
0.00 0.00 999/95678933 <predicate `c_parser_aim.enum_specifier/3' mode 0> [38]
0.01 0.01 30183/95678933 <predicate `c_parser_aim.equality_expression/3' mode 0> <cycle 2> [23]
0.00 0.00 3773/95678933 <predicate `c_parser_aim.exclusive_or_expression/3' mode 0> <cycle 2> [26]
0.00 0.00 2/95678933 <predicate `c_parser_aim.identifier_list/2' mode 0> [57]
0.00 0.00 1886/95678933 <predicate `c_parser_aim.inclusive_or_expression/3' mode 0> <cycle 2> [31]
0.00 0.00 17/95678933 <predicate `c_parser_aim.init_declarator/4' mode 0> [8]
0.00 0.00 8/95678933 <predicate `c_parser_aim.init_declarator_list/4' mode 0> [7]
0.00 0.00 2/95678933 <predicate `c_parser_aim.initializer/3' mode 0> <cycle 2> [56]
0.00 0.00 48/95678933 <predicate `c_parser_aim.initializer_list/3' mode 0> <cycle 2> [45]
0.02 0.02 139104/95678933 <predicate `c_parser_aim.kw/3' mode 0> [22]
0.00 0.00 943/95678933 <predicate `c_parser_aim.logical_and_expression/3' mode 0> <cycle 2> [39]
0.00 0.00 472/95678933 <predicate `c_parser_aim.logical_or_expression/3' mode 0> <cycle 2> [44]
1.01 1.01 6112154/95678933 <predicate `c_parser_aim.multiplicative_expression/3' mode 0> <cycle 2> [17]
0.00 0.00 1474/95678933 <predicate `c_parser_aim.p/3' mode 0> [34]
0.00 0.00 1172/95678933 <predicate `c_parser_aim.pointer/2' mode 0> [32]
2.72 2.72 16437987/95678933 <predicate `c_parser_aim.postfix_expression/3' mode 0> <cycle 2> [4]
8.10 8.10 48898650/95678933 <predicate `c_parser_aim.postfix_expression_rec/3' mode 0> [9]
2.20 2.20 13264944/95678933 <predicate `c_parser_aim.primary_expression/3' mode 0> [13]
0.03 0.03 181101/95678933 <predicate `c_parser_aim.relational_expression/3' mode 0> <cycle 2> [21]
0.08 0.08 452752/95678933 <predicate `c_parser_aim.shift_expression/3' mode 0> <cycle 2> [20]
0.00 0.00 68/95678933 <predicate `c_parser_aim.struct_declaration/3' mode 0> <cycle 2> [54]
0.00 0.00 136/95678933 <predicate `c_parser_aim.struct_declarator/3' mode 0> <cycle 2> [52]
0.00 0.00 68/95678933 <predicate `c_parser_aim.struct_declarator_list/3' mode 0> <cycle 2> [55]
0.00 0.00 1404/95678933 <predicate `c_parser_aim.struct_or_union/2' mode 0> [36]
0.00 0.00 102/95678933 <predicate `c_parser_aim.struct_or_union_specifier/3' mode 0> <cycle 2> [33]
0.00 0.00 2334/95678933 <predicate `c_parser_aim.type_qualifier/2' mode 0> [30]
0.00 0.00 4079/95678933 <predicate `c_parser_aim.type_specifier/3' mode 0> <cycle 2> [25]
0.00 0.00 333/95678933 <predicate `c_parser_aim.typedef_name/3' mode 0> [51]
0.09 0.09 553352/95678933 <predicate `c_parser_aim.unary_expression/3' mode 0> <cycle 2> [18]
[6] 52.1 15.85 15.85 95678933 <predicate `c_parser_aim.lexer.token/3' mode 0> [6]
0.00 0.00 96/96 <function `c_parser_aim.lexer.punctuator_list/0' mode 0> [81]
0.00 0.00 58/58 <function `string.from_char_list/1' mode 0> [64]
0.00 0.00 108/108 <predicate `c_parser_aim.lexer.constant/2' mode 0> [88]
0.00 0.00 166/412 <predicate `c_parser_aim.lexer.identifier_nondigit/3' mode 0> [86]
0.00 0.00 58/58 <predicate `c_parser_aim.lexer.identifier_rec/3' mode 0> [97]
0.00 0.00 225/225 <predicate `c_parser_aim.lexer.keyword/3' mode 0> [83]
0.00 0.00 96/96 <predicate `c_parser_aim.lexer.punctuator_rec/4' mode 0> [95]
0.00 0.00 225/225 <predicate `c_parser_aim.lexer.whitespace/2' mode 0> [82]</samp></pre></div>It seems like <code>postfix_expression_rec</code> is the winner; it is responsible for over half the calls to <code>token</code>. Tracing it back similarly to how we traced <code>keyword_rec</code>, we find that it is called solely by <code>postfix_expression</code> 8,149,775 times, which is called solely by <code>unary_expression</code> a similar number of times, which is called almost exclusively by <code>cast_expression</code> a similar number of times, which is called almost exclusively by <code>multiplicative_expression</code>, which is called… only 2,054,666 times:<br />
<div style="overflow:scroll;overflow-y:hidden"><pre style="width:160em"><samp> 2054666 <predicate `c_parser_aim.additive_expression/3' mode 0> <cycle 2> [19]
[17] 7.2 0.70 2.18 2054666 <predicate `c_parser_aim.multiplicative_expression/3' mode 0> <cycle 2> [17]
1.01 1.01 6112154/95678933 <predicate `c_parser_aim.lexer.token/3' mode 0> [6]
0.47 0.47 6112154/87459316 <unification predicate for type `c_parser_aim.lexer.token/0' mode 0> [11]
8218660 <predicate `c_parser_aim.cast_expression/3' mode 0> <cycle 2> [12]</samp></pre></div>In other words, <code>multiplicative_expression</code> is responsible for multiplying the number of calls passing through it to <code>token</code> fourfold. Let’s look at <code>multiplicative_expression</code> (again, with <code>Typedefs</code> elided):<br />
<pre><code>multiplicative_expression -->
(cast_expression, p("*"), multiplicative_expression) -> [];
(cast_expression, p("/"), multiplicative_expression) -> [];
(cast_expression, p("%"), multiplicative_expression) -> [];
cast_expression.</code></pre>A-ha, a repeated prefix! We know how to fix this:<br />
<pre><code>multiplicative_expression -->
cast_expression,
((p("*"), multiplicative_expression) -> [];
(p("/"), multiplicative_expression) -> [];
(p("%"), multiplicative_expression) -> [];
[]).</code></pre>Or, if we’d prefer to leave the structure of the grammar intact at the cost of memoization overhead, we can memoize <code>cast_expression</code>:<br />
<pre><code>:- pred cast_expression(list(char)::in, list(char)::out) is semidet.
:- pragma memo(cast_expression/2, [fast_loose]).</code></pre>(Note the <code>:- pred</code> declaration is necessary as you cannot memoize undeclared modes.)<br />
<br />
Notice that most of the other <code>_expression</code> productions follow the same pattern; all of them will in fact benefit from prefix-factoring or memoization. Let’s do that and move on.<br />
<br />
<h3>Memoizing multiple calls to <code>unary_expression</code></h3>Despite all our optimization so far, <code>token</code> (and <code>postfix_expression_rec</code> in turn) still reigns supreme. However, tracing the chain of calls back this time now reveals this:<br />
<div style="overflow:scroll;overflow-y:hidden"><pre style="width:160em"><samp> 170474 <predicate `c_parser_aim.assignment_expression/3' mode 0> <cycle 2> [15]
198268 <predicate `c_parser_aim.cast_expression/3' mode 0> <cycle 2> [32]
[18] 7.2 0.10 0.32 368742 <predicate `c_parser_aim.unary_expression/3' mode 0> <cycle 2> [18]
0.03 0.06 171588/338952 <predicate `c_parser_aim.kw/3' mode 0> [30]
0.13 0.13 686352/12057646 <predicate `c_parser_aim.lexer.token/3' mode 0> [7]
0.03 0.03 686352/11652569 <unification predicate for type `c_parser_aim.lexer.token/0' mode 0> [17]
368742 <predicate `c_parser_aim.postfix_expression/3' mode 0> <cycle 2> [9]</samp></pre></div><code>unary_expression</code> is called approximately an equal number of times by both <code>assignment_expression</code> and <code>cast_expression</code>. <em>This makes it a perfect candidate for memoization:</em><br />
<pre><code>:- pred unary_expression(typedefs::in, list(char)::in, list(char)::out) is semidet.
:- pragma memo(unary_expression/3, [fast_loose]).</code></pre><h3>Ad nauseam</h3>And that’s about all there is to it: following this method I found I needed to memoize two more procedures and factor one more, and the parser parsed the test file faster than <code>time</code> could reliably measure. Of course, it would behoove us to test on different inputs in order to exercise different grammar productions, but the method is the same.<br />
<br />
<h3>Summary</h3>To recap:<br />
<ol><li>Use <code>mprof -c</code> to determine which procedure is the largest time hog.</li>
<li>If the procedure is called by primarily one other procedure on a 1:1 basis, or if you have already optimized it, look instead at its most prominent caller (<i>etc</i>.).</li>
<li>If the procedure is called by primarily one other procedure on a greater than 1:1 basis and it is a common prefix of this caller, factor it out.</li>
<li>Otherwise, memoize it.</li>
<li>Recompile and repeat.</li>
</ol>This simple method will asymptotically increase the speed of almost any naïve PEG parser, in my experience, from unusable to quite fast, and will avoid the overhead associated with unnecessarily memoizing every grammar production in a parser.<br />
<br />
Download <a href="http://www.fstutoring.com/~chris/aim/c_parser_memo.m">c_parser_memo.m</a><br />
<br />
<h3>Further reading</h3>The direct encoding of PEGs in Mercury, and the performance implications thereof, were first explored in-depth by Ralph Becket and Zoltan Somogyi in the paper <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/papers/packrat.pdf">DCGs + Memoing = Packrat Parsing; But is it worth it?</a>; I highly recommend it as further reading for those interested in the subject.Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com2tag:blogger.com,1999:blog-1560635160713980237.post-53716995088098954922011-10-09T21:41:00.000-07:002011-10-11T17:24:11.921-07:00Parsing Expression Grammars part 2Continuing <a href="http://adventuresinmercury.blogspot.com/2011/10/parsing-expression-grammars-part-1.html">last week’s post</a>, I’d like to give an example of implementing a real-life PEG. What better than the C language, specifically, preprocessed <a href="http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf">C99</a>. (Note: It may be helpful to the reader to open the linked PDF to Annex A, “Language syntax summary”.)<br />
<br />
Recall that PEGs differ from more commonly used CFGs in that they are completely deterministic, greedy, and don’t allow left recursion. This can make translating some grammars difficult or impossible if they rely on those features. Fortunately, C99’s grammar is straightforward enough that, with a little thought, we can work around these problems. Let’s see how.<br />
<br />
<h3>The Lexer</h3><br />
First off, let’s create a submodule for the lexer like so:<br />
<pre><code>:- module lexer.
:- interface.</code></pre>From the lexer we will export two predicates, one which lexes single tokens (<code>token</code>), and one which lexes a sequence of whitespace (<code>whitespace</code>):<br />
<pre><code>:- pred token(token::out, list(char)::in, list(char)::out) is semidet.
:- pred whitespace(list(char)::in, list(char)::out) is det.</code></pre>Note: it’s important that we define the mode of the <code>token</code> argument of <code>token</code> as <code>out</code> rather than <code>in</code>, even though they are logically equivalent. This will allow us later to memoize <code>token</code> on one argument instead of two, greatly reducing our memory use at negligible (if any) processing overhead. This trick is possible precisely because PEGs are deterministic: the determinism of <code>token</code> with <code>token</code> as <code>out</code> will still be <code>semidet</code>, rather than <code>nondet</code> (and hence unmemoizable) as it would be with a CFG.<br />
<br />
The C99 standard defines 5 kinds of lexical tokens our lexer will need to generate: <var>keyword</var>, <var>identifier</var>, <var>constant</var>, <var>string-literal</var>, and <var>punctuator</var>. A glance through the C99 grammar reveals that merely to parse the grammar, we do not need to know the specific values of <var>constant</var>s and <var>string-literal</var>s, so let’s define the type of a lexing token like so:<br />
<pre><code>:- type token --->
keyword(string);
identifier(string);
constant;
string_literal;
punctuator(string).</code></pre>Of course, if we were constructing a parse tree, we would want to hold onto the values of <var>constant</var>s and <var>string-literal</var>s, but since for now we are just matching the grammar, it is OK to leave them out. On to the implementation!<br />
<pre><code>:- implementation.</code></pre><h4>Whitespace, or, using built-in character classes</h4><br />
Let’s get whitespace out of the way. The C99 definition of whitespace corresponds to Mercury’s definition of whitespace, so we can import <code>char</code> and define <code>whitespace</code> like so:<br />
<pre>whitespace --> ([W], {is_whitespace(W)}) -> whitespace; [].</pre><h4>Tokens, or, parameterizing rules</h4><br />
C99 defines <var>token</var> as follows:<br />
<pre><var>token</var>:
<var>keyword</var>
<var>identifier</var>
<var>constant</var>
<var>string-literal</var>
<var>punctuator</var></pre>A straightforward translation of this to a PEG in Mercury would read:<br />
<pre><code>token -->
keyword -> [];
identifier -> [];
constant -> [];
string_literal -> [];
punctuator.</code></pre>However, because this is an <em>unordered</em> choice in the spec but an <em>ordered</em> choice in Mercury, we need to convince ourselves that choosing this particular order will not change the language. Fortunately this is easy to confirm: the latter four must each start with a different kind of character (letter, digit, quote, and non-quote symbol, respectively), and the first (<var>keyword</var>) should be matched in preference to the second (<var>identifier</var>) anyway.<br />
<br />
The C99 spec assumes that whitespace is removed in some phase between lexing and parsing, and that adjacent string literals are concatenated. Because we will be attaching our lexer directly to our parser (such is the magic of PEGs), we will need to perform these two steps in our tokenizer. Finally, we will want to actually produce a lexing token as well (rather than just match it). The final implementation of <code>token</code> is thus:<br />
<pre><code>token(Token) -->
whitespace,
(keyword(KW) -> {Token = keyword(KW)};
identifier(Ident) -> {Token = identifier(Ident)};
constant -> {Token = constant};
(string_literal, (token(string_literal) -> []; [])) ->
{Token = string_literal};
punctuator(Punct), {Token = punctuator(Punct)}).</code></pre>Parameterizing rules like this can be very powerful, as we’ll see later in this post. However let’s tackle each kind of token in turn.<br />
<br />
<h4>Keywords, or, fixing choice using negative lookahead assertions</h4><br />
<var>keyword</var> looks deceptively simple; it is merely a list of valid C99 keywords. We could be tempted to implement it as such:<br />
<pre><code>keyword(Keyword) --> % WRONG
['a', 'u', 't', 'o'] -> {Keyword = "auto"};
['b', 'r', 'e', 'a', 'k'] -> {Keyword = "break"};
</code><i>etc.</i></pre>But the devil is in the details. We must ensure that no keyword is matched against the prefix of another keyword or identifier (e.g. we should not match the “<kbd>do</kbd>” in the keyword “<kbd>double</kbd>”, or the “<kbd>for</kbd>” in the identifier “<kbd>format</kbd>”). The latter can be fixed by using a negative lookahead assertion:<br />
<pre><code>keyword(Keyword) -->
(['a', 'u', 't', 'o'] -> {Keyword = "auto"};
['b', 'r', 'e', 'a', 'k'] -> {Keyword = "break"};
</code><i>etc.</i><code>),
not (identifier_nondigit(_) -> []; digit(_)).</code></pre>In other words, a keyword must <em>not</em> be followed immediately by one of the characters which are valid in a keyword or identifier (defined later). Note that this won’t fix partial keyword matching: if the assertion fails, the parse as a keyword is rejected entirely rather than retried with a longer keyword. Either we should move the negative lookahead assertion into the body of each branch, or simply rearrange the offending keywords from longest to shortest. Since “<kbd>do</kbd>” and “<kbd>double</kbd>” are the only offenders, it’s easy enough to swap them. (We’ll encounter a better example later.)<br />
<br />
(Note: In the completed lexer linked below, I add a little extra machinery to allow listing keywords as strings in a list. There is nothing PEG-specific about this technique, and it saves me only keystrokes.)<br />
<br />
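For concreteness, with the two offenders swapped, the relevant branches read as follows (a sketch only; as noted, the downloadable code generates these branches from a list of strings):<br />
<pre><code>keyword(Keyword) -->
(['d', 'o', 'u', 'b', 'l', 'e'] -> {Keyword = "double"};
['d', 'o'] -> {Keyword = "do"};
</code><i>etc.</i><code>),
not (identifier_nondigit(_) -> []; digit(_)).</code></pre>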
<h4>Identifiers, or, transforming left-recursive productions</h4><br />
Identifiers take a recursive form commonly seen in the C99 spec:<br />
<pre><var>identifier</var>:
<var>identifier-nondigit</var>
<var>identifier</var> <var>identifier-nondigit</var>
<var>identifier</var> <var>digit</var></pre>That is, an identifier is a certain initial production followed by a sequence of either of two other productions. A direct implementation is no good:<br />
<pre><code>identifier --> % WRONG
identifier_nondigit -> [];
(identifier, identifier_nondigit) -> [];
(identifier, digit).</code></pre>This will either match a single <code>identifier_nondigit</code>, or, failing to do so, will get caught in a loop calling itself again having consumed no input. Not what we wanted. We need to split the initial call to <code>identifier_nondigit</code> out, and write a right-recursive rule for the rest of the identifier:<br />
<pre><code>identifier --> identifier_nondigit, identifier_rec.
identifier_rec -->
(identifier_nondigit, identifier_rec) -> [];
(digit, identifier_rec) -> [];
[].</code></pre>Because <var>identifier-nondigit</var> and <var>digit</var> are disjoint, order is not important. This formulation is a proper PEG and will behave as expected.<br />
<br />
Of course, we determined earlier that it was important to us to know <em>what</em> identifier was parsed. We can easily add expressions to our rules to do just this:<br />
<pre><code>identifier(from_char_list(Ident) : string) -->
identifier_nondigit(C), identifier_rec(Ident0),
{Ident = [C | Ident0]}.
identifier_rec(Ident) -->
(identifier_nondigit(C), identifier_rec(Ident0)) ->
{Ident = [C | Ident0]};
(digit(D), identifier_rec(Ident0)) ->
{Ident = [D | Ident0]};
{Ident = []}.
identifier_nondigit(D) -->
nondigit(D0) -> {D = D0};
universal_character_name(D0), {D = D0}.</code></pre>(The typecast to <code>string</code> in the head of <code>identifier</code> is necessary because <code>from_char_list</code> is overloaded and Mercury gets confused otherwise.)<br />
<br />
<var>identifier-nondigit</var>, <var>nondigit</var>, and <var>digit</var> are straightforward and use the techniques described thus far. I’m going to skip over them and move on to constants.<br />
<br />
<h4>Constants, or, fixing choice using order</h4><br />
C99 defines constants as:<br />
<pre><var>constant</var>:
<var>integer-constant</var>
<var>floating-constant</var>
<var>enumeration-constant</var>
<var>character-constant</var></pre>Now, the astute reader will notice that something will go wrong here: the leading digits of a floating-point literal such as <kbd>123.45</kbd> also form a valid integer literal in C99, so if we wrote this production directly as a PEG, it would always match the integer part of a floating-point literal as an integer. Again, not what we want.<br />
<br />
We could use a negative lookahead assertion as above to fix this, but easier still is to change the order, prioritizing floats over integers:<br />
<pre><code>constant -->
floating_constant -> [];
integer_constant -> [];
enumeration_constant -> [];
character_constant.</code></pre>Easy enough! Since enumeration constants must always start with a letter (which floats and integers do not), and character constants must always start with a single quote, we are all set here.<br />
<br />
The rest of the lexer for constants and string literals uses one of these techniques so let’s move on to the last lexing token, punctuators.<br />
<br />
<h4>Punctuators, or, fixing choice by ordering by length</h4><br />
<var>punctuator</var>, like <var>keyword</var>, is merely a list of strings of symbols which are meaningful in C99. This list presents the same problem as the list of keywords: a shorter punctuator may match the prefix of what is in fact a longer punctuator. (For example, <kbd>&lt;</kbd> may match the beginning of the symbol <kbd>&lt;&lt;=</kbd> in source text.)<br />
<br />
For this production, we can use the solution of ordering the punctuators from longest to shortest. This works without much further thought. (Again, I use some extra code to allow me to define this production as a list of strings in my example code.) And that about does it for the lexer.<br />
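To make the length ordering concrete, a hand-written fragment of <code>punctuator</code> would list longer symbols first (a sketch only; as mentioned, the real rule in the example code is generated from a list of strings):<br />
<pre><code>punctuator(Punct) -->
['<', '<', '='] -> {Punct = "<<="};
['<', '<'] -> {Punct = "<<"};
['<', '='] -> {Punct = "<="};
['<'] -> {Punct = "<"};
</code><i>etc.</i></pre>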
<pre><code>:- end_module lexer.</code></pre><h3>The Parser</h3><br />
The parser is largely trivial, using the same four techniques described above for the lexer (rewriting left recursion and the three ordering techniques). Since the grammar makes many references to individual kinds of tokens, I defined a few shortcuts:<br />
<pre>kw(KW) --> token(keyword(KW)).
id(Ident) --> token(identifier(Ident)).
const --> token(constant).
str_lit --> token(string_literal).
p(Punct) --> token(punctuator(Punct)).</pre>The grammar also references the <var>enumeration-constant</var> production, which, although not a token directly, is merely an identifier and we can treat it as such:<br />
<pre><code>enumeration_const --> id(_).</code></pre>Ensuring that all calls to the lexer are funneled through <code>token</code> will make our lives easier when we optimize later.<br />
<br />
Now before I sign off, there are two <em>gotcha</em>s in the grammar, for both of which C is infamous.<br />
<br />
<h4>The “dangling else”</h4><br />
C’s if/then/[else] has the <a rel="external" href="http://en.wikipedia.org/wiki/Dangling_else">problem</a> that in a CFG, it is ambiguous. Does <code>if (1) if (0) puts("hi"); else puts("bye");</code> print <samp>bye</samp> or nothing? The C99 standard indicates that an <code>else</code> should match the nearest <code>if</code> it legally can (so the above code would print <samp>bye</samp>).<br />
<br />
Fortunately with PEGs, this ambiguity is trivial to solve. We need simply write the if/then/else case first in the grammar. Any <code>else</code> will be matched by the innermost (and thus nearest) <code>if</code> possible:<br />
<pre><code>selection_statement -->
(kw("if"), p("("), expression, p(")"), statement,
kw("else"), statement) -> [];
(kw("if"), p("("), expression, p(")"), statement) -> [];
</code><i>etc.</i></pre><h4>Typedefs</h4><br />
For the unfamiliar, a <defn>typedef</defn> in C is a means of defining an alias for an existing type specification. There’s nothing odd about that, except that <em>types in C are keywords and the set of types determines the parse</em>. In other words, <em>C is not context-free</em>.<br />
<br />
Even though PEGs can directly represent some non-context-free grammars, C is not one of them. Fortunately the problem is easily solved by parameterizing the grammar and passing around a set of “typedefed” identifiers:<br />
<pre><code>:- import_module bool, set.
:- type typedefs == set(string).</code></pre>Typedefs are created in <var>declaration</var>s, if one of the <var>declaration-specifier</var>s is a <var>storage-class-specifier</var> which is the keyword <code>typedef</code>:<br />
<pre><code>declaration(!Typedefs) -->
declaration_specifiers(!.Typedefs, HasTypedef),
(init_declarator_list(Decls0) ->
{Decls = Decls0}; {Decls = init}),
p(";"), {if HasTypedef = yes then union(Decls, !Typedefs) else true}.
declaration_specifiers(Typedefs, HasTypedef) -->
(storage_class_specifier(IsTypedef),
declaration_specifiers_opt(Typedefs, HasTypedef0)) ->
{HasTypedef: bool = IsTypedef `or` HasTypedef0};
(type_specifier,
declaration_specifiers_opt(Typedefs, HasTypedef0)) ->
{HasTypedef = HasTypedef0};
</code><i>etc.</i>
<code>
declaration_specifiers_opt(Typedefs, HasTypedef) -->
declaration_specifiers(Typedefs, HasTypedef0) ->
{HasTypedef = HasTypedef0};
{HasTypedef = no}.
storage_class_specifier(IsTypedef) -->
kw("typedef") -> {IsTypedef = yes};
kw("extern") -> {IsTypedef = no};
kw("static") -> {IsTypedef = no};
kw("auto") -> {IsTypedef = no};
kw("register"), {IsTypedef = no}.</code></pre>(Note the use of state variable syntax, and the factored-out <code>declaration_specifiers_opt</code>, which makes the code passably legible.) Typedef declarations may legally exist in <var>block-item</var>s, scoped within <var>compound-statements</var>:<br />
<pre><code>compound_statement(Typedefs) -->
p("{"), (block_item_list(Typedefs, _) -> []; []), p("}").
block_item_list(!Typedefs) -->
(block_item(!Typedefs), block_item_list(!Typedefs)) -> [];
block_item(!Typedefs).
block_item(!Typedefs) --> declaration(!Typedefs) -> []; statement.</code></pre>and as <var>external-declaration</var>s:<br />
<pre><code>translation_unit(!Typedefs) -->
(external_declaration(!Typedefs), translation_unit(!Typedefs)) -> [];
external_declaration(!Typedefs).
external_declaration(!Typedefs) -->
function_definition -> [];
declaration(!Typedefs).</code></pre>Finally, <var>typedef-name</var> itself references the set of typedefs:<br />
<pre><code>typedef_name(Typedefs) --> id(Ident), {set.member(Ident, Typedefs)}.</code></pre>The rest of the implementation of typedefs involves parameterizing most productions to pass the set of typedefs from <code>external_declaration</code> down to <code>typedef_name</code>, and parameterizing the few illegal uses of typedefs (for example, in function signatures) to simply drop the modified set of typedefs. Nothing brain-bending; just a lot of typing.<br />
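To give a flavor of that parameterization, a production such as <var>type-specifier</var> simply threads the set down to <code>typedef_name</code> (a sketch only; the branch list is abbreviated here):<br />
<pre><code>type_specifier(Typedefs) -->
kw("void") -> [];
kw("int") -> [];
</code><i>etc.</i><code>
typedef_name(Typedefs).</code></pre>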
<br />
<h3>Summary & Test</h3><br />
In today’s post we saw how to translate a real language grammar, that of C99, into a Mercury program. We learned how to transform left-recursive rules into right-recursive ones, how to ensure proper production choice by using negative lookahead assertions, ordering by precedence, and ordering by length, and how to implement the non-context-free elements of C99.<br />
<br />
Let’s add a main function similar to last week’s, compile, and try out our spiffy new parser:<br />
<pre><samp>$ </samp><kbd>./c_parser_direct</kbd>
<kbd>int main(void) {}</kbd></pre>…and wait, and wait, and wait, and… what’s going on? Why are we getting no response? Do we have an infinite loop?<br />
<br />
Nope. Add this line in the <code>lexer</code> module:<br />
<pre><code>:- pragma memo(token/3, [fast_loose]).</code></pre><pre><samp>$ </samp><kbd>./c_parser_direct</kbd>
<kbd>int main(void) {}</kbd></pre>(wait for it…)<br />
<pre><samp>Success!</samp></pre>Success indeed, but it’s still slow. More on memoization and how to speed this up even more in part 3.<br />
<br />
<a href="http://www.fstutoring.com/~chris/aim/c_parser_direct.m">Download c_parser_direct.m</a>Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com0tag:blogger.com,1999:blog-1560635160713980237.post-58916698356449770902011-10-02T20:32:00.000-07:002011-10-02T20:32:20.612-07:00Parsing Expression Grammars part 1First, I’d like to say welcome back! AiM is returning from its hiatus since April. The summer has been busy for me, but I hope to update AiM weekly henceforth. Let’s start!<br />
<br />
<h3>Parsing Expression Grammars</h3><br />
<a rel="external" href="http://en.wikipedia.org/wiki/Parsing_expression_grammar">Parsing expression grammars</a> (<abbr>PEG</abbr>s) are a new-ish formal grammar intended to describe computer languages. PEGs differ from the more commonplace <a rel="external" href="http://en.wikipedia.org/wiki/Context-free_grammar">context-free grammars</a> most notably in that PEGs offer an <defn>ordered choice</defn> operator. The second branch of an ordered choice implicitly <em>does not match</em> any expression which is matched by the first branch. Since any text can only match exactly one branch of an ordered choice, this has the practical implication that <em>a PEG cannot be ambiguous</em> and rule precedence declarations are not needed.<br />
<br />
Another natural consequence of ordered choice is that so-called repetition operators (<code>*</code> and <code>+</code>) and the option operator (<code>?</code>) are necessarily greedy. (Any time a repetition <em>can</em> consume more input, ordered choice ensures that it <em>does</em>, since the branch that stops repeating is tried only when further repetition fails.)<br />
<br />
A PEG rule, as described in the Wikipedia entry, may be one of:<br />
<ul><li>a terminal (a symbol),</li>
<li>a nonterminal (another rule), or</li>
<li>the empty string.</li>
</ul>or may be constructed from other rules by:<br />
<ul><li>sequence: <code><var>A</var> <var>B</var></code></li>
<li>ordered choice: <code><var>A</var> / <var>B</var></code></li>
<li>zero or more: <code><var>A</var>*</code></li>
<li>one or more: <code><var>A</var>+</code></li>
<li>optional: <code><var>A</var>?</code></li>
<li>positive lookahead: <code>&amp;<var>A</var></code></li>
<li>negative lookahead: <code>!<var>A</var></code></li>
</ul>Positive lookahead and negative lookahead test for a rule match (or lack thereof) but do not consume input.<br />
<br />
<h3>Translating into Mercury</h3><br />
It’s well-known that CFGs can be implemented directly in logic languages such as Mercury as recursive-descent parsers. Prolog and Mercury even have a special <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/DCG_002drules.html">syntax</a> known as <a rel="external" href="http://en.wikipedia.org/wiki/Definite_clause_grammar">definite clause grammar</a> (<abbr>DCG</abbr>) to facilitate doing so. It’s similarly easy to implement PEGs, and <em>there’s a surprise at the end</em>! Let’s see how.<br />
<br />
PEG rules can be represented using DCG clauses like so:<br />
<br />
<code><var>rule</var> --> <var>body</var></code><br />
<br />
The type and mode of rules can be inferred by Mercury, but if you need to declare them, you can do so as:<br />
<br />
<code>:- pred <var>rule</var>(list(char)::in, list(char)::out) is semidet.</code><br />
<br />
The elements of the body of a PEG are translated as follows:<br />
<dl><dt>terminal <var>C</var></dt>
<dd><code>[<var>C</var>]</code></dd>
<dt>nonterminal <var>A</var></dt>
<dd><code><var>A</var></code></dd>
<dt>empty string</dt>
<dd><code>[]</code></dd>
<dt>sequence <code><var>A</var> <var>B</var></code></dt>
<dd><code><var>A</var>, <var>B</var></code></dd>
<dt>ordered choice <code><var>A</var> / <var>B</var></code></dt>
<dd><code><var>A</var> -> []; <var>B</var></code><br/>
Note that using <code>-> [] ;</code> (if/else) instead of <code>;</code> gives us exactly the greedy semantics we require.</dd>
<dt>optional <code><var>A</var>?</code></dt>
<dd><code><var>A</var> -> []; []</code></dd>
<dt>positive lookahead <code>&amp;<var>A</var></code></dt>
<dd><code>not not <var>A</var></code></dd>
<dd>or equivalently, <code>=(S), <var>A</var>, :=(S)</code></dd>
<dt>negative lookahead <code>!<var>A</var></code></dt>
<dd><code>not <var>A</var></code></dd>
</dl><br />
Repetition constructions (<code>A*</code> and <code>A+</code>) are recursive in nature and cannot be represented inline. However we can define higher order rules to represent them as:<br />
<pre>:- mode *(pred(in, out) is semidet, in, out) is det.
*(P) --> (P, *(P)) -> []; [].
:- mode +(pred(in, out) is semidet, in, out) is semidet.
+(P) --> (P, +(P)) -> []; P.</pre>Then we can translate repetition constructions as:<br />
<dl><dt>zero or more: <code><var>A</var>*</code></dt>
<dd><code>*(<var>A</var>)</code> or <code>*(pred(in, out) is semidet --> <var>A</var>)</code></dd>
<dt>one or more: <code><var>A</var>+</code></dt>
<dd><code>+(<var>A</var>)</code> or <code>+(pred(in, out) is semidet --> <var>A</var>)</code></dd>
</dl>The first, shorter form, can only be used for named rules with defined modes. The longer form is applicable to any rule construct.<br />
<br />
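One consequence of the greedy semantics is worth demonstrating: a rule of the shape <code><var>A</var>* <var>A</var></code> can never match, because the star consumes every <code><var>A</var></code>, leaving none for the final terminal. (The rule name below is made up for illustration.)<br />
<pre><code>stars_then_one --> % can never succeed
*(pred(in, out) is semidet --> ['a']), ['a'].</code></pre>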
<h3>Example</h3><br />
Let’s translate the <a rel="external" href="http://en.wikipedia.org/wiki/Parsing_expression_grammar#Examples">example</a> of basic arithmetic expressions from Wikipedia:<br />
<pre>Value ← [0-9]+ / '(' Expr ')'
Product ← Value (('*' / '/') Value)*
Sum ← Product (('+' / '-') Product)*
Expr ← Sum</pre>First let’s get some boilerplate out of the way. We’ll need the <code>list</code> module for DCG-syntax terminals:<br />
<pre>:- module peg_example.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.
:- implementation.
:- import_module list.</pre>Now onto <code>Value</code>. Right off the bat Wikipedia throws us a curveball with the range expression <code>[0-9]</code>. Fortunately that’s easily translated as a specially-moded function:<br />
<pre>:- mode in..in = in is semidet.
A..B = C :- A @=< C, C @=< B.</pre>Even though (in fact, because) <code>..</code> is a <a rel="external" href="http://www.mercury.csse.unimelb.edu.au/information/doc-latest/mercury_ref/Builtin-Operators.html">builtin operator</a>, we can use it in definitions as well.<br />
<br />
Armed with our new range operator, we can translate the <code>Value</code> rule as:<br />
<pre>value -->
+(pred(in, out) is semidet --> ['0'..'9']) -> [];
(['('], expr, [')']).</pre>Note that the lambda-predicate (<code>pred(in, out) is semidet --></code>) is needed since <code>['0'..'9']</code> is not a predicate on its own. The translation is otherwise straightforward.<br />
<br />
<code>Product</code>, <code>Sum</code>, and <code>Expr</code> follow similarly:<br />
<pre>product -->
value, *(pred(in, out) is semidet -->
((['*'] -> []; ['/']), value)).
sum -->
product, *(pred(in, out) is semidet -->
((['+'] -> []; ['-']), product)).
expr --> sum.</pre>Our <code>main</code> function will read a line, try to parse it using <code>expr</code> and make sure that only the newline at the end is left unparsed:<br />
<pre>main(!IO) :-
read_line(Res, !IO),
(if Res = ok(Line) then
(if expr(Line, ['\n'])
then print("Success!\n", !IO)
else print("Failure!\n", !IO)),
main(!IO)
else true).</pre>Compile with <code>mmc --infer-all --make peg_example</code>, run it and try out some expressions, keeping in mind that our grammar doesn’t allow negatives, decimals, or whitespace:<br />
<pre><samp>$ <kbd>./peg_example</kbd>
<kbd>123+456</kbd>
Success!
<kbd>5/(10-20)</kbd>
Success!
<kbd>45-(40</kbd>
Failure!
<kbd>*12</kbd>
Failure!
<kbd>a+18</kbd>
Failure!</samp></pre>Of course, a parser that merely tests whether input is well-formed is only mildly useful. Next time, we’ll look into creating parse trees, and the real-life example of parsing C code.<br />
<br />
<a href="http://www.fstutoring.com/~chris/aim/peg_example.m">Download peg_example.m</a>Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com4tag:blogger.com,1999:blog-1560635160713980237.post-84163953341199444772011-04-04T20:06:00.000-07:002011-10-02T20:37:47.319-07:00Dependent typing: reversing non-empty listsThis is the first post in a series on <a href="http://en.wikipedia.org/wiki/Dependent_type">dependent typing</a>. For those not familiar with dependent typing, it is the ability to describe data using types which say something about the particular value of the data itself – the type of the data <em>depends</em> on the value of the data. For example, the phrase “even integers” could describe a dependent type which is a subtype of integers. Whether a number belongs to that type <em>depends</em> on whether its value is evenly divisible by 2. (Note that any subtype can be viewed as a simple kind of dependent type.)<br />
<br />
The obvious use of dependent typing is to describe invariants of your program, for example, “a non-NULL pointer”, “an array of length 5”, or “a non-empty list”. While full-blown dependent typing requires a full-blown theorem prover (such as <a href="http://coq.inria.fr/">Coq</a> or <a href="http://www.ats-lang.org/">ATS/LF</a>), many mainstream languages support basic dependent types. Java can infer whether a value is not null, OCaml can talk about types which are made up of only certain constructors, Mozilla’s new language <a href="https://github.com/graydon/rust/wiki">Rust</a> can attach arbitrary predicates to types, and I would be remiss not to mention Mercury.<br />
<br />
Mercury has limited but useful support for dependent typing via its <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/Modes.html">mode system</a>. The mode system appears complicated, so let’s start with a brief overview.<br />
<br />
<h3>The mode system</h3><br />
Each predicate (or function) in Mercury <em>does something</em> to its arguments. The most common possibilities are to look at an argument and leave it be (this would be an input argument), or to attempt to bind it to a value (this would be an output argument). Another possibility is that it destroys its argument after looking at it (this happens with the IO state). These possible actions are called <defn>modes</defn>.<br />
<br />
Modes are described simply by the expected state of the argument before the call, and the guaranteed state of the argument after the call using the syntax <code><var>Before</var> >> <var>After</var></code>. These states (called <defn>instantiatednesses</defn> or <defn>insts</defn> for short) may be one of <code>free</code> (the argument is unbound), <code>ground</code> or <code>bound</code> (the argument is bound to a value), <code>clobbered</code> (the argument is destroyed or overwritten), and <code>unique</code> (the argument is bound to a value which is not referenced anywhere else). The latter two are used mainly for IO state and arrays, so we will focus on <code>free</code> and <code>bound</code>.<br />
<br />
The modes we are all familiar with, <code>in</code> and <code>out</code>, are defined in the module <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_library/builtin.html"><code>builtin</code></a> as <code>ground >> ground</code> and <code>free >> ground</code>, respectively. In other words, an <code>in</code> argument is expected to be bound to a value before the call and is left that way after, and an <code>out</code> argument may be unbound before the call, but will be bound after.<br />
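Written out as Mercury declarations, the definitions in <code>builtin</code> are essentially:<br />
<pre><code>:- mode in == ground >> ground.
:- mode out == free >> ground.</code></pre>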
<br />
The magic is that we may tell Mercury that we expect or guarantee an argument to be bound <em>only to certain constructors</em>! To do this, we must first define the instantiatedness in which we are interested, say non-empty lists:<br />
<pre>:- inst non_empty_list == bound([ground | ground]).</pre>This defines the <code>non_empty_list</code> inst as a variable bound to a list cons cell, the head and tail of which are both bound. We could similarly define the inst of an empty list as:<br />
<pre>:- inst empty_list == bound([]).</pre>In other words, an <code>empty_list</code> is a variable which is bound to the empty list constructor.<br />
<br />
Note that there is also a shorthand which allows the above two insts to be written similarly to type definitions:<br />
<pre>:- inst non_empty_list ---> [ground | ground].
:- inst empty_list ---> [].</pre>We could then define the mode of an input argument which is expected to be a non-empty list as:<br />
<pre>:- mode in_non_empty_list == non_empty_list >> non_empty_list.</pre>In other words, an argument with this mode would be expected to be bound to a non-empty list, and would remain bound that way afterward. Similarly:<br />
<pre>:- mode out_non_empty_list == free >> non_empty_list.</pre>An argument with this mode may be unbound before the call, but is guaranteed to be bound to a non-empty list after. Mercury actually has the mode-functions <code>in()</code> and <code>out()</code> already defined, so we may write <code>in(non_empty_list)</code> or <code>out(non_empty_list)</code> to describe these same modes.<br />
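These mode-functions are defined in <code>builtin</code> essentially as follows, so the two hand-written modes above need not be declared at all:<br />
<pre><code>:- mode in(Inst) == Inst >> Inst.
:- mode out(Inst) == free >> Inst.</code></pre>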
<br />
<h3>Example</h3><br />
<em>Note</em>: due to <a href="http://bugs.mercury.csse.unimelb.edu.au/view.php?id=191">bug 191</a>, the following example won’t trigger the proper error messages from the compiler as of this post. <del>Download and apply <a href="http://bugs.mercury.csse.unimelb.edu.au/file_download.php?file_id=119&type=bug">this patch</a> against the compiler to fix the issue</del>, or work around the bug by replacing the two <code>out</code> modes with <code>out(list(ground))</code>, which won’t change the meaning but will avoid triggering the bug. <ins>UPDATE: parts of the standard library won’t compile <em>without</em> the bug, so don’t go applying that patch just yet.</ins><br />
<br />
On a recent <a rel="external" href="http://news.ycombinator.com/item?id=2405137">thread</a> at Hacker News, a question was brought up about proving that <code>[X | Xs] = reverse([Y | Ys])</code> was total (or in Mercury terms, <code>det</code>). Let’s see if we can do this in Mercury using our new tools. First let’s define <code>reverse2</code> (named so not to conflict with built-in <code>list.reverse</code>) using a recursive auxiliary function with an accumulator:<br />
<pre>:- import_module list.
:- func reverse2(list(A)) = list(A).
:- mode reverse2(in) = out.
reverse2(A) = reverse2_aux(A, []).
:- func reverse2_aux(list(A), list(A)) = list(A).
:- mode reverse2_aux(in, in) = out.
reverse2_aux([], Acc) = Acc.
reverse2_aux([A | As], Acc) = reverse2_aux(As, [A | Acc]).</pre>(I have included the mode declarations as we will be adding to them later.) Testing this function in our <code>main</code> predicate is simple:<br />
<pre>main(!IO) :-
Y = 1, Ys = [2,3,4,5],
[X | Xs] = reverse2([Y | Ys]),
print({X, Xs}, !IO), nl(!IO).</pre>A sharp eye will spot the problem before we even compile: the deconstruction <code>[X | Xs] =</code> can fail, which would cause <code>main</code> to be <code>semidet</code> instead of <code>det</code> (that is, a partial function). The compiler spots this as well:<br />
<pre>In `main'(di, uo):
error: determinism declaration not satisfied.
Declared `det', inferred `semidet'.
Unification with `list.[X | Xs]' can fail.</pre>Of course this isn’t true. What the compiler would love to see is a mode declaration stating that if <code>reverse2</code> is passed a non-empty list, then it will return a non-empty list. Let’s write that:<br />
<pre>:- mode reverse2(in(non_empty_list)) = out(non_empty_list).</pre>Now we get a different error:<br />
<pre>In clause for `reverse2(in((list.non_empty_list))) =
out((list.non_empty_list))':
mode error: argument 2 did not get sufficiently instantiated.
Final instantiatedness of `HeadVar__2' was `ground',
expected final instantiatedness was `bound(list.'[|]'(ground,
ground))'.</pre>(Note: if you didn’t/can’t apply the above compiler patch, as a workaround, replace <code>out</code> in the original mode declarations with <code>out(list(ground))</code>.)<br />
<br />
Here, <code>HeadVar__2</code> refers to the function result (what would be the second variable in the head were <code>reverse2</code> a predicate). The compiler expected the function result to be non-empty (bound to <code>list.[|]</code>) but it was bound to an unknown value of type list. Of course this is because <code>reverse2_aux</code> doesn't guarantee a non-empty result on non-empty input, so let’s add a declaration stating so as well:<br />
<pre>:- mode reverse2_aux(in(non_empty_list), in) = out(non_empty_list).</pre>(Note that we don’t say anything about the accumulator in this declaration.) Unfortunately recompiling gives us a similar error, this time complaining about the second (the recursive) clause of <code>reverse2_aux</code>. Of course – if our initial list had only one item, the recursive call would pass <code>reverse2_aux</code> an <em>empty</em> list, triggering the first clause, and we haven’t made (and can’t make) any special guarantees about that case. Let's take a closer look at that clause:<br />
<pre>reverse2_aux([], Acc) = Acc.</pre>The only way that this clause will return a non-empty list is if the accumulator, <code>Acc</code>, is non-empty. Fortunately, it is called recursively with a non-empty accumulator (since elements are prepended on every recurrence), so this should be enough to convince the compiler. Let’s add a mode declaration stating this fact:<br />
<pre>:- mode reverse2_aux(in, in(non_empty_list)) = out(non_empty_list).</pre>Success! Compilation is successful, meaning <code>main</code> has been proved to be <code>det</code> (a total function), and running the program gives the expected output:<br />
<pre>{5, [4, 3, 2, 1]}</pre>Using Mercury’s mode system, we were able to prove that reversing a non-empty list results in a non-empty list, by adding descriptive mode declarations in strategic locations.<br />
<br />
As an exercise, add an inst describing a list of at least two elements and add modes to <code>reverse2</code> and <code>reverse2_aux</code> which make guarantees about lists of at least two elements.<br />
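To get started on the exercise, here is one way such an inst might be written (a sketch only, not checked against the compiler; <code>non_empty_list</code> is the inst provided by the standard <code>list</code> module):<br />
<br />
<pre>:- inst list_of_two_or_more == bound([ground | non_empty_list]).</pre><br />
The accompanying mode declarations then follow the same pattern as the <code>non_empty_list</code> declarations above, with <code>list_of_two_or_more</code> substituted wherever a two-element guarantee is intended.<br />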
<br />
<a href="http://www.fstutoring.com/~chris/aim/reverse.m">Download reverse.m</a>Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com4tag:blogger.com,1999:blog-1560635160713980237.post-11449310126933797132011-04-02T19:22:00.000-07:002011-04-02T19:24:43.715-07:00Enabling float unboxing on 64-bit systemsIn many high-level languages, it’s desirable that all data be represented with the same size word (usually a system word, 32- or 64-bits) so that polymorphic functions need not perform run-time dispatch, nor be type-specialized at compile time, to work with data of different sizes. This is usually done by “boxing” non-word-size data by storing it in the heap and passing around a word-sized pointer to the data.<br />
<br />
Of course for “small” data boxing entails the performance overhead of an extra pointer indirection. (For “large” data boxing is usually a performance win since it reduces the amount of data which must be copied; this is also why in C structures are usually passed by reference.) For this reason anything smaller than or equal in size to a word will usually be stored unboxed. On 32-bit systems, this means characters, integers, enumerations, and single-precision (32-bit) floating-point numbers, but <em>not</em> double-precision (64-bit) floating-point numbers. On 64-bit systems, all of the above including doubles can be stored unboxed.<br />
<br />
Mercury follows these same conventions, which does not bode well for scientific computations on 32-bit systems, since all doubles will be boxed. (OCaml special-cases this, storing doubles unboxed inside float arrays and all-float records.)<br />
<br />
Fortunately Mercury does store doubles unboxed on 64-bit systems, with one gotcha: <em>Double-precision floats will only be stored unboxed if the compiler is built with an existing compiler, and not bootstrapped!</em> This means that after you first install Mercury, you must <code>make realclean</code>, ensure that <code>mmc</code> and friends are in your <code>PATH</code>, and repeat the installation (including <code>./configure</code>). You can check that <code>configure</code> enabled float unboxing with <code>grep unboxed config.log</code>; you will see messages indicating its status.<br />
<br />
If you are doing any sort of scientific calculations on a 64-bit machine it’s important to ensure that this optimization is enabled; it can easily double the performance of any float-heavy computations.Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com0tag:blogger.com,1999:blog-1560635160713980237.post-80037324222486918582011-04-02T07:15:00.000-07:002011-10-17T17:51:23.870-07:00Benchmarks Game: spectral-normThe <a href="http://shootout.alioth.debian.org/">Computer Language Benchmarks Game</a> measures the performance of programming language implementations over a set of algorithms. Mercury is known for being fast, so let’s put it to the test. We will compare it to four high-performing language implementations, C (gcc), Java (Sun), Haskell (GHC), and OCaml, on a 64-bit four-way processor (Intel Atom D510).<br />
<br />
<h3>Problem description</h3>Today we will develop a solution to the <a href="http://shootout.alioth.debian.org/u64q/performance.php?test=spectralnorm">spectral-norm</a> test, whose description reads:<br />
<br />
<blockquote>Calculate an eigenvalue using the <a href="http://en.wikipedia.org/wiki/Power_iteration">power method</a>.</blockquote><br />
Eigenvectors are vectors which, when multiplied by a matrix, produce a multiple of themselves. The eigenvalue of this eigenvector is this multiple. The power method involves repeatedly multiplying a random vector by the matrix in question until the vector converges. This is then an eigenvector, and its associated eigenvalue is found easily.<br />
<br />
To compute the <a href="http://mathworld.wolfram.com/SpectralNorm.html">spectral norm</a> of a matrix, one simply computes the square root of the maximum eigenvalue (which coincidentally is the one found by the power method).<br />
<br />
<h3>Implementation</h3>Let’s take the design of the <a href="http://shootout.alioth.debian.org/u64q/program.php?test=spectralnorm&lang=java&id=1">single-threaded Java entry</a> as a template. This way, we can ensure that we are using the same algorithm as the other implementations, which is a requirement of the Benchmarks Game.<br />
<br />
The interface and <code>main</code> predicate are unremarkable:<br />
<br />
<pre>:- module spectralnorm.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.
:- implementation.
:- import_module float, int, list, math, std_util, string.
main(!IO) :-
command_line_arguments(Args, !IO),
(if to_int(head(Args), N0) then N = N0 else N = 100),
format("%.9f\n", [f(approximate(N))], !IO).</pre><br />
Note that we have imported <code>list</code>, for use in representing the vectors. One may think that using arrays would perform much better but surprisingly the gain is negligible in this program (mostly because there are no random accesses), and would come at the cost of more complicated code.<br />
<br />
The <code>approximate</code> function works by iterating multiplication 20 times, and using that result and the second-to-last to compute the spectral norm. This function is easily written using higher-order functions; in particular, <code>list.foldl</code> and <code>list.map_corresponding</code> are used to compute the dot product of two vectors:<br />
<br />
<pre>approximate(N) = sqrt(VBv / Vv) :-
V = pow(multiplyAtAv(N), 19, unit(N)),
U = multiplyAtAv(N, V),
VBv = foldl((+), map_corresponding((*), U, V), 0.0),
Vv = foldl((+), map_corresponding((*), V, V), 0.0).</pre><br />
The unit vector, used as the initial vector, is easily defined recursively:<br />
<br />
<pre>unit(N) = (if N = 0 then [] else [1.0 | unit(N - 1)]).</pre><br />
Next we define a function <code>a</code> to represent the matrix we are using:<br />
<br />
<pre>a(I, J) = 1.0 / float((I + J) * (I + J + 1) / 2 + I + 1).</pre><br />
Next we define multiplication of the matrix by a vector using two recursive functions. <code>multiplyAvI</code> multiplies each row <code>I</code> < <code>N</code> of the matrix by the vector <code>V</code> by calling <code>multiplyAvIJ</code>, which multiplies and sums each element of a row. <code>multiplyAv</code> simply starts the process.<br />
<br />
<pre>multiplyAv(N, V) = multiplyAvI(N, 0, V).
multiplyAvI(N, I, V) = YYs :-
if I < N then
Y = multiplyAvIJ(I, 0, V),
Ys = multiplyAvI(N, I + 1, V),
YYs = [Y|Ys]
else YYs = [].
multiplyAvIJ(_, _, []) = 0.0.
multiplyAvIJ(I, J, [X|V]) = a(I, J) * X + multiplyAvIJ(I, J + 1, V).</pre><br />
To multiply by the transpose of A, we similarly define <code>multiplyAtv</code>:<br />
<br />
<pre>multiplyAtv(N, V) = multiplyAtvI(N, 0, V).
multiplyAtvI(N, I, V) = YYs :-
if I < N then
Y = multiplyAtvIJ(I, 0, V),
Ys = multiplyAtvI(N, I + 1, V),
YYs = [Y|Ys]
else YYs = [].
multiplyAtvIJ(_, _, []) = 0.0.
multiplyAtvIJ(I, J, [X|V]) = a(J, I) * X + multiplyAtvIJ(I, J + 1, V).</pre>Finally, <code>multiplyAtAv</code> ties these together. Note we must declare its type, since we partially apply it in <code>approximate</code>.<br />
<br />
<pre>:- func multiplyAtAv(int, list(float)) = list(float).
multiplyAtAv(N, V) = multiplyAtv(N, multiplyAv(N, V)).</pre>Compiling with <code>mmc --infer-all -O6 --intermod-opt --rebuild spectralnorm</code>, we obtain a runtime of about 38.1 seconds for <var>N</var> = 3000. Not too bad, considering that the highly optimized <a href="http://shootout.alioth.debian.org/u64q/program.php?test=spectralnorm&lang=gcc&id=4">C implementation</a> has a single-threaded runtime of about 26.5 seconds. But we can do better.<br />
<br />
<h3>Optimization</h3>Compiling with <code>-p</code> and running <code>mprof</code> after running our program tells us that the majority of our time (99%) is spent in <code>multiplyAvIJ</code> and <code>multiplyAtvIJ</code>, as expected. So let's focus our efforts there, and on <code>a</code>, which has apparently been inlined into both of those functions, since it does not otherwise appear in the profile.<br />
<br />
<h4>Unchecked arithmetic</h4>The most obvious speedup is to replace the floating-point division in <code>a</code> with <code>`unchecked_quotient`</code>, which doesn't check if its right argument is zero. (We make sure to add parentheses as appropriate, since infix functions bind very tightly.) Recompiling, this reduces our runtime to 29.8 seconds! Similarly, replacing the integer division by two with <code>`unchecked_right_shift` 1</code> reduces the runtime to 28.4 seconds.<br />
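For concreteness, the matrix function after both replacements might look something like this (a sketch; note the parentheses around the shifted expression, needed because backquoted infix functions bind very tightly):<br />
<br />
<pre>a(I, J) = 1.0 `unchecked_quotient`
    float(((I + J) * (I + J + 1)) `unchecked_right_shift` 1 + I + 1).</pre><br />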
<br />
<h4>Tail-call optimization / accumulator introduction</h4>Next, both <code>multiplyAvIJ</code> and <code>multiplyAtvIJ</code> are written in a fashion which precludes tail-call optimization: they perform an operation on the result of the recursive call, which prevents Mercury from turning them into simple loops. Fortunately, we can direct <code>mmc</code> to transform these functions into accumulator-passing functions (which can be optimized) by using the <code>--introduce-accumulators</code> option, and by telling Mercury that floating-point addition is associative:<br />
<br />
<pre>:- promise all [A, B, C, ABC]
(A + B) + C = ABC: float <=> A + (B + C) = ABC: float.</pre>Unfortunately, due to <a href="http://bugs.mercury.csse.unimelb.edu.au/view.php?id=193">bug 193</a>, <code>mmc</code> doesn't recognize this declaration due to the typecast (which is necessary to distinguish between integer- and floating-point addition), so we must instead introduce a dummy function <code>fplus</code> to disambiguate:<br />
<br />
<pre>:- func fplus(float, float) = float.
A `fplus` B = A + B.
:- promise all [A, B, C, ABC]
(A `fplus` B) `fplus` C = ABC <=> A `fplus` (B `fplus` C) = ABC.</pre>After replacing our use of <code>+</code> with <code>`fplus`</code> in <code>multiplyAvIJ</code> and <code>multiplyAtvIJ</code> (and again, parenthesizing appropriately), we cut our runtime down to 25.2 seconds -- better than the C implementation!<br />
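For example, the recursive clause of <code>multiplyAvIJ</code> becomes something like this (a sketch; the product is parenthesized because <code>`fplus`</code> binds more tightly than <code>*</code>):<br />
<br />
<pre>multiplyAvIJ(I, J, [X|V]) =
    (a(I, J) * X) `fplus` multiplyAvIJ(I, J + 1, V).</pre><br />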
<br />
An optimization along similar lines is to enable <code>--optimize-constructor-last-call</code>, which can enable tail-call optimization in functions like <code>unit</code>, <code>multiplyAvI</code>, and <code>multiplyAtvI</code>, which construct a list after their recursive call. While this is a win for smaller values of <var>N</var>, it actually degrades performance for larger values so we will not enable it.<br />
<br />
<h4>Parallelization</h4>Comparing our 25.2 seconds to the C implementation’s 26.5 seconds is not really fair, since the C implementation is parallelized, running in 6.9 seconds across four CPUs.<br />
<br />
Fortunately parallelization in Mercury is simple. The <code>&</code> operator solves two goals in parallel, expecting them both to succeed (like <code>,</code>). Our job is to decide where is the best place to place this operator.<br />
<br />
At the top level, operations in <code>approximate</code> and <code>multiplyAtAv</code> are necessarily sequential, since the result of each successive multiplication depends on the result of the previous.<br />
<br />
Moving down a level, the two operations in <code>multiplyAvI</code> and <code>multiplyAtvI</code> seem parallelizable, as they do not depend on one another. Let’s introduce a parallel goal:<br />
<br />
<pre>multiplyAvI(N, I, V) = YYs :-
if I < N then
(Y = multiplyAvIJ(I, 0, V) &
Ys = multiplyAvI(N, I + 1, V)),
YYs = [Y|Ys]
else YYs = [].</pre><code>multiplyAtvI</code> is modified similarly. Note the placement of the parentheses: <code>&</code> binds more loosely than <code>,</code>. If we did not have parentheses, the list construction would be moved into the second branch, making it dependent on the first branch, and causing (for reasons unknown to the author) massive amounts of memory to be used.<br />
<br />
At the lowest level, we could introduce parallel goals in <code>multiplyAvIJ</code> and <code>multiplyAtvIJ</code>, but we should have sufficient parallelization already, and the operations at this level are so short that they would be dwarfed by parallelization overhead.<br />
<br />
Compiling with <code>mmc --parallel --infer-all -O6 --intermod-opt --introduce-accumulators --rebuild spectralnorm</code>, we obtain a runtime of 7.2 seconds – less than 5% slower than the C counterpart. The difference between Mercury and C is even less pronounced at <var>N</var> = 5500, coming in at 23.8 and 23.7 seconds respectively.<br />
<br />
<h4>Results</h4>Results obtained using the Benchmark Game’s <code>bencher.py</code> program for <var>N</var> = 5500:<br />
<br />
<table><thead>
<tr> <th>×</th> <th>Language</th> <th>CPU (s)</th> <th>Elapsed (s)</th> <th>Memory (kB)</th> <th>Code (B)</th> <th>CPU load</th> </tr>
</thead> <tbody>
<tr> <td style="text-align: right">1.0</td> <th style="text-align: left"><a href="http://shootout.alioth.debian.org/u64q/program.php?test=spectralnorm&lang=java&id=2">java</a></th> <td style="text-align: right">87.24</td> <td style="text-align: right"><b>23.36</b></td> <td style="text-align: right">15617</td> <td style="text-align: right">1227</td> <td style="text-align: right">385%</td> </tr>
<tr> <td style="text-align: right">1.0</td> <th style="text-align: left"><a href="http://shootout.alioth.debian.org/u64q/program.php?test=spectralnorm&lang=ocaml&id=3">ocaml</a></th> <td style="text-align: right">88.94</td> <td style="text-align: right"><b>23.73</b></td> <td style="text-align: right">4037</td> <td style="text-align: right">1065</td> <td style="text-align: right">391%</td> </tr>
<tr> <td style="text-align: right">1.0</td> <th style="text-align: left"><a href="http://shootout.alioth.debian.org/u64q/program.php?test=spectralnorm&lang=gcc&id=4">gcc</a></th> <td style="text-align: right">89.98</td> <td style="text-align: right"><b>23.74</b></td> <td style="text-align: right">763</td> <td style="text-align: right">1992</td> <td style="text-align: right">399%</td> </tr>
<tr> <td style="text-align: right">1.0</td> <th style="text-align: left"><a href="http://www.fstutoring.com/~chris/shootout/spectralnorm.m">mmc --parallel</a></th> <td style="text-align: right">90.85</td> <td style="text-align: right"><b>23.83</b></td> <td style="text-align: right">54974</td> <td style="text-align: right">722</td> <td style="text-align: right">394%</td> </tr>
<tr> <td style="text-align: right">1.3</td> <th style="text-align: left"><a href="http://shootout.alioth.debian.org/u64q/program.php?test=spectralnorm&lang=ghc&id=4">ghc</a></th> <td style="text-align: right">95.45</td> <td style="text-align: right"><b>29.27</b></td> <td style="text-align: right">2496</td> <td style="text-align: right">1346</td> <td style="text-align: right">346%</td> </tr>
<tr> <td style="text-align: right">3.1</td> <th style="text-align: left"><a href="http://www.fstutoring.com/~chris/shootout/spectralnorm.m">mmc -H</a></th> <td style="text-align: right">70.06</td> <td style="text-align: right"><b>70.09</b></td> <td style="text-align: right">7118</td> <td style="text-align: right">722</td> <td style="text-align: right">116%</td> </tr>
<tr> <td style="text-align: right">3.7</td> <th style="text-align: left"><a href="http://www.fstutoring.com/~chris/shootout/spectralnorm.m">mmc</a></th> <td style="text-align: right">84.96</td> <td style="text-align: right"><b>84.98</b></td> <td style="text-align: right">40751</td> <td style="text-align: right">722</td> <td style="text-align: right">112%</td> </tr>
<tr> <td style="text-align: right">3.9</td> <th style="text-align: left"><a href="http://www.fstutoring.com/~chris/shootout/spectralnorm.m">mmc --java</a></th> <td style="text-align: right">88.86</td> <td style="text-align: right"><b>88.21</b></td> <td style="text-align: right">33007</td> <td style="text-align: right">722</td> <td style="text-align: right">113%</td> </tr>
</tbody> </table><br />
Curiously, compiling with the <code>-H</code> flag (to enable the high-level C “hlc” grade) generates much faster single-threaded code. Checking the generated C code in <code>Mercury/cs/spectralnorm.c</code> indicates that tail calls are implemented using loops in this version, compared to calls to a Mercury primitive in the other versions. Unfortunately the high-level C grade does not support the and-parallelism we are using.<br />
<br />
<a href="http://www.fstutoring.com/~chris/shootout/spectralnorm.m">Download spectralnorm.m</a>Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com8tag:blogger.com,1999:blog-1560635160713980237.post-62766478517236333882011-03-28T14:17:00.000-07:002011-04-03T19:33:15.243-07:00Optimization Levels<code>-O</code> optimization levels run from -1 to 6. Which options are enabled in these levels don’t seem to be documented in the <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_user_guide/Overall-optimization-options.html">User’s Guide</a> or the <code>mmc(1)</code> man page. They are however documented in the compiler source code. <code>-O2</code> is the default.<br />
<br />
<b>Note:</b> <em>Order matters!</em> For example, <code>--introduce-accumulators</code> followed by <code>-O6</code> will actually <em>disable</em> <code>--introduce-accumulators</code>, because it is not among the options enabled by <code>-O6</code>. For this reason you should put <code>-O</code> options first on the command line.<br />
<br />
<h3><code>-O-1</code></h3>Disables all optimizations which can be disabled.<br />
<code><ul><li>--no-common-data</li>
<li>--no-llds-optimize</li>
<li>--no-optimize-jumps</li>
<li>--no-optimize-labels</li>
<li>--no-optimize-peep</li>
<li>--no-smart-indexing</li>
<li>--no-static-ground-cells</li>
</ul></code><br />
<h3><code>-O0</code></h3>Minimizes compilation time.<br />
<code><ul><li>--excess-assign</li>
<li>--optimize-dead-procs</li>
<li>--optimize-repeat 1</li>
<li>--no-c-optimize</li>
<li>--no-emit-c-loops</li>
<li>--no-middle-rec</li>
<li>--no-optimize-delay-slot</li>
<li>--no-optimize-frames</li>
<li>--no-optimize-tailcalls</li>
<li>--no-use-local-vars</li>
</ul></code><br />
<h3><code>-O1</code></h3>Cheap optimizations with good payoff.<br />
<code><ul><li>--no-common-struct</li>
<li>--no-follow-code</li>
<li>--no-inline-simple</li>
<li>--no-inline-single-use</li>
<li>--no-optimize-fulljumps</li>
<li>--no-optimize-initializations</li>
<li>--no-simple-neg</li>
</ul></code><br />
<h3><code>-O2</code></h3>Optimizations with good payoff.<br />
<code><ul><li>--inline-compound-threshold 10</li>
<li>--optimize-dups</li>
<li>--optimize-repeat 1</li>
<li>--user-guided-type-specialization</li>
</ul></code><br />
<h3><code>-O3</code></h3>Slow optimizations with good payoff.<br />
<code><ul><li>--constraint-propagation</li>
<li>--deforestation</li>
<li>--local-constraint-propagation</li>
<li>--optimize-higher-order</li>
<li>--optimize-reassign</li>
<li>--optimize-repeat 4</li>
<li>--optimize-saved-vars</li>
<li>--optimize-unused-args</li>
</ul></code><br />
<h3><code>-O4</code></h3>Slow optimizations.<br />
<code><ul><li>--higher-order-size-limit 30</li>
<li>--inline-compound-threshold 20</li>
<li>--inline-simple-threshold 8</li>
</ul></code><br />
<h3><code>-O5</code></h3>Very slow optimizations.<br />
<code><ul><li>--delay-constructs</li>
<li>--eliminate-local-vars</li>
<li>--higher-order-size-limit 40</li>
<li>--inline-compound-threshold 100</li>
<li>--loop-invariants</li>
<li>--optimize-repeat 5</li>
</ul></code><br />
<h3><code>-O6</code></h3>Unreasonably slow optimizations.<br />
<code><ul><li>--everything-in-one-c-function</li>
<li>--inline-alloc</li>
<li>--use-macro-for-redo-fail</li>
</ul></code><br />
<h3><code>--optimize-space</code></h3>Optimize for code size rather than speed. This can be used in conjunction with one of the other levels, keeping in mind the aforementioned rules of ordering.<br />
<code><ul><li>--optimize-dead-procs</li>
<li>--optimize-dups</li>
<li>--optimize-proc-dups</li>
<li>--optimize-reassign</li>
<li>--unneeded-code-copy-limit 1</li>
<li>--no-optimize-fulljumps</li>
</ul></code><br />
Additionally, <code>--optimize-space</code> turns off the following options if set:<br />
<code><ul><li>--inline-alloc</li>
<li>--loop-invariants</li>
<li>--use-macro-for-redo-fail</li>
<li>--no-optimize-labels</li>
</ul></code><br />
<h3>Optimizations never enabled</h3>These can make code run slower.<br />
<ul><li><code>--checked-nondet-tailcalls</code></li>
<li><code>--constraint-propagation</code></li>
<li><code>--introduce-accumulators</code> (buggy with trailed updates)</li>
<li><code>--optimize-constructor-last-call</code></li>
<li><code>--optimize-duplicate-calls</code> (buggy with array.init)</li>
<li><code>--type-specialization</code></li>
<li><code>--unneeded-code</code></li>
</ul><br />
Additionally, intermodule optimizations are never enabled by <code>-O</code> because they require special handling by <code>mmc --make</code>.Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com0tag:blogger.com,1999:blog-1560635160713980237.post-32139663384465958822011-03-25T12:16:00.000-07:002011-03-26T20:30:35.385-07:00Project Euler problem 2<a href="http://projecteuler.net/index.php?section=problems&id=2">Project Euler problem 2</a> asks:<br /><br /><blockquote>By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.</blockquote><br />Again we start with main module boilerplate:<br /><br /><pre>:- module p2.<br />:- interface.<br />:- import_module io.<br />:- pred main(io::di, io::uo) is det.</pre><br />In our implementation, we will again use the <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_library/int.html"><code>int</code></a> module. Because the range of terms to sum is not definite (we do not know how many terms are less than four million without prior analysis), <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_library/solutions.html"><code>solutions</code></a> will not be useful to us. (Note that even the <code>solutions.do_while</code> predicate cannot be relied upon, as the order in which terms are generated is not guaranteed, and we have no efficient way of knowing if we’ve found all the terms of interest.)<br /><br /><pre>:- implementation.<br />:- import_module int.</pre><br />First let’s define the recursive function <code>fibonacci</code> of one argument to compute the <code>I</code>th Fibonacci number.<br /><br /><pre>:- func fibonacci(int) = int.<br />:- pragma memo(fibonacci/1).<br />fibonacci(I) =<br /> (if I = 0 then 1<br /> else if I = 1 then 2<br /> else fibonacci(I - 1) + fibonacci(I - 2)).</pre><br />Normally a function so defined would be exponential with <code>I</code>. 
However, by instructing the Mercury compiler to <a href="http://en.wikipedia.org/wiki/Memoization">memoize</a> <code>fibonacci</code> via the <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/Tabled-evaluation.html"><code>memo</code></a> pragma, we achieve linear performance as if we had used a <a href="http://en.wikipedia.org/wiki/Dynamic_programming">dynamic programming</a> algorithm. (Note that memoization is not yet available in all compilation grades; in particular, it is unavailable in the Java grade.)<br /><br />To perform the summation, we can define a recursive function which recurs on successive values of <code>I</code> until <code>fibonacci(I)</code> is no longer less than four million:<br /><br /><pre>sum_fibonaccis(I) =<br /> (if fibonacci(I) @ Fibo < 4000000 then<br /> sum_fibonaccis(I + 1) +<br /> (if even(Fibo) then Fibo else 0)<br /> else 0).</pre><br />The <code>@</code> function is used to bind <code>fibonacci(I)</code> to <code>Fibo</code> in an expression context (akin to C’s <code>=</code> operator). The built-in predicate <code>int.even</code> is used to test for evenness.<br /><br />Those readers familiar with functional programming will likely notice that this function is not tail-recursive; i.e., the result of the recursive call is not returned directly. This will cause unnecessary (though, in this case, benign) stack growth. The typical solution is to rewrite <code>sum_fibonaccis</code> with an accumulator argument:<br /><br /><pre>sum_fibonaccis_acc(I, Sum) =<br /> (if fibonacci(I) @ Fibo < 4000000 then<br /> sum_fibonaccis_acc(I + 1, Sum +<br /> (if even(Fibo) then Fibo else 0))<br /> else Sum).</pre><br />Mercury, however, is able to perform this transformation automatically, provided we inform it that <code>(Fibo0 + Fibo1 + ...) + FiboN</code> is indeed the same as <code>Fibo0 + (Fibo1 + ...
+ FiboN)</code> – that is, that addition is associative:<br /><br /><pre>:- promise all [A, B, C, ABC] (A + B) + C = ABC <=> A + (B + C) = ABC.</pre><br />This optimization is enabled with the <code>--introduce-accumulators</code> option to <code>mmc</code>. The compiler unfortunately can recognize only a narrow set of functions and predicates as candidates for accumulator introduction. In particular, altering our definition of <code>sum_fibonaccis</code> to bind <code>Fibo = fibonacci(I)</code> in a separate body instead of using <code>@</code>, or moving the innermost <code>if</code> to outside the recursive call, causes the compiler not to recognize this function as a candidate for accumulator introduction.<br /><br />The remainder of our implementation is unremarkable:<br /><br /><pre>p2 = sum_fibonaccis(0).<br />main --> print(p2), nl.</pre><br />Compiling the program with <code>mmc --infer-all --introduce-accumulators --make p2</code> produces a large warning concerning possible performance problems due to the accumulator introduction altering the order of summation. This is obviously not a problem for addition and can be safely ignored; the warning is meant for operations such as linked list concatenation, the performance of which depends on the value of one operand more than that of the other. If desired this warning can be disabled with the <code>--inhibit-accumulator-warnings</code> option.<br /><br />The resulting program runs in about 100 ms on my dual-core 1.6 GHz machine.<br /><br /><a href="http://fstutoring.com/~chris/euler/p2.m">Download p2.m</a>Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com0tag:blogger.com,1999:blog-1560635160713980237.post-1043668549734858402011-03-24T11:59:00.000-07:002011-11-28T21:15:07.161-08:00Project Euler problem 1<a href="http://projecteuler.net/index.php?section=problems&id=1">Project Euler problem 1</a> asks:<br />
<br />
<blockquote>Add all the natural numbers below one thousand that are multiples of 3 or 5.</blockquote><br />
We start by declaring the module and defining its interface, which is standard boilerplate for the main module of a program:<br />
<br />
<pre>:- module p1.
:- interface.
:- import_module io.
:- pred main(io::di, io::uo) is det.</pre><br />
In our implementation, we will utilize the <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_library/int.html"><code>int</code></a> and <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_library/solutions.html"><code>solutions</code></a> modules from the standard library:<br />
<br />
<pre>:- implementation.
:- import_module int, solutions.</pre><br />
The integers provided by <code>int</code> are guaranteed to be at least 32-bit and so will be large enough for this problem. <code>solutions</code> provides the <code>aggregate</code> function which we will use to sum the results.<br />
<br />
First let’s define the numbers we’re interested in by defining a predicate, aptly named <code>multiple_of_3_or_5_below_1000</code>. It is a predicate of one argument, which, when used as an output (<code>::out</code>), produces zero or more results (<code>is nondet</code>).<br />
<br />
<pre>:- pred multiple_of_3_or_5_below_1000(int::out) is nondet.</pre><br />
Implementing this predicate is easy using the standard <code>int.nondet_int_in_range</code> predicate to generate integers, and <code>rem</code> to check that the remainder after division by 3 or by 5 is zero:<br />
<br />
<pre>multiple_of_3_or_5_below_1000(I) :-
nondet_int_in_range(1, 999, I),
(I rem 3 = 0; I rem 5 = 0).</pre><br />
We can then define the problem solution <code>p1</code> as the sum of the multiples, using <code>aggregate</code> (from <code>solutions</code>) and <code>plus</code>:<br />
<br />
<pre>p1 = aggregate(multiple_of_3_or_5_below_1000, plus, 0).</pre><br />
The type of <code>p1</code> can be inferred by the compiler as a function of zero arguments returning an <code>int</code>.<br />
<br />
Because <code>plus</code> is associative and commutative, we could use the <code>unsorted_aggregate</code> predicate instead to avoid the overhead of sorting the solutions first (since the order doesn’t matter). However, it turns out that the overhead is negligible in this program, so we can stick with the <code>aggregate</code> function for the sake of clarity.<br />
<br />
Lastly, print the result. The IO state is threaded using <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/DCG_002dgoals.html">DCG notation</a>.<br />
<br />
<pre>main --> print(p1), nl.</pre><br />
We can then build the program using <code>mmc --infer-all --make p1</code>. The resulting program runs in about 100 ms on my dual-core 1.6 GHz machine.<br />
<br />
<a href="http://fstutoring.com/~chris/euler/p1.m">Download p1.m</a>Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com1tag:blogger.com,1999:blog-1560635160713980237.post-14686663459401969942011-03-24T11:44:00.000-07:002011-03-26T20:23:00.855-07:00IntroductionHello there! This blog will be about exploring the <a href="http://www.mercury.csse.unimelb.edu.au/">Mercury</a> programming language. Mercury is a declarative logic language <a href="http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_trans_guide/index.html">like Prolog</a>, but is purely functional and strongly typed <a href="http://www.mercury.csse.unimelb.edu.au/information/comparison_with_haskell.html">like Haskell</a>.<br /><br />I do not intend this blog to be an introduction to the language unto itself; the <a href="http://www.mercury.csse.unimelb.edu.au/information/documentation.html">Mercury documentation</a> covers that fairly well. I rather intend it to be a companion to these introductory materials, or as a “bird’s eye view” of the capabilities of Mercury for those interested in learning it further.<br /><br />Of course I will be happy to answer any questions about the language posed in the comments; I have been using Mercury on-and-off for a couple years and can probably field most basic questions.<br /><br />My first series of posts will cover Mercury solutions to the great <a href="http://projecteuler.net/">Project Euler</a> problems, a collection of numeric problems with often non-trivial solutions.<br /><br />With that, let’s begin!Chris Pacejohttp://www.blogger.com/profile/18278436935541738068noreply@blogger.com0