next up previous contents
Next: Representing the Input String Up: Grammars Previous: Grammars

An Expression Grammar

Consider the following expression grammar, expressed as a DCG. This is the ``natural'' grammar one would like to write for this langauge.

% file grammar.P
:- table expr/2, term/2.

expr --> expr, [+], term.
expr --> term.
term --> term, [*], primary.
term --> primary.
primary --> ['('], expr, [')'].
primary --> [Int], {integer(Int)}.
This grammar is left-recursive and would cause a problem for Prolog, but with the table declarations, XSB handles it correctly. Notice that we must table expr/2 because the XSB parser adds 2 arguments to the nonterminal symbol expr. (An alternative would be to use the auto_table directive.) After compiling and loading this program, we can execute it:
| ?- [grammar].
[Compiling ./grammar]
[grammar compiled, cpu time used: 0.419 seconds]
[grammar loaded]

yes
| ?- expr([1,+,2,*,3,*,'(',4,+,5,')'],[]).
Removing open tables.

yes
| ?-
Here the system answers ``yes'' indicating that the string is recognized. (The message ``Removing open tables'' is given when a query with no variables evaluates to true. It indicates that once the first successful execution path is found, computation terminates, and any tables that are not fully evaluated have been deleted.) This grammar is not only more elegant than the one we wrote in Prolog for the same langauge, it is more ``correct''. What I mean by that is that this grammar associates ``+'' and ``*'' to the left, as is usual, rather than to the right as the did the Prolog grammar that we gave in an earlier chapter.

So far we've only seen DCG's that represent simple context-free grammars. The DCG representation is actually much more powerful and can be used to represent very complicated systems. As a first example, let's say we want to implement an evaluator of these integer expressions. We can add semantic arguments to the nonterminals above to contain the value of the subexpression recognized. The following grammar does this:

% file grammar.P
:- table expr/3, term/3.

expr(Val) --> expr(Eval), [+], term(Tval), {Val is Eval+Tval}.
expr(Val) --> term(Val).
term(Val) --> term(Tval), [*], primary(Fval), {Val is Tval*Fval}.
term(Val) --> primary(Val).
primary(Val) --> ['('], expr(Val), [')'].
primary(Int) --> [Int], {integer(Int)}.
Recall that the braces are used to indicate that the enclosed calls are calls to Prolog predicates, not to nonterminals which recognize parts of the input string.

We can compile and run this program to evaluate the expression provided and have it return its integer value as follows:

| ?- [grammar].
[Compiling ./grammar]
[grammar compiled, cpu time used: 0.75 seconds]
[grammar loaded]

yes
| ?- expr(Val,[1,+,2,*,3,*,'(',4,+,5,')'],[]).

Val = 55;

no
| ?-
Notice that the string arguments are added after any explicit arguments. So we wrote what looked like a procedure definition for expr that had only one argument, but we get a definition that has 3 arguments, and that's the way we called it at the top-level prompt.

As mentioned, this grammar treats the operators as left associative, which is the usual convention for arithmetic expressions. While it doesn't matter semantically for the operations of ``+'' and ``*'', which are associative operators anyway, were we to extend this evaluator to handle ``-'' or ``/'' (as is very easy), then this correct associativity would be critical.


next up previous contents
Next: Representing the Input String Up: Grammars Previous: Grammars
David S. Warren
1999-07-31