The previous examples show that XSB can process grammars efficiently, using them to determine language membership and the structure of strings. But XSB can also do other kinds of grammar processing. In the following, we use XSB to compute the FIRST sets of a grammar.
, for any string of terminals and nonterminals , is defined to be the set of terminals that begin strings derived from , and if derives the empty string then the empty string is also in . is the generalization to length k strings of terminals that are prefixes of strings derived from .
We will assume that a grammar is stored in a predicate
==>/2
, with the head of a rule as an atomic symbol and the body
of a rule as a list of symbols. Nonterminals are assumed to be those
symbols for which there is at least one rule with it as its head.
(==>/2
is declared as an infix operator.)
The predicate first(SF,K,L)
is true if the list SF of grammar
symbols derives a string whose first K terminal symbols are L.
% The definition of FIRST: % first(SentForm,K,FirstList) computes firsts for a context-free grammar. :- table first/3. first(_,0,[]). first([],K,[]) :- K>0. first([S|R],K,L) :- K>0, (S ==> B), first(B,K,L1), length(L1,K1), Kr is K - K1, first(R,Kr,L2), append(L1,L2,L). first([S|R],K,[S|L]) :- K>0, \+ (S ==> _), % S is a terminal K1 is K-1, first(R,K1,L).
The first rule says that the empty string is in
for
any .
The second rule says that the empty string is in
for
being the empty sequence of grammar
symbols. The third rule handles the case in which the sequence of
grammar rules begins with a nonterminal, S
. It takes a rule
beginning with S and gets a string L1
(of length K1
K
) generated by the body of that rule. It gets the
remaining K
-K1
symbols from the rest of the list of
input grammar symbols. And the fourth rule handles terminal symbols.
This is a relatively simple declarative (and constructive) definition of FIRSTk. But without the table declaration it would not run for many grammars. Any left-recursive grammar would cause this definition to loop in Prolog.