Last modified 3/27/95.
Since there may be any (finite) number of such productions for each variable A in V, P maps V into 2^(V U T)*, subsets of (V U T)*. The function P is often thought of as a set of individual productions of the form A -> alpha, where A is in V and alpha is in (V U T)*. In that case, it is said that A -> alpha is a production of P.
It is useful to talk about the height of a derivation tree rooted at a variable A that derives a sentential form alpha. For this, we superscript the => symbol with the number of steps in parentheses:
NN = N = 0 REPEAT N = NN NN = N U {A in V | P(A) ^ N* <> 0} UNTIL (N = NN) RETURN NThe algorithm adds variables that can derive a null string in one or more steps (depending on the iteration at which the variable is added) and halts when a fixed point is reached. We now use this to eliminate e-productions from a grammar. This can be shown graphically.
P2 = P1 = P N = FindNull(G) REPEAT P1 = P2 FOREACH (A in V) P2(A) = P1(A) U {alpha1 alpha2 | (alpha1 B alpha2) in P1(A) and B in N} UNTIL (P1 = P2) FOREACH (A in V) P1(A) = P1(A) - {e,A} RETURN
NAD = {B | B in P(A)} REPEAT AD = NAD NAD = AD U {B in V | Exists C in AD s.t. B in P(C)} - {A}. UNTIL (AD = NAD) RETURN ADThe following algorithm assumes there are no e-productions in G.
FOREACH (B in V) AD(B) = Find-A-Deriv(B,G) FOREACH (A in V, B in AD(A)) P1(A) = P(A) U P(B) FOREACH (A in V) P1(A) = P1(A) - AD(A) RETURNWhen P1(A) is first created, it includes all the right-hand sides for every non-e-production for any variable B that A can derive. Finally, all the unit productions are removed. Note that, provided the appropriate minor modifications are made, these two steps may be done in either order!
NL = {A | P(A) ^ T* <> 0 } REPEAT L = NL NL = L U {B in V | P(B) ^ (T U L)* <> 0 } UNTIL (L = NL) RETURN LOn iteration i, the algorithm adds those variables that take at least i steps to derive a string of terminals.
NR = {S} REPEAT R = NR NR = R U {A in V | Exists B in R s.t. P(B) ^ (VUT)*A(VUT)* <> 0 } UNTIL (R = NR) RETURN ROn iteration i, the algorithm adds those variables that only appear in sentential forms that take least i steps to derive from S.
V' = FindLive(G) FOREACH (A in V1) P'(A) = P(A) ^ (V' u T)* G' =Only reachable and live variables can be useful (although these alone are not sufficient!), so the algorithm restricts attention to them. Only productions that involve live and reachable variables can contribute to L(G), so the rest are pruned. A variable may be both live and reachable, but may appear only in sentential forms with non-live variables, and so be useless. For this reason, it is important that the grammar G' with only live variables be produced first, since the pruning operation on P will eliminate all productions involving dead variables. If a live, reachable variable can only appear in a sentential form with non-live variables, then it will not be reachable in the pruned grammar, since the necessary step that produced the non-live variable(s) with which it must appear is no longer available.V1 = FindReachable(G') FOREACH (A in V1) P1(A) = P(A) ^ (V1 u T)* RETURN
Determine the pumping lemma constant N for G in GNF, then run all possible derivations with N to 2N steps. This will generate all strings in L with length n, N <= n <= 2N. L is infinite if and only if there is at least one such string.
Again, determine the pumping lemma constant N for G in GNF, then run all possible derivations of length up to N. L is empty if and only if there are no strings of length less than N.
For this, using a grammar in GNF suggests an algorithm that parses one input character at a time, and keeps track of all sentential forms that have a prefix of the input as a prefix of the sentential form consisting of all of the its terminal symbols.
The CYK algorithm takes a grammar in CNF, and using dynamic programming parses the string bottom-up. It may also be used to determine the number of distinct derivation trees in G for a particular string.