COT 6315 - Proof 11.2

COT6315 Formal Languages and Theory of Computation

R. E. Newman-Wolfe, University of Florida

Last modified 2/24/95.

Definition: A-Derivable Variable

A variable B in V of grammer G = in CFG is A-derivable for a variable A in V iff A =>* B, or equivalently, iff

B is in P(A) or
C is A-derivable, B is in P(C), and B <> A.
No other variables are A-derivable.

The first condition states that A => B directly, while the second applies the transitive closure to obtain all A-derivable variables. As in the e-production case, we first give an algorithm to find all A-derivable variables, then use it in an algorithm to eliminate all unit productions. Note that the initialization step does not include all of P(A), which consists of strings, but only the variables corresponding to singleton strings in P(A).

Algorithm 11.2a: Find-A-Deriv(A,G)

    NAD = {B | B in P(A)}
    REPEAT
       AD = NAD
       NAD = AD U {B in V | Exists C in AD s.t. B in P(C)} - {A}.
    UNTIL (AD = NAD)
    RETURN AD

The following algorithm assumes there are no e-productions in G.

Algorithm 11.2: Elim-Unit-Prod(G)

    FOREACH (A in V) AD(A) = Find-A-Deriv(A,G)
    FOREACH (A in V, B in AD(A))
        P1(A) = P(A) U P(B)
    FOREACH (A in V)
        P1(A) = P1(A) - AD(A)
    RETURN

Theorem 11.2: Equivalence of Unit-Productionless Grammar

G in CFG, G has no e-productions and G1 = Elim-Unit-Prod(G) => L(G) - {e} = L(G1) and G1 has no unit productions.

Proof

That G1 has no unit productions is obvious from the last step of the algorithm. That L(G1) is contained in L(G) should be clear, since A =>_G1 alpha iff either A =>_G alpha or A =>*_G B =>_G alpha by construction. Thus any derivation S =>* x in G1 has a corresponding, possibly longer, derivation in G.

That L(G) - {e} is contained in L(G1) may be shown by proving that every sentential form alpha derivable from S in G for which the last step was not a unit production may also be derived from S in G1. Since any derivation of x in L(G) - {e} cannot end in a unit production (there are no variables in x), this will suffice to show L(G)-{e} contained in L(G1). We consider only leftmost derivations, since this allows us to force any succession of unit productions on a variable in the derivation to be applied one after another to the variable in the same position until a non-unit production is applied. We induct on the number of steps in the derivation, n.

Base: n=1

S =>_G alpha <=> alpha in P(S) => alpha in P1(S), since the last and only production cannot be a unit production.

IH: For some K>0, if S =>K_G alpha by a leftmost derivation, and the last step in the derivation is not a unit production, then S =>*_G1 alpha.

IS: Show that for K+1, if S =>K+1_G alpha by a leftmost derivation and the last step in the derivation is not a unit production, then S =>*_G1 alpha.

If S =>K+1_G gamma by a leftmost derivation and the last step in the derivation is not a unit production, then S =>K_G x.B.alpha =>_G x.beta.alpha, where gamma = x.beta.alpha, and beta in P(B). Now consider the last step before this that was not a unit production, if there was one. It must have produced a sentential form x.A.alpha, since we only consider leftmost derivations (A may or may not be B). Since A =>*_G B, beta in P1(A) by construction of P1, so A =>_G1 beta, and x.A.alpha =>_G1 x.beta.alpha. Since S =>_G x.A.alpha in fewer than K+1 steps, S =>*_G1 x.A.alpha by the IH, so S =>*_G1 x.A.alpha =>_G1 x.beta.alpha. On the other hand, if there were no preceding non-unit production in the derivation of x.beta.alpha, then x = alpha = e and S =>K_G B. In this case, beta in P1(S) by construction of P1, so S =>_G1 beta = gamma.

QED

Send comments to nemo@cis.ufl.edu