COT 6315 - Proof 9.2

COT6315 Formal Languages and Theory of Computation

R. E. Newman-Wolfe, University of Florida

Last modified 2/21/95.

Definition 9.3: Regular Grammar

A CFG, G, is a regular grammar iff every production is of one of the three forms:

B -> a
B -> aC
B -> e (empty string)

where B and C are variables in V and a is a terminal in T.

Theorem 9.2: Regular Languages and Regular Grammars

Any language L in T* is regular iff there is a regular grammar G such that L = L(G).

Proof

First note that any sentential form of a regular grammar can have at most one variable (we start with one variable, S, and each production either keeps the number of variables constant or reduces it by one). The idea is to let the variable in a sentential form correspond to the state of the FSA during its computation.

(=>)

Let L in RL. Then there is M = in FSA s.t. L = L(M). Construct G = in RG as follows.

V = Q,
S = q_0,
P(q) = { ap | d(q,a) = p } U { e | q in A }, for p,q in Q and a in T.

Claim 1: L(G) = L = L(M)

Let x = a1a2a3...ak for ai in T, and let xi = a1a2...ai for i=1,2,...,k. Let q_i = d*(q_0,xi). Then d(q_i-1, ai) = qi. Thus ai.q_i in P(q_i-1), and by easy induction, S =>i xi.q_i for i=1,2,...,k. If x in L = L(M), then d*(q_0, x) = d*(q_0, xk) = q_k in A. Therefore S =>k xk.q_k => xk, since q_k in A => e in P(q_k). Hence x in L(G).

On the other hand, if S =>i-1 w.q_i-1 => w.ai.q_i, then d(q_i-1, ai) = qi by the definition of P. By induction, w = a1a2...ai-1 = xi-1 as defined above, and S =>i xi.q_i. Therefore d*(q_0,xi) = q_i by induction, and d*(q_0,xk) = q_k. If S =>* x, then S =>* x.q_k => x and q_k must be in A or else e would not be in P(q_k). Thus d*(q_0,xk) = d*(q_0, x) = q_k in A, so x in L(M).

QED Claim 1.

(<=)

Let L = L(G) for G = in RG. Construct M = in NFSA-e as follows.

Q = V U {F},
q_0 = S,
A = {F},
d(A,a) = {B | aB in P(A)} U {F | a in P(A)} for A, B in V and a in T,
d(A,e) = {F | e in P(A)}.

Note that there are no transitions from state F, and that because of the possibility of e-productions, we had to include e-transitions in M.

Claim 2: L(G) = L = L(M)

Let x = a1a2a3...ak for ai in T, and let xi = a1a2...ai for i=1,2,...,k. If S =>i-1 w.A_i-1 => w.ai.A_i, then ai.A_i in P(A_i-1) and Ai in d(A_i-1, ai) by the definition of d. By induction, w = a1a2...ai-1 = xi-1 as defined above, and S =>i xi.A_i for i=1,2,...,k-1, and therefore A_i in d*(S,xi) by induction. Note that we cannot assert that this is true for i=k, since the kth step of the derivation may eliminate A_k-1 and be the last step. If S =>* x, then either S =>* x.A_k => x or S =>* xk-1.A_k-1 => x_k = x. If S =>* x.A_k => x then e must be in P(A_k), so F in d(A_k,e), and F in d~(d*(S, x),e) = d*(S, x) and x in L(M). If S =>* xk-1.A_k-1 => x_k-1.ak = x_k = x, then F in d(A_k-1,ak), and F in d~(d*(S,xk-1),ak) = d*(S,x), so x in L(M). In either case, x in L(M), so L(G) contained in L(M).

On the other hand, if F <> A_i in d(A_i-1, ai), then ai.A_i in P(A_i-1) by definition of d. If F <> A_i in d*(S,xi) then by induction, S =>i xi.A_i. If x in L(M), then F in d*(S,x) so either there is some A_k such that A_k in d*(S,x) and d(A_k,e) = {F}, so e in P(A_k), or there is some A_k-1 such that A_k-1 in d*(S,xk-1) and F in d(A_k-1,ak), so ak in P(A_k-1). If there is some A_k such that A_k in d*(S,x), then S =>* x.A_k => x, while if there is some A_k-1 such that A_k-1 in d*(S,xk-1) and F in d(A_k-1,ak), then S =>* xk-1.A_k-1 => xk-1.ak = x. In either case, x in L(G), so L(M) contained in L(G).

QED Claim 2.

QED

Send comments to nemo@cis.ufl.edu