Twelfth Lecture: Nerode-Myhill Theorem

Friday, October 19

The smallest dfa accepting a regular language

We will prove that the minimal dfa obtained by the minimization procedure is in fact the minimum: that is, for every regular language R there is a unique smallest automaton M that accepts it, i.e. R=L(M). Moreover M can be obtained from any dfa M' with R=L(M') by the minimization procedure.

First, we will have to state this precisely. Then we will get the result by giving an abstract algebraic characterization of regular languages (the Nerode-Myhill Theorem), which will also characterize the minimum dfa.

Note that we have to qualify "unique". Clearly, we can rename the states of M, obtaining a new automaton that has the same number of states and also accepts R. To formally exclude this trivial case, we define "unique up to isomorphism".

Recall that a homomorphism between two structures is a mapping that "preserves structure." So a homomorphism from automaton
M1=(Q1, SIGMA, delta1, init1, F1) to M2=(Q2, SIGMA, delta2, init2, F2)
is a function f from Q1 to Q2 such that

f(init1)=init2
if q is in F1, f(q) is in F2
for all a in SIGMA, and all q in Q1, f(delta1(q,a)) = delta2(f(q),a)

f is an isomorphism if f is a homomorphism from M1 to M2 that has an inverse, f^{-1}, which is a homomorphism from M2 to M1. Observe that an isomorphism is just a renaming of states.
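
The three conditions can be checked mechanically for small automata. The following is a minimal sketch, not part of the lecture: it assumes an automaton is stored as a tuple (states, delta, init, final) with delta a dictionary keyed by (state, symbol), and it reads the transition condition as f(delta1(q,a)) = delta2(f(q),a). The names is_homomorphism, is_isomorphism, m1, m2 and renaming are made up for the illustration.

```python
from itertools import product

def is_homomorphism(f, m1, m2, sigma):
    """Check the three homomorphism conditions for f: Q1 -> Q2 (f is a dict)."""
    q1, delta1, init1, final1 = m1
    q2, delta2, init2, final2 = m2
    if f[init1] != init2:                          # f(init1) = init2
        return False
    if any(f[q] not in final2 for q in final1):    # q in F1 implies f(q) in F2
        return False
    # f(delta1(q, a)) = delta2(f(q), a) for every state q and symbol a
    return all(f[delta1[(q, a)]] == delta2[(f[q], a)]
               for q, a in product(q1, sigma))

def is_isomorphism(f, m1, m2, sigma):
    """f must be a bijection, and both f and its inverse must be homomorphisms."""
    if len(set(f.values())) != len(f) or set(f.values()) != set(m2[0]):
        return False
    inverse = {v: k for k, v in f.items()}
    return (is_homomorphism(f, m1, m2, sigma)
            and is_homomorphism(inverse, m2, m1, sigma))

# Two automata for "even number of a's" over {a, b}; they differ only in state names.
sigma = ["a", "b"]
m1 = ({"even", "odd"},
      {("even", "a"): "odd", ("even", "b"): "even",
       ("odd", "a"): "even", ("odd", "b"): "odd"},
      "even", {"even"})
m2 = ({0, 1},
      {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1},
      0, {0})
renaming = {"even": 0, "odd": 1}
print(is_isomorphism(renaming, m1, m2, sigma))
```

Swapping the two images in renaming breaks the condition on initial states, so the swapped map is not even a homomorphism.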

Equivalence Relations

Let S be any set. A relation ~ over S (subset of SxS) is an equivalence relation if it is

Reflexive -- for all x in S, x~x
Symmetric -- for all x,y in S, if x~y then y~x
Transitive -- for all x,y,z in S, if x~y and y~z then x~z

The equivalence class of x is defined as [x]={y|y~x}
Equivalence classes partition S. The collection of equivalence classes, denoted S/~ is called "the quotient of S by the equivalence relation".

Example: Let S=Z, the set of integers, and let x~y be the relation x mod 6 = y mod 6. (Exercise: prove that for any positive integer k, the relation xRy defined by x mod k = y mod k is an equivalence relation.)
Z/~ has six equivalence classes: [0], [1], [2], [3], [4], and [5]. Of course [0]=[6]=[12]=[-36] is the set of integers that are multiples of 6, [1] is the set of integers that have residue 1 mod 6, etc.
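
To see the partition concretely, one can group a finite sample of integers into classes. This is only an illustrative sketch: Z itself is infinite, so the code works on the sample -12..12, and the helper names quotient and same_residue_mod_6 are invented here.

```python
def quotient(elements, related):
    """Group elements into equivalence classes of the relation `related`."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):   # comparing with one representative suffices
                cls.append(x)
                break
        else:
            classes.append([x])      # x starts a new class
    return classes

def same_residue_mod_6(x, y):
    # Python's % returns a value in 0..5 even when x is negative.
    return x % 6 == y % 6

sample = range(-12, 13)
classes = quotient(sample, same_residue_mod_6)
for cls in classes:
    print(cls)
```

Comparing with a single representative is justified only because the relation is symmetric and transitive; for an arbitrary relation this grouping would not be meaningful.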

We can define the operations of + and * among equivalence classes by:
[x]+[y]=[x+y], and [x]*[y]=[x*y]
Of course, we have to verify that these definitions make sense, that is, if we take x1~x2, y1~y2, then if we compute [x1] + [y1] as [x1 + y1] or as [x2 + y2] we get the same result. (Exercise: prove that + and * are well defined.)
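
A finite experiment does not replace the proof asked for in the exercise, but it shows what "well defined" means. The sketch below assumes we name the class [x] by the residue x mod 6 and checks, on a sample of integers, that the class of the result depends only on the classes of the arguments. The names cls and well_defined are hypothetical.

```python
from itertools import product

K = 6

def cls(x):
    """Canonical name of the class [x]: the residue of x in 0..5."""
    return x % K

def well_defined(op, sample):
    """Check on the sample that cls(op(x, y)) depends only on cls(x) and cls(y)."""
    seen = {}
    for x, y in product(sample, repeat=2):
        key = (cls(x), cls(y))
        result = cls(op(x, y))
        # Remember the first result for this pair of classes; later ones must agree.
        if seen.setdefault(key, result) != result:
            return False
    return True

sample = range(-18, 19)
add_ok = well_defined(lambda x, y: x + y, sample)
mul_ok = well_defined(lambda x, y: x * y, sample)
max_ok = well_defined(max, sample)   # max is NOT compatible with the classes
print(add_ok, mul_ok, max_ok)
```

The operation max fails because, for example, 0~6 but max(0,1)=1 and max(6,1)=6 lie in different classes.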

Now consider the set Z6={0,1,2,3,4,5}, with the operations +6 and *6 defined as
i +6 j = (i+j) mod 6
i *6 j = (i*j) mod 6
Exercise: Show that Z/~ and Z6 are isomorphic.

Read definitions of refinement, coarsest equivalence relation, in Kozen.

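Consider Z2 and Z3, defined analogously to Z6 with 2 and 3 respectively in the place of 6. Note that each equivalence class of Z under the equivalence relation x ~2 y iff x mod 2 = y mod 2 is a union of equivalence classes of the mod 6 relation: the even integers are the union of [0], [2] and [4], and the odd integers are the union of [1], [3] and [5]. The same holds for the mod 3 relation. In other words, the mod 6 relation refines both.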
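
The relationship between the mod 6 and mod 2 classes can be checked on a finite sample. This is a sketch under the assumption that "refines" means every class of the finer relation lies inside some class of the coarser one (see Kozen for the precise definition); classes_mod and refines are names chosen for the illustration.

```python
def classes_mod(k, sample):
    """Equivalence classes of 'x mod k == y mod k', restricted to a finite sample."""
    return [frozenset(x for x in sample if x % k == r) for r in range(k)]

def refines(fine, coarse):
    """True if every class of `fine` is contained in some class of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)

sample = range(-12, 12)
mod2, mod3, mod6 = (classes_mod(k, sample) for k in (2, 3, 6))

print(refines(mod6, mod2))   # mod 6 classes sit inside mod 2 classes
print(refines(mod6, mod3))   # and inside mod 3 classes
print(refines(mod2, mod3))   # but mod 2 and mod 3 are not comparable
```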

Now we are ready to read Kozen.

The overall plan is the following: first, given a dfa M with initial state q0 accepting regular set R, define an equivalence relation =M on SIGMA* by
x =M y iff deltaHat(q0,x) = deltaHat(q0,y)
This equivalence relation 'respects' concatenation in the sense that
x =M y implies xz =M yz for all z in SIGMA*
This property is called 'right congruence'.
Moreover, =M refines the equivalence relation =R defined as
x =R y iff (x is in R iff y is in R)
because whether M accepts a string depends only on the state that the string leads to.
An equivalence relation on SIGMA* that

is a right congruence
has finite index (finitely many equivalence classes), and
refines =R

is called a Nerode-Myhill relation for R.
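
The three properties of the relation induced by a dfa can be tested on short strings. The sketch below is an illustration only: the example dfa (number of a's divisible by 3, over the alphabet {a, b}) is invented, and checking strings up to a fixed length is evidence, not a proof.

```python
from itertools import product

SIGMA = "ab"
# Hypothetical example dfa: states 0, 1, 2 count the a's modulo 3.
DELTA = {(q, "a"): (q + 1) % 3 for q in range(3)}
DELTA.update({(q, "b"): q for q in range(3)})
Q0, FINAL = 0, {0}

def delta_hat(q, x):
    """Extension of delta to strings: the state reached from q on input x."""
    for a in x:
        q = DELTA[(q, a)]
    return q

def equiv_M(x, y):
    return delta_hat(Q0, x) == delta_hat(Q0, y)

def in_R(x):
    return delta_hat(Q0, x) in FINAL

def strings(max_len):
    """All strings over SIGMA of length at most max_len."""
    for n in range(max_len + 1):
        for tup in product(SIGMA, repeat=n):
            yield "".join(tup)

words = list(strings(4))
pairs = [(x, y) for x in words for y in words if equiv_M(x, y)]

right_congruence = all(equiv_M(x + z, y + z) for x, y in pairs for z in strings(2))
refines_R = all(in_R(x) == in_R(y) for x, y in pairs)
index = len({delta_hat(Q0, x) for x in words})   # number of classes seen

print(right_congruence, refines_R, index)
```

The index can never exceed the number of states of M, since each class corresponds to a state reachable from q0.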

So we prove:

if R is regular there is a Nerode-Myhill relation for it
if a language R has a Nerode-Myhill relation, it is regular.

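The second statement has a magical proof:
consider the automaton M whose states are the equivalence classes of the Nerode-Myhill relation.
The initial state is [epsilon], the equivalence class of the empty string.
The transition function is given by delta([x],a)=[xa]. This does not depend on the choice of x because the relation is a right congruence.
A state [x] is final iff x is in R. This does not depend on the choice of x because the relation refines =R.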

The construction works. (Proof in Kozen.)
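
The construction can be carried out by a program once the relation is given as a test on pairs of strings. The following is a sketch under stated assumptions, not Kozen's proof: the language (even number of a's), the relation related, and the function build_dfa are made up for the example, and the search terminates only because the relation has finite index. Each class is represented by the first string found in it, starting from the empty string.

```python
from collections import deque

SIGMA = "ab"

def in_R(x):
    """Hypothetical example language: strings with an even number of a's."""
    return x.count("a") % 2 == 0

def related(x, y):
    """Assumed Nerode-Myhill relation for R: same parity of the number of a's."""
    return x.count("a") % 2 == y.count("a") % 2

def build_dfa(related, in_R, sigma):
    reps = [""]            # one representative per class; state 0 is [epsilon]
    delta = {}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for a in sigma:
            xa = reps[i] + a                 # delta([x], a) = [xa]
            for j, r in enumerate(reps):
                if related(xa, r):           # [xa] is an already known class
                    break
            else:
                reps.append(xa)              # [xa] is a new class
                j = len(reps) - 1
                queue.append(j)
            delta[(i, a)] = j
    final = {i for i, r in enumerate(reps) if in_R(r)}   # [x] final iff x in R
    return reps, delta, 0, final

def accepts(dfa, x):
    reps, delta, q, final = dfa
    for a in x:
        q = delta[(q, a)]
    return q in final

dfa = build_dfa(related, in_R, SIGMA)
print(dfa[0])              # representatives of the classes
print(accepts(dfa, "abab"), accepts(dfa, "ab"))
```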

Janos Simon