Eleventh Lecture -- Minimizing dfa

Wednesday, October 17

If a dfa is specified by its table (or graph), it may not be in its most useful form. In particular, if it has been obtained 'mechanically' (e.g. by the subset construction, or from a regular expression), chances are that it has useless states.

We have not proven that there is a unique automaton with a minimal number of states (this will be a consequence of the Nerode-Myhill characterization), but we will present ways to eliminate unnecessary states.

There are several reasons why states are redundant:

  1. Unreachable states
    states p such that there is no string x with deltaHat(q0, x)=p
  2. States from which no accepting state is reachable
    states p such that there is no string x with deltaHat(p, x) in F.
  3. Duplicate states
    states p and q such that every string that leads from p to an accepting state also leads from q to an accepting state, and conversely. We can "collapse" such states into a single one.

The last category is the only interesting one, since the useless states in the first two categories can be eliminated by standard graph algorithms: use depth-first search to find all states reachable from q0 and discard the rest; then search backwards from the accepting states, along reversed edges, to find the states from which an accepting state is reachable, and discard the others (see the sketch below).
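
To make the first two reductions concrete, here is a minimal sketch in Python, assuming the dfa is given as a dictionary delta[state][symbol] -> state, a start state q0, and a set F of accepting states; the representation and the helper names are only illustrative, they are not fixed by the lecture.

    # A minimal sketch (not from the lecture): the dfa is a total table
    # delta[state][symbol] -> state, a start state q0, and a set F of
    # accepting states.

    def reachable_states(delta, q0):
        """States reachable from q0 (category 1 removes the others)."""
        stack, seen = [q0], {q0}
        while stack:
            p = stack.pop()
            for q in delta[p].values():
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        return seen

    def live_states(delta, F):
        """States from which some accepting state is reachable (category 2
        removes the others): search backwards from F along reversed edges."""
        reverse = {p: set() for p in delta}
        for p, row in delta.items():
            for q in row.values():
                reverse[q].add(p)
        stack, seen = list(F), set(F)
        while stack:
            q = stack.pop()
            for p in reverse[q]:
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen

    def trim(delta, q0, F):
        """Keep only states that are both reachable and live.  The resulting
        table may be partial; totality can be restored by adding one
        non-accepting 'dead' state.  If q0 itself is not kept, the language
        is empty."""
        keep = reachable_states(delta, q0) & live_states(delta, F)
        new_delta = {p: {a: q for a, q in delta[p].items() if q in keep}
                     for p in keep}
        return new_delta, q0, F & keep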

Lemma  Given automata M1 and M2 over an alphabet Sigma, there are algorithms to decide whether

  1. L(M1) is empty
  2. L(M1) = L(M2)
  3. L(M1) = Sigma*

The first is simply the graph accessibility problem: is there a path from q0 to an accepting state? This can be solved by depth-first search, and L(M1) is empty exactly when no accepting state is reachable. The same algorithm, applied to the complementary automaton (the one obtained by exchanging accepting and non-accepting states), solves the third problem, since L(M1) = Sigma* iff the complement of L(M1) is empty.

To test equality, remember that L1 = L2 iff L1 - L2 and L2 - L1 are both empty, and that regular languages are (effectively) closed under difference, so both differences can again be tested for emptiness by the first algorithm.

The details of the last reduction are left as an exercise.
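
These observations translate directly into a sketch, assuming both automata are complete, share the alphabet Sigma, and use the same dictionary representation as above; the helpers is_empty, complement, difference and equal are illustrative names, not part of the lecture.

    # Each automaton is a triple (delta, q0, F) with a total table
    # delta[state][symbol] -> state over a common alphabet Sigma.

    def is_empty(dfa):
        """L(M) is empty iff no accepting state is reachable from q0."""
        delta, q0, F = dfa
        stack, seen = [q0], {q0}
        while stack:
            p = stack.pop()
            if p in F:
                return False
            for q in delta[p].values():
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        return True

    def complement(dfa):
        """Exchange accepting and non-accepting states."""
        delta, q0, F = dfa
        return delta, q0, set(delta) - F

    def is_universal(dfa):
        """L(M) = Sigma* iff the complementary automaton accepts nothing."""
        return is_empty(complement(dfa))

    def difference(dfa1, dfa2, Sigma):
        """Product automaton accepting L(M1) - L(M2): run both machines in
        parallel and accept when M1 accepts but M2 does not."""
        (d1, s1, F1), (d2, s2, F2) = dfa1, dfa2
        start = (s1, s2)
        delta, stack, seen = {}, [start], {start}
        while stack:
            p1, p2 = stack.pop()
            delta[(p1, p2)] = {}
            for a in Sigma:
                q = (d1[p1][a], d2[p2][a])
                delta[(p1, p2)][a] = q
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        F = {(p1, p2) for (p1, p2) in delta if p1 in F1 and p2 not in F2}
        return delta, start, F

    def equal(dfa1, dfa2, Sigma):
        """L(M1) = L(M2) iff both differences are empty."""
        return (is_empty(difference(dfa1, dfa2, Sigma)) and
                is_empty(difference(dfa2, dfa1, Sigma)))

Note that difference only builds the reachable part of the product automaton, which is all the emptiness test needs.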

Two states p and q are not equivalent if there is some string x such that exactly one of deltaHat(p, x) and deltaHat(q, x) is in F. For a state p, let
Xp={y | deltaHat(p,y) in F}
Xp is the language that M would accept if p were the initial state. The condition above is equivalent to Xp not equal to Xq, which can be tested using the Lemma above.
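
With the helpers sketched after the Lemma (again, only an illustration), this test is a one-liner: run the equality algorithm on two copies of M that differ only in their start state.

    def states_equivalent(delta, F, p, q, Sigma):
        """True iff Xp = Xq, i.e. no string distinguishes p from q."""
        return equal((delta, p, F), (delta, q, F), Sigma)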

Actually, there is a much better algorithm. See Kozen for description and proof of correctness.
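
The better algorithm is the standard marking ("collapsing") algorithm on pairs of states, which I believe is the one Kozen describes: mark every pair consisting of an accepting and a non-accepting state, then repeatedly mark any pair that some input symbol sends to an already marked pair; once nothing changes, the unmarked pairs are exactly the equivalent ones and can be collapsed. Here is a sketch of that idea, in the same representation as above and assuming unreachable states have already been removed; it is only an illustration, not Kozen's presentation.

    from itertools import combinations

    def minimize(delta, q0, F, Sigma):
        """Pair-marking minimization sketch (unreachable states removed)."""
        states = list(delta)
        # Mark a pair as soon as some string is known to distinguish its
        # members; the empty string distinguishes accepting from
        # non-accepting states.
        marked = {frozenset((p, q)) for p, q in combinations(states, 2)
                  if (p in F) != (q in F)}
        changed = True
        while changed:
            changed = False
            for p, q in combinations(states, 2):
                pair = frozenset((p, q))
                if pair in marked:
                    continue
                # If some symbol a sends p, q to a distinguished pair, then
                # p, q are distinguished as well (prepend a to the witness).
                for a in Sigma:
                    succ = frozenset((delta[p][a], delta[q][a]))
                    if len(succ) == 2 and succ in marked:
                        marked.add(pair)
                        changed = True
                        break
        # Unmarked pairs are equivalent; collapse each class to its first member.
        def representative(p):
            return next(q for q in states
                        if q == p or frozenset((p, q)) not in marked)
        new_delta = {}
        for p in states:
            r = representative(p)
            if r not in new_delta:
                new_delta[r] = {a: representative(delta[r][a]) for a in Sigma}
        new_F = {representative(p) for p in F}
        return new_delta, representative(q0), new_F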

Kozen also has nice worked-out examples of minimization.