Intersection Emptiness for DFAs: Can We Solve It Faster Than Quadratic Time?

Emptiness Testing for DFA Intersection

Determining whether the intersection of the languages accepted by two deterministic finite automata (DFAs) is empty is a fundamental problem in automata theory and formal verification. The standard algorithm takes time proportional to the product of the two state counts, which is quadratic when both DFAs have about n states. As instances grow, this quadratic cost becomes the bottleneck. Researchers have therefore looked for faster approaches that use pruning, randomization, and word-level parallelism to reduce the work in practice. Whether a truly subquadratic worst-case algorithm exists for general DFAs remains open.

The Quadratic Bottleneck

The standard algorithm uses the cross-product construction. It builds the product DFA, whose states are pairs (p, q) with p a state of the first DFA and q a state of the second, and then tests it for emptiness by searching for a reachable accepting pair with a depth-first or breadth-first search. If the two DFAs have n and m states over an alphabet Σ, the product has at most n·m states and |Σ|·n·m transitions, so the search takes O(|Σ|·n·m) time. When both DFAs have about n states and the alphabet has constant size, this is the familiar O(n²) bound.
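The search described above can be sketched in C++ as follows. This is a minimal sketch, assuming complete DFAs over symbols numbered 0..k-1; the `Dfa` struct and the function name `intersectionEmpty` are illustrative choices for this article, not part of any library. It allocates one "seen" flag per potential pair, which is exactly where the quadratic memory and time bound comes from.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical representation: a complete DFA over symbols 0..k-1.
struct Dfa {
    std::vector<std::vector<int>> delta;  // delta[state][symbol] = successor state
    std::vector<bool> accepting;          // accepting[state]
    int start = 0;
};

// Returns true iff no word is accepted by both a and b.
bool intersectionEmpty(const Dfa& a, const Dfa& b) {
    assert(!a.delta.empty() && !b.delta.empty());
    const std::size_t k = a.delta[0].size();
    assert(b.delta[0].size() == k);  // both DFAs must share one alphabet

    const std::size_t nb = b.delta.size();
    std::vector<bool> seen(a.delta.size() * nb, false);  // one flag per pair (p, q)
    std::queue<std::pair<int, int>> frontier;
    seen[a.start * nb + b.start] = true;
    frontier.push({a.start, b.start});

    while (!frontier.empty()) {
        auto [p, q] = frontier.front();
        frontier.pop();
        if (a.accepting[p] && b.accepting[q]) return false;  // a common word exists
        for (std::size_t c = 0; c < k; ++c) {
            const int np = a.delta[p][c];
            const int nq = b.delta[q][c];
            if (!seen[np * nb + nq]) {
                seen[np * nb + nq] = true;
                frontier.push({np, nq});
            }
        }
    }
    return true;  // no accepting pair is reachable
}
```

Each pair is enqueued at most once and each dequeued pair scans k symbols, which matches the O(|Σ|·n·m) bound for DFAs with n and m states.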

The cross-product construction is simple and intuitive, and it lets us reason directly about the intersection language. For large inputs, however, many of the n² potential pairs may be unreachable from the initial pair, and building the full product up front wastes effort on them. Exploring the product on the fly avoids that waste, but the worst case stays quadratic because all n² pairs can be reachable.

Improving Beyond Quadratic Time

If we avoid generating all O(n²) product states explicitly and explore only the “relevant” part of the product DFA, without compromising correctness, significant speedups are possible. Modern algorithms combine state-space pruning, early exit as soon as a reachable accepting state is found, and other reachability-analysis techniques. The objective is to spend effort only on the reachable product states that the emptiness decision actually requires.
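To make the early-exit idea concrete, here is a hedged C++ sketch that stops at the first reachable accepting pair and returns the word that leads to it. The `Dfa` struct, the function name `commonWord`, and the choice to print symbols as the digits '0'..'9' (so at most ten symbols) are assumptions made for this illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <queue>
#include <string>
#include <vector>

// Hypothetical representation: a complete DFA over symbols 0..k-1.
struct Dfa {
    std::vector<std::vector<int>> delta;  // delta[state][symbol] = successor state
    std::vector<bool> accepting;
    int start = 0;
};

// Returns a shortest word accepted by both DFAs, or std::nullopt if the
// intersection is empty. Symbols are rendered as the digits '0'..'9'.
std::optional<std::string> commonWord(const Dfa& a, const Dfa& b) {
    const std::size_t k = a.delta.at(0).size();
    const std::size_t nb = b.delta.size();
    assert(b.delta.at(0).size() == k && k <= 10);

    struct Visit { bool seen = false; std::size_t parent = 0; char symbol = 0; };
    std::vector<Visit> info(a.delta.size() * nb);  // indexed by p * nb + q
    std::queue<std::size_t> frontier;
    const std::size_t root = a.start * nb + b.start;
    info[root].seen = true;
    frontier.push(root);

    while (!frontier.empty()) {
        const std::size_t id = frontier.front();
        frontier.pop();
        const std::size_t p = id / nb, q = id % nb;
        if (a.accepting[p] && b.accepting[q]) {  // early exit: rebuild the witness
            std::string word;
            for (std::size_t cur = id; cur != root; cur = info[cur].parent)
                word.push_back(info[cur].symbol);
            std::reverse(word.begin(), word.end());
            return word;
        }
        for (std::size_t c = 0; c < k; ++c) {
            const std::size_t next = a.delta[p][c] * nb + b.delta[q][c];
            if (!info[next].seen) {
                info[next] = {true, id, static_cast<char>('0' + c)};
                frontier.push(next);
            }
        }
    }
    return std::nullopt;  // every reachable pair was explored, none accepting
}
```

Because the search is breadth-first, the first accepting pair it meets is at minimum distance from the initial pair, so the returned word is a shortest witness. On non-empty intersections the search can stop long before the whole product has been explored; on empty ones it still has to visit every reachable pair.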

For example, a partial cross product avoids the full quadratic blowup by generating only those state pairs that are actually reached from the initial pair when both DFAs read the same input symbols. Other algorithms decompose the DFAs and test fragments whose emptiness results can be combined into an answer for the whole. Compact data structures such as tree-encoded DFAs can also limit the blowup. The common idea is to use reachability reasoning to work with an effective problem size that is closer to linear.

Subquadratic Randomized Methods

Randomization is one route toward subquadratic algorithms for DFA intersection emptiness. One approach targets an expected running time of O(n²/log n). At a high level, it randomly selects terminal states and uses them to partition each DFA into smaller blocks, which are then intersected pairwise and independently. The emptiness test combines the results for the block pairs, and the savings come from working with small fragments instead of the monolithic DFAs.

Further refinements use randomness in other ways, for example by distributing product states across hash-table buckets so that states whose bucket marks them as unreachable are never touched. Skipping unnecessary work in this way can yield significant speedups. However, randomization only bounds the expected running time. Worst-case instances can still require quadratic work, so additional techniques are needed for robust efficiency.
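As a simple illustration of hashing in this setting, the sketch below stores only the pairs it has actually reached in a hash set instead of allocating a flag for every potential pair. This is a deliberately simpler technique than the bucket-based refinement described above, not an implementation of it, and it does not improve the worst case. The `Dfa` and `Result` structs and the name `intersectionEmptySparse` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical representation: a complete DFA over symbols 0..k-1.
struct Dfa {
    std::vector<std::vector<int>> delta;  // delta[state][symbol] = successor state
    std::vector<bool> accepting;
    int start = 0;
};

struct Result {
    bool empty;            // true iff no common word exists
    std::size_t explored;  // number of distinct pairs discovered
};

Result intersectionEmptySparse(const Dfa& a, const Dfa& b) {
    const std::size_t k = a.delta.at(0).size();
    assert(b.delta.at(0).size() == k);
    const std::uint64_t nb = b.delta.size();
    // Pack a pair (p, q) into one 64-bit key.
    auto key = [nb](int p, int q) {
        return static_cast<std::uint64_t>(p) * nb + static_cast<std::uint64_t>(q);
    };

    std::unordered_set<std::uint64_t> seen;  // expected O(1) insert and lookup
    std::vector<std::uint64_t> stack;        // depth-first worklist
    seen.insert(key(a.start, b.start));
    stack.push_back(key(a.start, b.start));

    while (!stack.empty()) {
        const std::uint64_t id = stack.back();
        stack.pop_back();
        const std::uint64_t p = id / nb, q = id % nb;
        if (a.accepting[p] && b.accepting[q]) return {false, seen.size()};
        for (std::size_t c = 0; c < k; ++c) {
            const std::uint64_t next = key(a.delta[p][c], b.delta[q][c]);
            if (seen.insert(next).second) stack.push_back(next);  // newly discovered
        }
    }
    return {true, seen.size()};
}
```

Memory use is proportional to the number of reachable pairs rather than to n², and the hash operations take constant time only in expectation, which mirrors the expected-time caveat above.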

Faster Methods Using Word-Level Parallelism

Modern CPUs operate on 64-bit words, so a single bitwise instruction can process 64 bits at once. Algorithms designed to exploit this encode sets of DFA states as bit vectors and carry out state transitions as Boolean matrix-vector products, updating up to 64 states per operation. This lets them traverse the product DFA implicitly, without materializing its states as individual objects. The bit vectors record which pairs have already been reached, so no pair is visited twice.

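#include <bit>      // std::countr_zero (C++20)
#include <cstdint>
#include <vector>

using Mask = std::uint64_t;  // bit q set <=> state q of DFA B is in the set (B has at most 64 states)

struct Dfa {
    std::vector<std::vector<int>> delta;  // delta[state][symbol] = successor state
    std::vector<bool> accepting;
    int start = 0;
};

// Image of a set of B states under one symbol, visiting only the set bits.
Mask step(const Dfa& b, Mask set, int symbol) {
    Mask image = 0;
    for (; set != 0; set &= set - 1)  // clear the lowest set bit each round
        image |= Mask{1} << b.delta[std::countr_zero(set)][symbol];
    return image;
}

// Returns true iff no word is accepted by both A and B.
// reach[p] is the set of B states q such that the pair (p, q) is reachable.
bool intersectionEmpty(const Dfa& a, const Dfa& b, int alphabetSize) {
    Mask acceptB = 0;
    for (std::size_t q = 0; q < b.accepting.size(); ++q)
        if (b.accepting[q]) acceptB |= Mask{1} << q;
    std::vector<Mask> reach(a.delta.size(), 0), pending(a.delta.size(), 0);
    std::vector<int> work{a.start};
    pending[a.start] = Mask{1} << b.start;
    while (!work.empty()) {
        int p = work.back();
        work.pop_back();
        Mask fresh = pending[p] & ~reach[p];  // pairs (p, q) not handled yet
        pending[p] = 0;
        if (fresh == 0) continue;
        reach[p] |= fresh;
        if (a.accepting[p] && (fresh & acceptB) != 0) return false;  // accepting pair reached
        for (int c = 0; c < alphabetSize; ++c) {
            int next = a.delta[p][c];
            Mask image = step(b, fresh, c) & ~reach[next];
            if (image != 0) { pending[next] |= image; work.push_back(next); }
        }
    }
    return true;  // no accepting pair is reachable
}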

The code skeleton above packs sets of DFA B states into 64-bit machine words and searches the implicit product DFA, so the bookkeeping on state sets (union, membership, and emptiness tests) costs a few bitwise operations instead of a loop over states. Note that the product of two DFAs is quadratic in size, not exponential, and that word-level parallelism speeds up the search by at most a factor equal to the word size. The worst-case running time therefore remains quadratic up to that factor and does not become linear.

Open Problems

For general DFAs without restrictions, no algorithm with a truly subquadratic worst-case running time, that is O(n^(2-ε)) for some constant ε > 0, is known. Both a near-linearithmic O(n log n) algorithm and the optimal O(n) bound therefore remain open. Key challenges include pathological instances in which almost all state pairs are reachable, and the difficulty of designing an approach that stays within the target running time in every step.

Another active research direction studies subclasses of DFAs with special properties that make the intersection emptiness problem easier. Methods tailored to such subclasses may yield linear-time algorithms under specific constraints. There is also potential for heuristic methods that make further use of randomness in the general case.