Subexponential Time Algorithms for CNF-SAT

The Boolean satisfiability problem (SAT) asks whether there exists an assignment to variables that satisfies a given Boolean formula. SAT is a canonical NP-complete problem with widespread applications in areas like hardware verification, software testing, and artificial intelligence. Conjunctive normal form SAT (CNF-SAT) restricts attention to formulas expressed as ANDs (conjunctions) of ORs (disjunctions), the form most practical solvers work with. Exact CNF-SAT solvers have seen huge advances but still struggle on hard instances. This article reviews subexponential time algorithms, meaning techniques that beat brute-force enumeration, developed to accelerate solving.

Defining the CNF-SAT Problem

A CNF formula F on Boolean variables x₁, …, xₙ is a conjunction (AND) of m clauses, where each clause is a disjunction (OR) of literals. A literal is a variable xᵢ or its negation ¬xᵢ. An assignment of true/false values to the variables satisfies a clause if at least one literal evaluates to true under that assignment. F is satisfiable (SAT) if there exists some assignment satisfying all m clauses simultaneously.
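
To make the later code sketches concrete, this article represents formulas the way the DIMACS format does: each variable is a positive integer, its negation is the corresponding negative integer, and a formula is a list of clauses, each a set of literals. For example:

# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
formula = [{1, -2}, {2, 3}, {-1, -3}]
# One satisfying assignment: x1 = True, x2 = True, x3 = False.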

A brute-force approach tests all 2^n possible assignments of the n variables, checking each one against all m clauses, for a worst-case running time on the order of 2^n times the formula size. This exponential blow-up severely limits performance as n grows. Real-world problems often have thousands of variables, motivating research into more efficient algorithms.
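
As a concrete baseline, here is a minimal brute-force checker over all 2^n assignments using the representation above; it only illustrates the naive bound and is not meant to be run at scale.

from itertools import product

def brute_force_sat(clauses, n):
    """Check all 2^n assignments; assignment[i] holds the value of variable i + 1."""
    for assignment in product([False, True], repeat=n):
        if all(any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
               for clause in clauses):
            return True
    return False

# brute_force_sat(formula, 3) returns True for the example formula above.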

Subexponential Time Algorithms

An algorithm runs in subexponential time if, for input size n, its time complexity is bounded by 2^(o(n)), growing asymptotically more slowly than 2^(cn) for every constant c > 0. Such algorithms can still be expensive in practice, but the shrinking exponent lets them handle larger inputs than brute-force enumeration. For general CNF-SAT no 2^(o(n)) algorithm is actually known (the exponential time hypothesis conjectures none exists), so the practical goal is to push the running time as far below 2^n as possible.

DPLL backtracking search, clause-learning (CDCL) solvers, survey propagation, and Davis-Putnam resolution exemplify methods that beat exhaustive enumeration on wide classes of instances. Though conceptually straightforward, optimizing and combining their mechanisms to handle real-world problems involves deep theory and engineering.

Implementing DPLL Search

The Davis-Putnam-Logemann-Loveland (DPLL) algorithm performs a systematic backtracking search over the 2^n variable assignments. Its core splits the problem by branching on an unset variable: it recurses on one assignment of that variable, then backtracks and recurses on the opposite assignment. Additional propagation rules prune infeasible branches early.

In code, each recursive call first checks two base cases: if no clauses remain, the current partial assignment already satisfies the formula and we return true; if some clause has become empty, the branch is contradictory and we return false. Otherwise we select an unset variable, try assigning it true and then false, and recurse on the simplified subformula for each branch. Passing a simplified copy of the formula down the recursion makes backtracking implicit.


def dpll(clauses):
    """Basic DPLL: branch on an unset variable, recurse, and backtrack on failure."""
    if not clauses:                        # no clauses left: everything is satisfied
        return True
    if any(len(c) == 0 for c in clauses):  # an empty clause can never be satisfied
        return False

    var = abs(next(iter(clauses[0])))      # pick an unset variable to branch on
    for lit in (var, -var):                # try x = true, then x = false
        if dpll(assign(clauses, lit)):
            return True
    return False                           # both branches failed

def assign(clauses, lit):
    """Simplify the formula under the assumption that literal 'lit' is true."""
    simplified = []
    for clause in clauses:
        if lit in clause:
            continue                       # clause already satisfied: drop it
        simplified.append(clause - {-lit}) # the opposite literal is now false
    return simplified

Adding pruning rules like unit propagation strengthens this simple backtracking approach. Key optimizations described next also drastically speed it up.
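
For instance, unit propagation repeatedly assigns any literal that appears in a single-literal clause, since that clause forces its value. Below is a minimal sketch reusing the assign helper from the DPLL code above; calling it at the top of each dpll call, before the base-case checks, prunes many branches.

def unit_propagate(clauses):
    """Repeatedly satisfy unit clauses (clauses containing exactly one literal)."""
    clauses = [set(c) for c in clauses]
    while True:
        unit = next((c for c in clauses if len(c) == 1), None)
        if unit is None or any(len(c) == 0 for c in clauses):
            return clauses   # no units remain, or a conflict (empty clause) was derived
        clauses = assign(clauses, next(iter(unit)))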

Scaling with Advanced Heuristics

Carefully selecting the variable branching order has a large impact on the size of the search tree, which grows exponentially with its depth. Advanced heuristics exploit problem structure to resolve the most constrained branches first. The maximum occurrences in minimum-size clauses (MOM) rule, for example, counts how often each variable appears in the shortest clauses and branches on the most constraining one.
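
As an illustration, here is a minimal MOM-style selection over the signed-integer representation used above; real solvers use more elaborate scores, so treat this as a sketch.

from collections import Counter

def choose_mom_variable(clauses):
    """Pick the variable occurring most often among the shortest clauses (MOM rule)."""
    min_len = min(len(c) for c in clauses)   # assumes at least one non-empty clause
    counts = Counter(abs(lit)
                     for clause in clauses if len(clause) == min_len
                     for lit in clause)
    return counts.most_common(1)[0][0]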

Random restart policies abandon stagnating searches and retry with fresh branching decisions. Injecting random noise or random walks also shakes the search out of unproductive regions. Coupling such stochastic elements with deterministic ranking metrics like MOM and activity scores produces state-of-the-art solvers.
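
A common restart schedule follows the Luby sequence. The sketch below is hypothetical in one respect: limited_dpll stands for a budgeted search (a DPLL/CDCL run that gives up after a fixed number of conflicts and returns None), which is not defined here.

import itertools

def luby(i):
    """i-th term (1-indexed) of the Luby sequence 1, 1, 2, 1, 1, 2, 4, 1, ..."""
    k = 1
    while (1 << k) - 1 < i:
        k += 1
    if (1 << k) - 1 == i:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)

def solve_with_restarts(clauses, base_budget=100):
    """Restart the search with a conflict budget that grows along the Luby sequence."""
    for i in itertools.count(1):
        result = limited_dpll(clauses, base_budget * luby(i))  # hypothetical budgeted search
        if result is not None:   # None signals the budget ran out, so restart
            return result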

Putting It All Together: A Complete Solver

Incorporating learned clauses, propagation, restarts, randomized branching, and heuristic ranking yields an effective end-to-end CNF-SAT system. Parallelizing the search or caching redundant work further improves throughput, and treating solving as a portfolio of complementary configurations addresses variability across problem classes.
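
As a rough sketch of the portfolio idea, the code below runs several diversified copies of the earlier dpll sketch in separate processes and takes whichever finishes first. Shuffling clause order by seed is only an illustrative stand-in for the much richer configuration differences real portfolio solvers exploit.

import concurrent.futures
import random

def shuffled_dpll(clauses, seed):
    """Worker: crudely diversify the search by shuffling clause order before solving."""
    rng = random.Random(seed)
    clauses = [set(c) for c in clauses]
    rng.shuffle(clauses)
    return dpll(clauses)             # dpll from the sketch above

def portfolio_solve(clauses, seeds=(0, 1, 2, 3)):
    """Run diversified copies in parallel; all agree, so the first answer wins."""
    with concurrent.futures.ProcessPoolExecutor() as pool:
        futures = [pool.submit(shuffled_dpll, clauses, s) for s in seeds]
        done, _ = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        for f in futures:
            f.cancel()               # cancel runs that have not started yet
        # note: leaving the 'with' block still waits for workers already running
        return next(iter(done)).result()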

Extensive benchmarking documents order-of-magnitude speedups on standard test sets. Yet room remains on challenging application instances with thousands of variables. Domain-specific symmetry breaking, and careful encodings when reducing problems to this NP-complete framework, often aid solving.

Future Directions

Recent SAT competitions suggest that progress on classical benchmarks is saturating. Attacking long-standing theoretical barriers around exponential lower bounds may require fundamentally different techniques. Areas like proof complexity and symmetry provide connected questions motivating further research.

On the practical side, integrating modern machine learning for richer heuristic self-adaptation shows promise. Declarative programming paradigms also simplify solver experimentation. Ultimately, substantial progress requires bridging the gap between theory and practice in this deep domain.
