The Surprising Difficulty Of Counting Problems

Defining Counting Problems

Counting problems involve determining the number of solutions for a particular computational problem. For example, considering the Boolean satisfiability problem (SAT), instead of asking whether a satisfiable assignment exists, a counting version asks “how many satisfying assignments are there?” This is the #SAT problem.

More formally, counting problems take the decision version of a problem as input, but instead of outputting “yes” or “no”, they output the number of solutions. So if P is a decision problem, the associated counting problem #P maps inputs to the nonnegative integers, outputting the number of accepting computations.

Many natural counting problems arise from real-world applications like network reliability, computational biology, and physics simulations. However, while the decision versions of problems like SAT may be easy, the associated counting problems are often quite difficult.

Table of Contents

Complexity Classes for Counting

The complexity class #P contains the counting versions for all problems in NP. For example, counting the satisfying assignments for a Boolean formula is #SAT, counting the number of solutions for a constraint satisfaction problem is #CSP, and counting graph matchings falls into #P as well.

Problems in #P have connections to the polynomial hierarchy and PSPACE. Toda’s Theorem shows that every problem in the polynomial hierarchy can be reduced to #P with just one call. So #P is likely even harder than NP or the polynomial hierarchy.

However, not all counting problems are hard. For example, counting the paths between two nodes in a directed acyclic graph is in FP, and counting graph matchings is in #L. There are tractable special cases, but most natural counting problems seem quite difficult.

#P-completeness and Tractable Cases

Within #P, there is a class of #P-complete problems which are the hardest, in the sense that if any #P-complete problem is tractable, then all problems in #P are tractable. #SAT is perhaps the most famous #P-complete problem. Many other constraint satisfaction and enumeration problems are also #P-hard or #P-complete.

As one might expect based on the complexity of their decision versions, counting problems like #SAT, #CSP, or counting graph colorings seem extremely difficult in the general case. But there are certain restricted cases and problem structures where these problems do become tractable.

For example, counting colorings is tractable on trees and small treewidth graphs. Counting independent sets and matchings is easier on planar graphs. There has been extensive research characterizing which restricted but still useful cases of various counting CSPs allow for efficient exact or approximate counting.

Approximation Algorithms

Since exactly solving #P-complete problems seems out of reach, significant work focuses on developing fast approximation algorithms. These provide estimates or bounds on the solution count, instead of exact values.

A fully polynomial-time randomized approximation scheme (FPRAS) is an approximation scheme that runs in polynomial time and has reasonable accuracy. Designing FPRASes for counting problems like #SAT, even with guarantees of relative error, is still quite challenging.

Common techniques include Monte Carlo approaches using sampling and random hashes, as well as deterministic approaches like inclusion-exclusion. Hybrid methods combine sampling and deterministic counting on restricted cases to get best of both worlds. But for many problems no known FPRAS exists.

Randomized Algorithms

In addition to FPRAS-style Monte Carlo counting algorithms based on sampling, there are other ways randomness helps tackle counting problems.

For example, Dyer, Goldberg, and Jerrum developed a randomized polynomial-time approximation scheme for counting matchings in bipartite graphs, using an application of matrix permanent approximation techniques. There are also applications of statistical physics methods like the simulation of Markov chains to perform approximate counting.

Randomized reduction techniques can relax counting to decision problems, as well. So while deterministic exact solutions remain elusive, randomized methods provide ways to estimate solution counts efficiently for some useful cases.

Example Code for #SAT

Here is some simple example Python code for a basic exponential-time, exact #SAT counter. For a formula in conjunctive normal form, it tries all possible assignments, counts the satisfying ones, and returns the total number of assignments that satisfy the formula.

def sharpSAT(formula):
    num_variables = # number of variables in the formula 
    
    num_solutions = 0
    for assignment in range(2**num_variables):
        satisfies = True
        for clause in formula:
            if clause not evaluates to True under assignment:
                satisfies = False
                break
                
        if satisfies:
            num_solutions += 1
            
    return num_solutions

This simple exhaustive search algorithm runs in exponential time, so faster approximate counters would be needed for practical use on large instances. But it illustrates the concept of determining the solution count by iterating through the entire solution space.