Tuesday, December 26, 2006

Kinds of Hard

There are two kinds of people in the world: those who divide everything into two categories, and those who . . . um . . . you know . . . Well, I think it’s funny . . . . I guess I’ve been watching too much Daily Nut, where pretty witty young things start every show with jokey chit-chat.

Back to regular programming: My hypothesis is that the hardest problems are changing as we build an increasingly intangible world. The question is, what makes a problem hard? So as usual when I’m stuck, here’s a taxonomy.

Let’s break problem solving into three stages:
  1. Framing the problem
  2. Solving it
  3. Validating the solution
For each of these stages, one can identify degrees of difficulty. A difficulty in just one of the stages can make the overall problem hard. For example, the problems of board games like go and chess are easy to frame, and it’s clear when a solution has been reached; however, playing the game is hard. One can mix and match the degrees of difficulty in each stage to build a very large number of problem types.

If you know of situations that don’t fit into this scheme, Dear Reader, please let me know. I’ll be looking out for them, too.

1. Framing

In the easiest case, a problem can be clearly stated (e.g. game rules). It gets harder when the problem statement evolves as the solution emerges (e.g. wicked problems). There are cases where it’s not agreed that a problem exists (e.g. global warming until recently). And then there are cases where you don’t even know there’s a problem (e.g. Donald Rumsfeld’s “unknown unknowns”).

2. Solving

Some problems are easy to solve, like tic-tac-toe. Others have multiple solutions (e.g. teaching kids to read). Even mathematically hard problems can have millions of solutions (e.g. superstring theory). Many problems are soluble, but may or may not be hard (e.g. optimization). Some cannot be solved at all (e.g. calculating Omega, Chaitin’s halting probability). In some cases it’s not clear how to even start solving the problem, let alone what path to follow (e.g. speculative software projects). In others, humans think they instinctively know the answer, but in fact get it wrong (e.g. mistakes we make interpreting statistics).

3. Validating

Checking the acceptability of a solution may be a trivial mechanical matter even though finding the solution in the first place is hard (e.g. factorizing primes). Some problems may have many candidate solutions, but no way to know in advance which one is right, or best (e.g. improving the situation in Iraq). There may be no way of knowing whether an action has solved a problem (e.g. building a road to address congestion). There may be no agreement in a community about whether what’s been presented is a valid solution (e.g. the computer proof of the four-color theorem).

Some other approaches

There are many ways of categorizing problems, and I’m just beginning my collection.

Nancy Roberts works in the “wicked problem” tradition and defines three types of problems in “Wicked Problems and Network Approaches to Resolution” (International Public Management Review Vol. 1, No. 1, 2000):
  • Simple: there is consensus on a problem definition and solution
  • Complex: although problem solvers agree on what the problem is, there is no consensus on how to solve it.
  • Wicked: problems with four characteristics – no consensus on the definition of the problem; a vast and diverse group of stake holders; constraints on the solution are constantly shifting; and no consensus on the solution of the problem.
Another distinction goes all the way back to Aristotle, dividing problems between those susceptible to formal logic, and those needing argumentation. There are differences in method between the two, but also differences in validation. Argumentation is made before a specific audience, and the case succeeds if the audience is persuaded. Logic’s audience is hidden, encoded in its rules and the criteria of valid proofs. The difference between logic and rhetoric is reminiscent of the distinction I drew between “plain ol’ hard” and “human-hard” problems, but is not quite the same: some problems of logic, e.g. in statistics, are human-hard.

In computer science, problem hardness is often measured by the time it takes to reach a solution as a function of the problem size, n. If the time taken is a polynomial function of the problem size, e.g. order(n) or order(n^3), the problem is said to be in the complexity class P. There is a hierarchy of such classes. Problems in the EXPTIME class are solvable by a deterministic Turing machine in time order(2^p(n)), where P(n) is a polynomial function of n. P is a proper subset of EXPTIME. The class NP contains decision problems where solutions can be verified (but not necessarily found) in polynomial time: P {subsets} NP {subsets} EXPTIME. The hardest problems in NP are NP-complete. For many NP-complete problems, typical cases are actually easy to solve; Cheeseman, Kanefsky and Taylor showed that such problems can be summarized by an “order parameter,” and that the hard problems occure at a critical value of such a parameter (“Where the Really Hard Problems Are,” Proceedings of the IJCAI-91).

Coda

As always, many thanks to the many people who’ve helped me think about this topic. In this case I’m particularly grateful to Miladin Pavlicic for helping me understand some different kinds of software project challenges.

No comments: