Intro You've probably heard the term "NP-Complete", our next subject. But likely don't have a strong idea of what that means. Start with a quick overview of some concepts and terminology for context. Background We've seen lots of problems that can be solved in O(log n) O(n), O(n^2), O(2^n), ... (sequential deterministic) time. Although we've concentrated on running time, could also look at space. Only count the space needed during computation, not that also needed to hold the input. O(1), O(log n), O(n), O(n^2), ... space is common. O() notation abstracts some details of costs. constant factors lower-order terms Complexity classes To make some large distinctions, it can be useful to further abstract these. P = sequential, deterministic, polynomial time EXP = sequential, deterministic, exponential time NP = sequential, nondeterministic, polynomial time LOGSPACE = sequential, deterministic, logarithmic space PSPACE = sequential, deterministic, polynomial space ... Important aside: What is nondeterminism? Explain with examples Three ways to describe: 1) Algorithm can guess (correctly), but must check correctness. 2) Algorithm can guess (all possible choices in parallel), but must check correctness. 3) Algorithm is also given a "proof", i.e., the correct guesses, and must check them. Other complexity classes, e.g., NC = problems that can be solved by a sequential circuit with a polynomial number of bounded-fan-in gates and having a polylogarithmic depth. = problems that can be solved by a parallel computer with polynomial processors in polylogarithmic time. ... Many based on alternative models of computation. E.g., different models of parallelism. Many based on alternative logics E.g., NC -- circuits = propositional logic Equivalent to hard it is to state the problem in the logic Also, classes allowing randomness or other computational features. Comparing complexity classes Allows us to quantify the usefulness of concepts like nondeterminism and parallelism. Allows us to quantify the usefulness of "big jumps" in resources, e.g., logarithmic, polynomial, exponential. Pragmatically, P considered "tractable", everything else "intractable". I.e., if it's not polynomial deterministic time, it's too slow. By definition, e.g., P <= EXPTIME LOGSPACE <= PSPACE <= EXPSPACE P <= PSPACE EXPTIME <= EXPSPACE P <= NP How are other things related? E.g., time vs. space? Which inequalities are strict or equalities? Can prove, e.g., LOGSPACE <= NC <= P <= NP <= PSPACE <= EXPTIME <= EXPSPACE BIG OPEN QUESTION: P = NP ? Most people think not, but still open question. Will indirectly explore this next. Pragmatically interesting since there are useful problems known to be in NP, but not known to be in P. Proving P = NP would lead to much better algorithms. Focus on such problems. NP examples Boolean algebra formula satisfiability (SAT) What is a proof? How can we check it? Finding longest acyclic path in graph What is a proof? How can we check it? Note that proof can be large, relative to graph These are also "NP-complete", but need more background to define term. Decision problems As mentioned in course intro, sufficient to limit attention to decision problems. Determining f(x)=y is equivalent to what decision problem? Reductions Simple idea: can solve one problem A by using sol'n to another problem B as a subroutine. More specifically, can find f,g such that A(x) = g(B(f(x))) A reduces to B A <= B draw diagram Note: g is typically trivial, and is even omitted in some presentations. Proving a reduction = defining a suitable f,g Polynomial-time reductions When talking about NP, willing to have f,g use deterministic poly. time. I.e., f,g are relatively easy compared to NP. A <=p B Polynomial-time equivalence A <=p B and B <=p A Examples sorting <=p convex hull Others from earlier in the course? Using reductions Most familiar use: B is easy/known. Showing A <= B shows that A is easy/known. Another use: A is hard. Showing A <= B shows that B is hard. (Since if B were easy, then A would be easy, too.) NP-completeness Some NP problems are "NP-complete". E.g., two previous NP examples Usefulness & motivation: NP-complete problems "stand together". If one is in P, all are. NP-complete is strict subset of NP. Draw Venn diagram. B is NP-hard = for all A in NP, A <=p B Not all NP problems are NP-hard Trivial example: Program that always returns 42 NP-hard includes problems not in NP Trivial example: an NP oracle Draw Venn diagram. B is NP-complete = B is NP and NP-hard Draw Venn diagram. Thus, for any NP-complete A,B, A =p B. More generally, also interesting to look at PSPACE-complete, EXPTIME-complete, etc. Proving something NP-complete Once we show one problem A as NP-complete, easier to show B NP-complete. Show B is NP. Show A <=p B. Since all NP problems reduce to A, then all NP problems also reduce to B. In _principle_, when showing B NP-complete, can use any NP-complete A. For a given B, some choices of A are easier. Showing a first problem NP-complete is tougher. Resort to original definition of NP-hard. SAT is NP-complete SAT = Boolean algebra formula satisfiability = Boolean combinatorial circuit satisfiability SAT in NP Shown in introduction SAT is NP-hard for all A in NP, A <=p SAT Given arbitrary A, show reduction. How? Intuition: Computer is just a circuit! If A is NP, we have an algorithm, and can translate that into circuit, i.e., computer with some memory cells initialized with code. But computers aren't _sequential_ circuits. How to fix intuition? Each given input has a specific size. Build a circuit by fully unrolling for that size. For A, we have subroutine AC, which builds a circuit for A. From A's input size, AC computes A's running time (n cycles). AC builds circuit by unrolling n times. Circuit output is the one bit of output. SAT uses this circuit. Further details of construction are tedious. Using SAT When showing other problems NP-hard, sufficient to show SAT <=p B. Show diagram. Reduction f must "translate" any circuit to a B-input. Often difficult, because circuits can have many forms. Would be easier if we restricted the circuits... 3CNF-SAT is NP-complete 3CNF form = ... and (a or -b or c) and (-c or -d or e) and ... 3CNF-SAT in NP Same idea as with SAT 3CNF is NP-hard Sufficient to show what? SAT <=p 3CNF-SAT Show diagram. Intuition: Let 3CNF-SAT problems' variables be the "wires" of the SAT problem. For each gate in SAT problem, create equivalent 3CNF-SAT clauses. E.g., a NAND b = c <-> (a | c) & (b | c) & (-a | -b | -c) Two more examples... See handout.