Linear programming Introduce by example: Maximizing profit Making hammers & rubber mallets, using steel & rubber. Profit: $2 per hammer, $1 per mallet maximize 2h + 1m Limited supply of materials: Hammer = 3 steel + 1 rubber Mallet = 1 steel + 2 rubber Have 9 steel, 8 rubber Can't make negative things How to represent these constraints? 3h + 1m <= 9 1h + 2m <= 8 h,m >= 0 We'll ignore the real-world issue of integrality. E.g., consider h,m to be millions of units each, instead. Integer linear programming is much harder (NP-complete). Emphasize linear equation form Diagram as n-dimensional polytope (n=#vars) Intersection of each constraint's half-plane "feasible" space Diagram for above example Suggestions on how to solve above example? Polytope has at most m faces (m=#inequalities) Some inequalities may be redundant Under-constrained -> polytope not closed Over-constrained -> polytope empty Diagram gradiants of goal formula Optimum, if any, at some vertex. Check vertices for above example Example: Network flow Network flow definition reminder max sum_v f(s,v) f(u,v) <= c(u,v) capacity f(u,v) = -f(v,u) skew symmetry sum_v f(u,v) = 0, for u!=s,t flow conservation Equivalent to a n-dimensional polygonal space. Many optimization problems can be set up as LP problems, although more specialized algs often faster. General form maximize sum of c.x subject to Ax <= b and x >= 0 how to represent minimization problem? how to represent equality constraints? how to represent strict inequality constraints? how to represent constraints other than non-negativity? How to solve? Simplex -- Danzig 1947 Move vertex to vertex along edges Fairly simple linear algebra Can be optimized a lot Potentially exponential, but good in practice Ellipsoid method (the first interior point method) -- Khachian 1979 Complicated Poly time, but generally impractical Karmarkar's alg. (an interior point method) -- Karmarkar 1984 Complicated Competitive with simplex in practice Other interior point methods since then In practice, simplex & interior point algs used, often in same package They tend to work best on different problems We'll look at simplex Also numerous parallel algorithms, which is an active research area Importance Many real-world optimization problems can be presented as LP problems Intro examples are good, albeit tiny, examples Real-world problems often have thousands of variables & constraints Other real-world optimization problems can be approximated as LP problems Non-linear or integer linear problems are hard to solve Present simplex by example... Use above example, relating each step back to the picture. Slack variables 3h + 1m + x1 = 9 1h + 2m + x2 = 8 Standard form Basic & nonbasic variables x1 = 9 - 3h - 1m x2 = 8 - 1h - 2m ---------------- z = 2h + 1m Represents feasible solution h=0,m=0, with goal=0. Pivot Entering & leaving variables Either h,m can enter -- increasing them increases z. Choose h Heuristic: largest constant in goal -- increasing h increases goal fastest. Either x1,x2 can leave. Choose x1 Heuristic: provides tightest bound on h. x1 = 9 - 3h - 1m -> h = 1/3 (9 - 1m - x1) = 3 - 1/3 m - 1/3 x1 x2 = 8 - 1(3 - 1/3 m - 1/3 x1) - 2m = 5 - 5/3 m + 1/3 x1 z = 2(3 - 1/3 m - 1/3 x1) + 1m = 6 + 1/3 m - 2/3 x1 h = 3 - 1/3 m - 1/3 x1 x2 = 5 - 5/3 m + 1/3 x1 ----------------------- z = 6 + 1/3 m - 2/3 x1 Represents feasible solution m=0,h=3, goal=6. Only m can enter. Either h,x2 can leave. Choose x2 Heuristic: provides tightest bound on h. x2 = 5 - 5/3 m + 1/3 x1 -> m = 3/5 (5 + 1/3 x1 - x2) = 3 + 1/5 x1 - 3/5 x2 h = 3 - 1/3 (3 + 1/5 x1 - 3/5 x2) - 1/3 x1 = 2 - 1/15 x1 - 5/15 x1 + 1/5 x2 = 2 - 2/5 x1 + 1/5 x1 z = 6 + 1/3 (3 + 1/5 x1 - 3/5 x2) - 2/3 x1 = 7 + 1/15 x1 - 10/15 x1 - 1/5 x2 = 7 - 3/5 x1 - 1/5 x2 m = 3 + 1/5 x1 - 3/5 x2 h = 2 - 2/5 x1 + 1/5 x2 ------------------------- z = 7 - 3/5 x1 - 1/5 x2 Represents feasible solution m=3,h=2, goal=7. Nothing can enter. Optimal solution. Finds optimal solution (intuition) No local minima when following vertices Since polygon is convex and goal formula is linear Algorithm keeps increasing goal formula's value ...well, not quite, only doesn't decrease it... When can pivot not increase goal? How to avoid cycling, with goal never increasing? In practice, rare. Simple heuristics of choosing entering and/or leaving variables can avoid. Example: Pick smallest index of tied variables (Proof omitted.) More on correctness later... How can algorithm detect if there is no optimum? If an entering variable is unbounded. Example x2 = 5 + 2x3 - x4 - 3x1 x5 = 7 - 3x4 - 4x1 ------------------------ z = 5 + x3 - x4 - x1 Only x3 can enter A bit more about worst case When can worst case happen? Note that algorithm depends on a couple heuristics for pivoting. For Danzig's original heuristics Klee & Minty, 1972, showed a n-var LP problem with a 2^n-vertex polytope such that every vertex is visited For many other heuristics Similar examples But, unknown whether such examples exist for all pivoting heuristics. What if origin isn't feasible? -- Initialization Explain by example original problem max c.x, Ax<=b, x>=0 max x1 - x2 + x3 2x1 - x2 + 2x3 <= 4 2x1 - 3x2 + x3 <= -5 -x1 + x2 - 2x3 <= -1 x1,x2,x3 >= 0 Note that x1=x2=x3=0 doesn't satisfy constraints. create "auxiliary problem" min x0, Ax-x0<=b, x,x0>=0 with new variable x0 Auxiliary problem is feasible: E.g., x1=x2=x3=0, set x0 "large enough" -- How large? Original problem feasible iff (aux. problem feasible with x0=0). max -x0 -x0 + 2x1 - x2 + 2x3 <= 4 -x0 + 2x1 - 3x2 + x3 <= -5 -x0 - x1 + x2 - 2x3 <= -1 x0,x1,x2,x3 >= 0 x4 = 4 + x0 - 2x1 + x2 - 2x3 x5 = -5 + x0 - 2x1 + 3x2 - x3 x6 = -1 + x0 + x1 - x2 + 2x3 ------------------------------ w = - x0 Initial dictionary of aux. problem is infeasible, but one pivot will always make it feasible. It finds the feasible point previously described. Proof omitted. Only x0 can enter Choose x5 ("most infeasible") to leave x0 = 5 + 2x1 - 3x2 + x3 + x5 x4 = 9 - 2x2 - x3 + x5 x6 = 4 + 3x1 - 4x2 + 3x3 + x5 ------------------------------ w = -5 - 2x1 + 3x2 - x3 - x5 Only x2 can enter Choose x6 to leave x2 = 1 + 0.75 x1 + 0.75 x3 + 0.25 x5 - 0.25 x6 x0 = 2 - 0.25 x1 - 1.25 x3 + 0.25 x5 + 0.75 x6 x4 = 7 - 1.5 x1 - 2.5 x3 + 0.5 x5 + 0.5 x6 ----------------------------------------------- w = -2 + 0.25 x1 + 1.25 x3 - 0.25 x5 - 0.75 x6 Choose x3 to enter Choose x0 to leave x3 = 1.6 - 0.8 x0 - 0.2 x1 + 0.2 x5 + 0.6 x6 x2 = 2.2 - 0.6 x0 + 0.6 x1 + 0.4 x5 + 0.2 x6 x4 = 3 + 3 x0 - x1 - x6 -------------------------------------------- w = - x0 Done, x0=0 is optimal. Furthermore, x0=0 is feasible, so original problem feasible at x1=0, x2=2.2, x3=1.6. Return to original problem: For initial dictionary, could plug those values into original problem and simplify. Easier and equivalent, just modify last dictionary of aux. problem. x3 = 1.6 - 0.2 x1 + 0.2 x5 + 0.6 x6 x2 = 2.2 + 0.6 x1 + 0.4 x5 + 0.2 x6 x4 = 3 - x1 - x6 ----------------------------------- z = ??? z = x1 - x2 + x3 = x1 - (2.2 + 0.6x1 + 0.4x5 + 0.2x6) + (1.6 - 0.2x1 + 0.2x5 +0.6x6) = -0.6 + 0.2x1 - 0.2x5 + 0.4x6 x3 = 1.6 - 0.2 x1 + 0.2 x5 + 0.6 x6 x2 = 2.2 + 0.6 x1 + 0.4 x5 + 0.2 x6 x4 = 3 - x1 - x6 ------------------------------------ z = -0.6 + 0.2 x1 - 0.2 x5 + 0.4 x6 And proceed as usual... (details omitted) Duality Recall that maximizing network flow is equivalent to minimizing cuts, and the basis of the correctness proof A similar "duality" property holds here. "Duality" has numerous meanings in math. These are both examples of duality in optimization problems. In essense, to maximize the objective, want to minimize the slack Dual Problem Given original "primal" problem maximize sum of c.x subject to Ax <= b and x>=0 Define "dual" problem minimize sum of b.y subject to (A^T)y >= c and y>=0 or, in standard form, maximize sum of -b.y subject to -(A^T)y <= -c and y>=0 Thus, #rows (constraints) in original = #cols (vars) in dual #cols (vars) in original = #rows (constraints) in dual Example Primal max x1 - x2 + x3 2x1 - x2 + 2x3 <= 4 2x1 - 3x2 + x3 <= -5 -x1 + x2 - 2x3 <= -1 x1,x2,x3 >= 0 Dual min 4y1 - 5y2 - y3 2y1 + 2y2 - y3 >= 1 -y1 - 3y2 + y3 >= -1 2y1 + y2 - 2y3 >= 1 y1,y2,y3 >= 0 Claim: These have same optimum. See textbook for details.