# 6.5   Reductions

This chapter is under construction.

## Reduction.

Reduction (or simulation) is a powerful idea in computer science: a problem-solving framework that transforms a problem X into another problem Y such that a solution to the original problem X can easily be deduced from a solution to Y.

Eric Allender: "Reduction provides an abstraction. If A efficiently reduces to B and B efficiently reduces to A, then A and B are equivalent in a meaningful sense: they are two different ways to look at the same problem. Instead of infinitely many computational problems, we are left with a smaller number of classes of equivalent problems. Nothing had prepared the computing community for the shocking insight that there are really just a handful of fundamentally different computational problems that people want to solve." Reductions partition natural computational problems into meaningful groups based on resource bounds.

## Upper bounds.

• 3-COLLINEAR to sorting.

• Convex hull to sorting (Graham scan).

• MEDIAN to SORTING.

• Bipartite matching to maxflow. BipartiteMatchingToMaxflow.java computes a maximum cardinality matching in a bipartite graph by reducing the problem to max flow. The order of growth of the running time is E V in the worst case since each augmentation increases the cardinality of the matching by one. The Hopcroft-Karp algorithm improves the running time to E sqrt(V).

• LP standard form: a standard form linear program is { max cx : Ax <= b, x ≥ 0 }. Show how to reduce a general linear program (with ≤, ≥, and = constraints) to standard form.

• Maximum flow to linear programming.

• Assignment problem to linear programming.

• PERT to topological sort in a DAG.

• USTCONN (s-t connectivity in undirected graphs) to STCONN (s-t connectivity in directed graphs).

• Shortest paths on undirected graphs to shortest paths on directed graphs.

• Euclidean MST to Delaunay triangulation.

• LP feasibility reduces to LP. (Use binary search. Need to bound value of a bounded LP.)

• Two-person zero-sum games to LP. TwoPersonZeroSumGame.java

• LP reduces to two-person zero-sum games. (Exercise 26 in Bradley, Hax, and Magnanti)
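The bipartite matching reduction above (BipartiteMatchingToMaxflow.java) can be sketched in a self-contained way. The class and method names below are hypothetical, and a compact Edmonds-Karp max flow stands in for whatever implementation you prefer: add a source s with a unit-capacity edge to every left vertex and a sink t with a unit-capacity edge from every right vertex; the max flow value equals the maximum matching cardinality.

```java
import java.util.*;

public class MatchingViaMaxflow {
    // Edmonds-Karp max flow on a dense capacity matrix. With unit capacities,
    // every augmenting path pushes exactly one unit of flow.
    static int maxflow(int[][] cap, int s, int t) {
        int n = cap.length, flow = 0;
        while (true) {
            int[] parent = new int[n];
            Arrays.fill(parent, -1);
            parent[s] = s;
            Deque<Integer> queue = new ArrayDeque<>();
            queue.add(s);
            while (!queue.isEmpty() && parent[t] == -1) {      // BFS for shortest augmenting path
                int v = queue.remove();
                for (int w = 0; w < n; w++)
                    if (cap[v][w] > 0 && parent[w] == -1) { parent[w] = v; queue.add(w); }
            }
            if (parent[t] == -1) return flow;                  // no augmenting path left
            for (int w = t; w != s; w = parent[w]) {           // augment by one unit
                cap[parent[w]][w]--;
                cap[w][parent[w]]++;
            }
            flow++;
        }
    }

    // Maximum cardinality matching in a bipartite graph with left vertices
    // 0..L-1 and right vertices 0..R-1; edges[i] = {left, right}.
    static int matching(int L, int R, int[][] edges) {
        int n = L + R + 2, s = n - 2, t = n - 1;
        int[][] cap = new int[n][n];
        for (int i = 0; i < L; i++) cap[s][i] = 1;             // source -> each left vertex
        for (int j = 0; j < R; j++) cap[L + j][t] = 1;         // each right vertex -> sink
        for (int[] e : edges) cap[e[0]][L + e[1]] = 1;         // left -> right
        return maxflow(cap, s, t);
    }

    public static void main(String[] args) {
        // 3-by-3 bipartite graph with a perfect matching: 0-0, 1-1, 2-2.
        int[][] edges = { {0, 0}, {0, 1}, {1, 1}, {2, 2} };
        System.out.println(matching(3, 3, edges));   // prints 3
    }
}
```

Since each augmentation increases the matching cardinality by one, there are at most V augmentations, matching the E V bound stated above.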

## Lower bounds.

• Element distinctness to sorting.

• Closest pair 2D to Euclidean MST 2D.

• Convex hull to Voronoi / Delaunay triangulation.

• 3-SUM to 3-COLLINEAR.

• Exercise: 3-SUM to 4-SUM.

• Exercise: 3-SUM to 3-SUMplus

• Exercise: 3-COLLINEAR to 3-CONCURRENT.

• 3-SAT to INDEPENDENT-SET.

• INDEPENDENT-SET to ILP.

• Element distinctness. Given N elements x1, ..., xN from a totally ordered universe, is there a pair i ≠ j such that xi = xj? There is an Omega(N log N) lower bound in the comparison tree model and the algebraic decision tree model; O(N log N) is easy by sorting.

Solution 1 (comparison tree model): Given N distinct values, let ai be the ith smallest element. Any comparison-based algorithm for the problem must compare ai against ai-1; otherwise the algorithm could not distinguish between the case ai-1 < ai and ai-1 = ai. Consider two different permutations of the N elements. There is some i such that ai-1 precedes ai in the first permutation but ai precedes ai-1 in the second. Since every comparison-based algorithm must compare ai and ai-1, the algorithm runs differently on the two permutations. Thus, there are at least N! leaves.

Solution 2 (comparison tree model): suppose the elements are all distinct and a_1 < a_2 < ... < a_N. Any correct algorithm for element distinctness must compare a_i with a_i+1 for each i < N; if not, the algorithm would produce the same output if we changed a_i+1 to a_i (but this would change the answer from no duplicates to has duplicates). The set of compares the algorithm performs forms a DAG; topologically ordering it (in linear time) recovers the sorted order. Thus, the algorithm can be used to sort distinct elements and the sorting lower bound applies.

Remark: these arguments apply to the comparison tree model of computation but not to the linear decision tree or algebraic decision tree models of computation. Here's a more general argument (courtesy of Jeff Erickson): Consider the space R^n of all possible inputs. The set of positive inputs has n! connected components, one for each permutation. On the other hand, the subset of inputs that can reach any leaf in a linear decision tree is convex, and therefore connected. Thus, any linear decision tree that determines uniqueness has at least n! leaves. The result for linear decision trees is On the complexity of computations under varying sets of primitives by Dobkin and Lipton. The result for algebraic decision trees is Lower bounds for algebraic computation trees by Ben-Or. The result for the special case of integer inputs is more subtle: see A lower bound for the integer element distinctness problem by Lubiw and Rácz.
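The O(N log N) upper bound for element distinctness mentioned above is a one-line scan after sorting: duplicates, if any, end up adjacent. A minimal sketch (hypothetical class name):

```java
import java.util.Arrays;

public class ElementDistinctness {
    // Returns true if all elements are distinct. Sorting costs O(N log N)
    // comparisons; any duplicate pair becomes adjacent in sorted order.
    static boolean distinct(double[] a) {
        double[] b = a.clone();            // don't mutate the caller's array
        Arrays.sort(b);
        for (int i = 1; i < b.length; i++)
            if (b[i] == b[i - 1]) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(distinct(new double[] { 3.0, 1.0, 4.0, 1.0 }));  // prints false
        System.out.println(distinct(new double[] { 3.0, 1.0, 4.0, 2.0 }));  // prints true
    }
}
```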

• 3-SUM.

• 3-SAT.

## Linear programming.

Generalization of systems of linear equations to linear inequalities.

### Simplex algorithm.

Invented by George Dantzig in 1948. According to Dongarra and Sullivan, the simplex algorithm is one of the top 10 algorithms with the greatest influence in science and engineering in the 20th century. Generalization of Gaussian elimination to handle inequalities. Program LinearProgramming.java is a bare bones implementation. It solves { max cx : Ax <= b, x >= 0 } assuming that b >= 0. Hence x = 0 is a basic feasible solution (and we don't need to worry about Phase I of simplex).

Fundamental theorem of LP. Every LP of form (P) has the following three properties: (i) if it has no optimal solution, it is either infeasible or unbounded; (ii) if it has a feasible solution, it has a basic feasible solution; (iii) if it has an optimal solution, it has a basic optimal solution.

Strong duality theorem. Every LP of form (P) has a dual (D) { min yb : yA >= c, y >= 0 }.

If (P) is bounded and feasible, then (D) is bounded and feasible. Moreover, they have the same optimal value. Simplex algorithm solves both problems simultaneously.

Remarkable property. In practice, the simplex algorithm typically terminates in at most 2(m + n) pivots, where m is the number of equations and n is the total number of variables (original plus slack).

Pitfalls. Degeneracy, cycling.

## Assignment problem.

Formulate as an LP. Relies on the Birkhoff-von Neumann theorem: all extreme points of the assignment problem LP are {0, 1}. AssignmentProblemToLP.java reduces the assignment problem (max weight perfect matching) to linear programming. EasyAssignmentProblemToLP.java puts everything in main and doesn't extract the dual solution. Hungarian.java implements the Hungarian algorithm; the order of growth of the running time in the worst case is N^4. AssignmentProblem.java implements the successive shortest paths algorithm; the order of growth of the running time in the worst case is N^3 log N. AssignmentProblemDense.java implements the successive shortest paths algorithm; the order of growth of the running time in the worst case is N^3.

### Two-person zero-sum games.

Reduces to LP. Can assume the payoff matrix M is strictly positive (adding a constant to each entry increases the value of the game by that constant, but does not change the optimal strategies). (P) min { x1 + ... + xm : x >= 0, M^t x >= 1 } and (D) max { y1 + ... + yn : y >= 0, My <= 1 }. Normalize the solution vectors x* and y* to get the row player's (minimizing) and column player's (maximizing) optimal strategies; the normalizing constant is the value of the game.

Trick. Replace variables xi = pi / V; then p1 + p2 + ... + pM = 1 becomes x1 + x2 + ... + xM = 1/V, and max V becomes min 1/V. Game theory, pp. 6-7.

Program ZeroSumGame.java implements this approach.

Note: any LP can be put into this form by an appropriate change of variables.

The minimax theorem. (von Neumann). For every finite two-person zero-sum game, there exists a value V and a mixed strategy for each player such that (i) given the row player's strategy, the best possible payoff for the column player is V, and (ii) given the column player's strategy, the best possible payoff for the row player is -V.

## Linear programming solvers.

• LinearProgramming.java is a bare-bones version of the simplex algorithm. It assumes b >= 0, so that x = 0 is a starting basic feasible solution. TwoPhaseSimplex.java is a bare-bones version of the two-phase simplex algorithm, which removes the assumption b >= 0.

• OR-Objects contains a Java linear program solver. LPDemo.java demonstrates how to solve a linear program using or124.jar.

• QSopt is a Java linear program solver created by David Applegate, William Cook, Sanjeeb Dash, and Monika Mevenkamp. It can be used at no cost for research or education purposes. QSoptSolver.java solves a linear program in LP format, such as beer.lp.

• Matlab contains a linear programming solver in the optimization toolbox.
    [wayne:tombstone] ~> matlab

    < M A T L A B (R) >
    Copyright 1984-2009 The MathWorks, Inc.
    Version 7.9.0.529 (R2009b) 64-bit (glnxa64)
    August 12, 2009

    >> A = [5 15; 4 4; 35 20];
    >> b = [480; 160; 1190];
    >> c = [13; 23];
    >> lb = [0; 0];
    >> ub = [inf; inf];
    >> x = linprog(-c, A, b, [], [], lb, ub)

    x =
       12.0000
       28.0000

• CPLEX is a high-performance mathematical programming solver for linear programming, mixed integer programming, and quadratic programming. It supports interfaces to C, C++, Java, Python, Matlab, and Microsoft Excel. It is also accessible via the modeling systems including AIMMS, AMPL, GAMS, and MPL.

• AMPL is a modeling language for mathematical programming. The files beer.mod and beer.dat specify the model and data for the brewery problem.
    [wayne:tombstone] ~> ampl
    ILOG AMPL 9.100
    AMPL Version 20021038 (SunOS 5.8)
    ampl: model beer.mod;
    ampl: data beer.dat;
    ampl: solve;
    ILOG CPLEX 9.100
    CPLEX 9.1.0: optimal solution; objective 800
    2 dual simplex iterations (1 in phase I)
    ampl: display x;
    x [*] :=
     ale 12
    beer 28
    ;
    ampl: display constraints.dual;
    constraints.dual [*] :=
    corn 1
    hops 2
    malt 0
    ;

• Microsoft Excel has a primitive solver add-in for Windows, but it is no longer available for Mac.

#### Q + A

Q. Undirected shortest paths with negative weights to directed shortest paths?

A. A more clever reduction is needed to avoid negative cycles (via nonbipartite matching).

#### Exercises

1. gcd and lcm. Reduce lcm to gcd.
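One possible solution sketch for Exercise 1 (hypothetical class name), using the identity lcm(a, b) = a / gcd(a, b) * b; dividing before multiplying avoids overflow for long arguments.

```java
public class LcmViaGcd {
    // Euclid's algorithm for the greatest common divisor.
    static long gcd(long a, long b) { return b == 0 ? a : gcd(b, a % b); }

    // lcm reduces to gcd: a * b = gcd(a, b) * lcm(a, b).
    static long lcm(long a, long b) { return a / gcd(a, b) * b; }

    public static void main(String[] args) {
        System.out.println(lcm(4, 6));    // prints 12
        System.out.println(lcm(21, 6));   // prints 42
    }
}
```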
2. Table k-sum. Given a k-by-N table of integers, are there k that sum to 0, one from each of the k rows?
3. Coin distinctness. Given N coins and a balance scale, determine if the coins all weigh different amounts. Prove that any algorithm that solves the problem has complexity at least N log N. Hint: you may use the fact that given N real numbers, element distinctness requires at least N log N comparisons in the linear decision tree model (where you can make comparisons between arbitrary linear combinations of the N elements, e.g., x1 + 2x2 < x3).
4. Set equality. Given two sets S and T, does S equal T? Give an O(N log N) algorithm and a matching lower bound.
5. Set subset. Given two sets S and T, is S a subset of T? Give an O(N log N) algorithm and a matching lower bound.
6. Set disjointness. Given two sets S and T, does S intersect T equal the empty set? Give an O(N log N) algorithm and a matching lower bound. (Since S and T are sets, there are no duplicated elements in either.)

Solution. To get the O(N log N) upper bound, sort the union of the elements in S and T and check for duplicates.
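The solution above can be sketched as follows (hypothetical class name). Since S and T individually contain no duplicates, any adjacent duplicate in the sorted union must have one copy from each set.

```java
import java.util.Arrays;

public class SetDisjointness {
    // Returns true if s and t (each duplicate-free) have no common element.
    static boolean disjoint(int[] s, int[] t) {
        int[] union = new int[s.length + t.length];
        System.arraycopy(s, 0, union, 0, s.length);
        System.arraycopy(t, 0, union, s.length, t.length);
        Arrays.sort(union);                              // O(N log N)
        for (int i = 1; i < union.length; i++)
            if (union[i] == union[i - 1]) return false;  // element in both sets
        return true;
    }

    public static void main(String[] args) {
        System.out.println(disjoint(new int[] {1, 3, 5}, new int[] {2, 4}));  // prints true
        System.out.println(disjoint(new int[] {1, 3, 5}, new int[] {3, 6}));  // prints false
    }
}
```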

7. Set disjointness. Given two sets S and T, each in ascending order, does S intersect T equal the empty set? Prove an Omega(N log N) lower bound or give an O(N) algorithm.
8. Lower bound for merging two lists. Show any comparison-based algorithm for merging two lists of size N requires at least 2N-1 comparisons in the worst case. Solution: lower bound holds even if all elements are distinct. If the ith and (i+1)st smallest elements are in different lists, then they must be compared.
9. Lower bound for binary search. log(N+1) comparisons required. Solution: lower bound holds even if all elements are distinct....
10. Euclidean TSP and MST lower bounds. Give an Omega(N log N) lower bound for Euclidean TSP or Euclidean MST (requires the algebraic decision tree model to make sense, since you want to allow a reasonable algorithm to compute distances). Solution: choose N points on a line. The optimal tour must visit the points in ascending order.
11. Lower bound for mode. Theta(N log N) bound for mode. Solution: you could solve the element distinctness problem by finding the mode and checking whether it occurs more than once.
12. Lower bound for closest pair. Theta(N log N) bound for closest pair. Solution: you could solve the element distinctness problem in sub-linearithmic time if you had a sub-linearithmic closest-pair algorithm: find the closest pair and check whether the two elements are equal.
13. Smallest and second smallest. Describe how to find both the smallest and second smallest element using at most N + log N comparisons. Solution: divide the elements into pairs and compare the two elements in each pair; recur on the N/2 winners. After N-1 comparisons, we have the minimum element. The smallest element wins every round, of which there are at most ceiling(lg N), so it is compared against at most ceiling(lg N) other elements, and one of these must be the second smallest. Keep track of the comparisons; then the second smallest is the minimum of the at most ceiling(lg N) elements that lost directly to the overall winner.
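The tournament method described in Exercise 13 can be sketched as follows (hypothetical class name). Each survivor carries the list of values it has beaten; the runner-up is the smallest value beaten by the champion.

```java
import java.util.ArrayList;
import java.util.List;

public class MinAndSecondMin {
    // Returns {smallest, second smallest} via a knockout tournament:
    // N - 1 comparisons find the minimum; the runner-up is among the at most
    // ceil(lg N) elements that lost directly to it.
    static int[] minTwo(int[] a) {
        List<Integer> round = new ArrayList<>();          // surviving values
        List<List<Integer>> beaten = new ArrayList<>();   // values each survivor beat
        for (int x : a) { round.add(x); beaten.add(new ArrayList<>()); }
        while (round.size() > 1) {
            List<Integer> nextRound = new ArrayList<>();
            List<List<Integer>> nextBeaten = new ArrayList<>();
            for (int i = 0; i + 1 < round.size(); i += 2) {
                int w = i, l = i + 1;                     // winner, loser of this pair
                if (round.get(l) < round.get(w)) { w = i + 1; l = i; }
                beaten.get(w).add(round.get(l));
                nextRound.add(round.get(w));
                nextBeaten.add(beaten.get(w));
            }
            if (round.size() % 2 == 1) {                  // odd element gets a bye
                nextRound.add(round.get(round.size() - 1));
                nextBeaten.add(beaten.get(round.size() - 1));
            }
            round = nextRound;
            beaten = nextBeaten;
        }
        int min = round.get(0), second = Integer.MAX_VALUE;
        for (int x : beaten.get(0)) second = Math.min(second, x);
        return new int[] { min, second };
    }

    public static void main(String[] args) {
        int[] result = minTwo(new int[] { 5, 2, 9, 7, 1, 8, 3, 6 });
        System.out.println(result[0] + " " + result[1]);  // prints "1 2"
    }
}
```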
14. Counting inversions. Prove a Theta(N log N) lower bound. This is an open research question.
15. Nuts and bolts. Prove a Theta(N log N) lower bound for the nuts and bolts problem in the quicksort section. [A matching worst case upper bound of O(N log N) is known, but it is remarkably complicated.]
16. Existence symbol table. Suppose you had a comparison-based algorithm that supports insert and exists in O(1) comparisons per operation. Explain why this is impossible in this model of computation. Solution: it would solve element distinctness in O(N) time: for each of the N elements, check whether it's already in the data structure (if so, then this is a duplicate); if not, insert it.
17. Model a constraint of the form x ≤ y using integer programming (Ax = b, x >= 0, x integral).
18. Model a constraint of the form x = { 0 or 1 } using integer programming.
19. Model a constraint of the form x = { 1 or 2 or 4 } using integer programming.
20. Tait coloring. Model 3-edge coloring cubic graph using IP.

21. Assign organ donors to patients.

22. Bipartite vertex cover. Reduce to max flow.

23. Transitive reduction and transitive closure. Reduce transitive reduction to transitive closure and vice versa (when running time is measured only as a function of the number of vertices V). Also reduces to Boolean matrix multiplication.

24. 3SUM'. Given three sets of integers A, B, and C of total size n, are there a in A, b in B and c in C such that a + b = c? Prove that 3SUM linear-time reduces to 3SUM' and vice versa.

Solution. To show that 3SUM reduces to 3SUM', set A = S, B = S, and C = -S.

To show that 3SUM' reduces to 3SUM, assume that all integers are positive (if not, add a sufficiently large k to each element in A and B and 2k to each element in C). Set m = 2 max(A, B, C). For each element a in A, put a + m in S. For each element b in B, put b in S. For each element c in C, put -c - m in S.
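The 3SUM' to 3SUM direction above can be sketched as follows (hypothetical class name, assuming all inputs positive). The shift by m = 2 max forces any zero-sum triple to take one element from each of the three groups; a brute-force 3SUM check is included only to exercise the transformation on a tiny input.

```java
import java.util.ArrayList;
import java.util.List;

public class ThreeSumPrimeToThreeSum {
    // Build S from A, B, C so that S has a zero-sum triple iff a + b = c
    // for some a in A, b in B, c in C: (a + m) + b + (-c - m) = a + b - c.
    static long[] reduce(long[] a, long[] b, long[] c) {
        long m = 0;
        for (long x : a) m = Math.max(m, x);
        for (long x : b) m = Math.max(m, x);
        for (long x : c) m = Math.max(m, x);
        m *= 2;
        List<Long> s = new ArrayList<>();
        for (long x : a) s.add(x + m);      // group 1: large positives
        for (long x : b) s.add(x);          // group 2: small positives
        for (long x : c) s.add(-x - m);     // group 3: large negatives
        long[] out = new long[s.size()];
        for (int i = 0; i < out.length; i++) out[i] = s.get(i);
        return out;
    }

    // Brute-force 3SUM over distinct indices, for testing only.
    static boolean threeSum(long[] s) {
        for (int i = 0; i < s.length; i++)
            for (int j = i + 1; j < s.length; j++)
                for (int k = j + 1; k < s.length; k++)
                    if (s[i] + s[j] + s[k] == 0) return true;
        return false;
    }

    public static void main(String[] args) {
        long[] a = { 1, 4 }, b = { 2, 7 }, c = { 6, 20 };      // 4 + 2 = 6, a yes-instance
        System.out.println(threeSum(reduce(a, b, c)));          // prints true
    }
}
```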

25. Inequalities satisfied with equality. Given a system of linear inequalities Ax <= b, design a linear program to determine which inequalities among Ax <= b must be satisfied with equality in any feasible solution x.
26. Equality constraint. Model a linear programming equality constraint using two <= constraints.
27. Unrestricted variables. Model a linear programming unrestricted variable x using two nonnegative variables y and z.
28. Unbounded LP. If (P) is unbounded, then there exists a vector d >= 0 such that Ad <= 0 and cd > 0. First explain why if such a vector d exists, then the problem is unbounded. Next, modify the simplex algorithm to report such a vector when the LP is unbounded. Answer: by assumption b >= 0 and x = 0 is feasible. Thus α d is feasible for any positive constant α and has objective value α cd. The vector d can be identified by checking if the min ratio rule fails. In this case, the entries in column q are nonpositive (Ad <= 0) and the objective function coefficient is positive (cd > 0).
29. Reduced costs. Modify the simplex algorithm to report the reduced costs (shadow prices). Give economic interpretation.
30. Dantzig's steepest edge rule. Modify the simplex algorithm so that it always chooses the most positive objective function coefficient.
31. Cycling. Give a pathological linear programming input that causes the simplex algorithm (using the steepest edge rule) to cycle.
32. Bottleneck assignment problem. Given N men and N women. Each person has M attributes. Each person specifies a set of desirable attributes of people of the opposite gender. Find a perfect matching in which the most unlucky person is matched with a partner with the fewest number of specified attributes. Reduce this problem to the assignment problem.

Solution. Create an edge-weighted graph where the weight of an edge is the minimum number of desirable attributes satisfied by either person. The goal is to maximize the weight of the minimum-weight edge in an assignment. One solution is to use binary search (eliminate all edges of weight less than a given threshold) and solve the resulting bipartite perfect matching problem.

33. Chinese postman problem. Given a strongly connected digraph G, find a directed cycle of minimum length that traverses each edge at least once. (It turns out that the best cycle visits each edge at most twice.) The graph is Eulerian if each vertex is balanced: its indegree equals its outdegree. Divide the unbalanced vertices into two sets: L = vertices with outdegree < indegree, R = vertices with outdegree > indegree. By adding a directed path from a vertex in L to one in R, we improve the balance. Form a weighted bipartite graph on (L, R) where the weight of an edge from v to w is the length of the shortest path from v to w in G. Find a min weight matching and add these edges to G to make it Eulerian. Then, find an Euler cycle. This is known as the Edmonds-Johnson (1973) algorithm. (A similar algorithm works for undirected graphs, but it needs a min weight perfect matching in a nonbipartite graph.)