6.4   Euclidean Graphs


This chapter under construction.

machine learning datasets

Introduce notion of distance. Euclidean. Other distances?

Single-link agglomerative clustering.

Program SingleLink.java implements single link agglomerative clustering (dense Kruskal) using the Vector.java ADT from Chapter 2.

Voronoi and Delaunay.

Rich history dating back to Descartes (1644) Dirchlet (1850). In 1854, John Snow used Voronoi diagrams to diagnose cause of Soho cholera outbreak - most people lived closer to the infected Broad Street water pump than any other non-infected pump. Named after Russian mathematician Georgy Voronoi (1908). In 2D called Voronoi diagram.

Algovision contains a number of algorithmic visualizations, including a nice Voronoi and Delaunay demo.

Invented by Borid Delaunay in 1934. Delaunay triangulation: triangulation of set of points in the plane such that no point is inside the circumcircle of any triangle. Delauanay triangulation exists and is unique if points are in general position (no three on same line, no four on same circle). Otherway it may not exist (three points on a line) or be unique (four vertices of a rectangle).

Key operation: given three points (in a triangle), does a four point lie in the circumcircle of that triangle. Give 4-by-4 determinant calculation here.

Property A. Delaunay triangulation maximizes the minimum angle among all triangulations. (Does not necessarily minimize the maximum angle.)

Property B. Delaunay triangulation has O(N) edges. (Follows from planarity.)

Property C. Delaunay triangulation contains the convex hull.

Property D. Delaunay triangulation contains the Euclidean MST.

Sweep line algorithm doesn't work directly since a site that lies in the region x <= a may create a Voronoi vertex in the region x > a. Can fix by using a beach line (points equidistant from p and sweep line). Voronoi reference Fortune's algorithm reference

Property A makes the Delaunay triangulation especially use to construct meshes in finite element methods, where a region is decomposed into into polygons. Small angles lead to numerical instability. (???)

Exercises

  1. Given a set of points for which a Voronoi vertex lies outside the convex hull of the points.
  2. Euclidean MST. Given N point in 2D in general position (no 4 points are cocircular), prove that the Euclidean MST is a subgraph of the Delaunay triangulation. (Property D)

    Solution: Let (v, w) be an edge of the MST which is not in the Delaunay triangulation. Then consider the smallest circle with v and w on the boundary, i.e., v-w is the diameter of the circle. Since v-w is not in the Delaunay triangulation, there is another point x inside the circle. Now, we have v-w is the heaviest edge on the cycle v-w-x. The cycle rule implies that v-w is not in the MST.

    David Eppstein maintains a nice list of geometric MST applications.

  3. Closest pair. Given N points in the plane, find the closest pair (in Euclidean distance).

    Solution: only need to look at Euclidean MST edges. Why? The cut property for MST.

  4. All nearest neighbors. Given N points in the plane, find a nearest neighbor (in Euclidean distance) for each. Show that you only need to consider adacent points in the Delaunay triangulation (or Euclidean MST).
  5. Bichromatic closest pair. Given N red points and N blue points in 2D, the bichromatic closest pair problem is to find a red and blue pair that are the closest. Show that there exists such a pair that is an edge in the Euclidean MST of the 2N points.

Exercises

  1. Euclidean MST. Explain why the following algorithm does not work for Euclidean MST: sort by x-coordinate and divide into two halves. Find MST in left half; find MST in right half; add shorteset edge from point in left half to point in right half.
  2. Euclidean MST. True or false. The Euclidean MST of a set points in the plane is always greater than or equal to the Euclidean MST of a subset of the points.

    Answer: False. Steiner point can decrease total cost.

  3. Single link agglomerative clustering. Rewrite the single link agglomerative clustering algorithm, but only use the upper triangular part of the matrix to conserve half the space.
  4. Implicit MST. Suppose you need to find an MST in a dense graph where the edge weights are known implicitly. That is, for any pair of vertices, you call the function dist(v, w) to get the weight of the edge between them. How would you find an MST? Justify your answer. Answer: Use the dense version of Prim's algorithm. If you use Kruskal, you need to enumerate all N^2 edges and sort them. This takes memory proportional to N^2. On the other hand, Prim's algorithm only need extra memory proportional to N.