CSCE 470 Lecture 18

Top-down: Bisecting k-means
Bottom-up: Hierarchical Agglomerative Clustering (HAC)

Friday, October 4, 2013

Clustering Review

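Measure the distance between each point and every other point, and store the results in a lower triangular matrix.

Constructing this matrix takes $O(n^{2})$ time, since it holds one entry for each of the $n(n-1)/2$ pairs of points.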
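A minimal sketch of building that table, assuming points are tuples of numbers and the distance is Euclidean. The names `euclidean` and `lower_triangular_distances` are made up for illustration and are not from the lecture.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def lower_triangular_distances(points):
    """Row i holds the distances from point i to points 0..i-1.

    The nested loops visit each of the n(n-1)/2 pairs once, hence O(n^2).
    """
    return [[euclidean(points[i], points[j]) for j in range(i)]
            for i in range(len(points))]

points = [(0, 0), (3, 4), (6, 8)]
table = lower_triangular_distances(points)
print(table)
```

Row 0 is empty because point 0 has no earlier point to compare against; the diagonal (distance from a point to itself) is always zero, so it is not stored.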

Calculating the distance between two points is now as easy as looking it up in the table!
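One detail of the lookup: a lower triangular table stores each pair only once, so the larger index has to select the row. A small sketch, assuming the table is a list of rows where row i has i entries; `lookup` is a hypothetical helper name and the table values are made-up sample data.

```python
def lookup(table, i, j):
    """Distance between points i and j from a lower triangular table."""
    if i == j:
        return 0.0          # the diagonal is not stored
    hi, lo = max(i, j), min(i, j)
    return table[hi][lo]    # the larger index picks the row

# Sample table for three points (values chosen for illustration).
table = [[], [5.0], [10.0, 5.0]]
print(lookup(table, 0, 2), lookup(table, 2, 0))
```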

Merge the pair with the smallest distance in the table.
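A sketch of a single merge step, assuming the same list-of-rows table layout and clusters represented as lists of point indices. The helper `closest_pair` and the sample distances are illustrative assumptions only.

```python
def closest_pair(table):
    """Return (distance, i, j), with i > j, for the smallest table entry."""
    return min((d, i, j)
               for i, row in enumerate(table)
               for j, d in enumerate(row))

# Made-up distances between four points; every point starts as its own cluster.
table = [[], [5.0], [10.0, 2.0], [9.0, 7.0, 6.0]]
clusters = [[0], [1], [2], [3]]

d, i, j = closest_pair(table)
merged = clusters[j] + clusters[i]
# Drop the two merged clusters and append the combined one.
clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
print(d, clusters)
```

After a merge the table no longer matches the cluster list, so the distances involving the new cluster have to be recomputed before the next step.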

Now we have a way to calculate distance between:

point and point: dist(p,q) = sqrt(sum((i-j)**2 for i,j in zip(p,q)))
point and cluster: dist(p,c) = min(dist(p,q) for q in c) (or max(); a design choice: min gives single-link, max gives complete-link)
cluster to cluster: dist(c1,c2) = min(dist(p,q) for p in c1 for q in c2) (or max(); the same design choice)
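The three distances above can be put together into a runnable sketch of bottom-up clustering. This is an illustration under assumptions, not the lecture's implementation: the function names, the `link` parameter (pass `min` or `max`), and the stopping rule "merge until k clusters remain" are all choices made here. For brevity it recomputes distances on every pass instead of reading them from a precomputed table.

```python
import math

def dist(p, q):
    """Point to point: Euclidean distance."""
    return math.sqrt(sum((i - j) ** 2 for i, j in zip(p, q)))

def cluster_dist(c1, c2, link=min):
    """Cluster to cluster: min is single-link, max is complete-link."""
    return link(dist(p, q) for p in c1 for q in c2)

def point_cluster_dist(p, c, link=min):
    """Point to cluster: treat the point as a one-element cluster."""
    return cluster_dist([p], c, link)

def hac(points, k, link=min):
    """Repeatedly merge the two closest clusters until only k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Closest pair of clusters (i > j) under the chosen linkage.
        _, a, b = min((cluster_dist(clusters[i], clusters[j], link), i, j)
                      for i in range(len(clusters)) for j in range(i))
        clusters[b].extend(clusters[a])  # b < a, so deleting a keeps b valid
        del clusters[a]
    return clusters

points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
result = hac(points, 3)
print(result)
```

On this small made-up data set, `min` and `max` linkage give the same three clusters; they tend to differ when clusters are elongated or unevenly sized.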