Rank-based linkage I: triplet comparisons and oriented simplicial complexes

02/04/2023
by   R W R Darling, et al.
0

Rank-based linkage is a new tool for summarizing a collection S of objects according to their relationships. These objects are not mapped to vectors, and “similarity” between objects need be neither numerical nor symmetrical. All an object needs to do is rank nearby objects by similarity to itself, using a Comparator which is transitive, but need not be consistent with any metric on the whole set. Call this a ranking system on S. Rank-based linkage is applied to the K-nearest neighbor digraph derived from a ranking system. Computations occur on a 2-dimensional abstract oriented simplicial complex whose faces are among the points, edges, and triangles of the line graph of the undirected K-nearest neighbor graph on S. In |S| K^2 steps it builds an edge-weighted linkage graph (S, ℒ, σ) where σ({x, y}) is called the in-sway between objects x and y. Take ℒ_t to be the links whose in-sway is at least t, and partition S into components of the graph (S, ℒ_t), for varying t. Rank-based linkage is a functor from a category of out-ordered digraphs to a category of partitioned sets, with the practical consequence that augmenting the set of objects in a rank-respectful way gives a fresh clustering which does not “rip apart“ the previous one. The same holds for single linkage clustering in the metric space context, but not for typical optimization-based methods. Open combinatorial problems are presented in the last section.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset