What is it about?

This method groups data objects according to their similarity. "Similarity" can be defined arbitrarily, as long as it can be expressed as a number between 0 (completely dissimilar) and 1 (identical). Among other things, the method makes it easy to identify objects that are dissimilar to all other members of the data set, which can be useful for finding rare or anomalous cases.
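To make the idea of an arbitrary similarity score concrete, here is a minimal Python sketch. It is not the published algorithm: it only builds a table of pairwise similarities from a user-supplied function and flags items whose similarity to every other item falls below a chosen threshold. The names `similarity_matrix`, `isolated_items`, and `toy_similarity`, the sample data, and the threshold of 0.5 are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def similarity_matrix(
    items: Sequence[T], sim: Callable[[T, T], float]
) -> List[List[float]]:
    """Build a symmetric table of pairwise similarities in [0, 1]."""
    n = len(items)
    s = [[1.0] * n for _ in range(n)]  # every item is identical to itself
    for i in range(n):
        for j in range(i + 1, n):
            value = sim(items[i], items[j])
            if not 0.0 <= value <= 1.0:
                raise ValueError("similarity must lie between 0 and 1")
            s[i][j] = s[j][i] = value
    return s


def isolated_items(s: List[List[float]], threshold: float) -> List[int]:
    """Indices of items whose similarity to every other item is below threshold."""
    n = len(s)
    return [
        i
        for i in range(n)
        if all(s[i][j] < threshold for j in range(n) if j != i)
    ]


def toy_similarity(a: float, b: float) -> float:
    """Hypothetical similarity: 1 for equal values, approaching 0 as they diverge."""
    return 1.0 / (1.0 + abs(a - b))


data = [1.0, 1.1, 0.9, 5.0, 5.2, 20.0]  # made-up sample values
s = similarity_matrix(data, toy_similarity)
print(isolated_items(s, threshold=0.5))  # [5]: only the value 20.0 has no close neighbor
```

Any function that returns a number between 0 and 1 could replace `toy_similarity`, for example a correlation-based score or a measure of overlap between two sets.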

Why is it important?

There are many standard clustering and partitioning algorithms in wide use. This one combines certain features not found together in other algorithms. Examples include the ability to specify a hard minimum on the mutual similarity of members assigned to a group and the ability to specify similarity based on completely arbitrary criteria.
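The following Python sketch illustrates what a hard minimum on mutual similarity means in practice. It uses a simple greedy, order-dependent scheme chosen purely for illustration and should not be read as the published algorithm. The function name `greedy_groups`, the hand-written similarity table, and the threshold of 0.5 are assumptions.

```python
from typing import List


def greedy_groups(s: List[List[float]], min_sim: float) -> List[List[int]]:
    """Assign each item to the first group in which its similarity to EVERY
    existing member is at least min_sim; otherwise start a new group.

    Illustrative greedy scheme only: the result depends on item order.
    """
    groups: List[List[int]] = []
    for i in range(len(s)):
        for group in groups:
            if all(s[i][j] >= min_sim for j in group):
                group.append(i)
                break
        else:
            # No existing group satisfies the hard minimum for item i.
            groups.append([i])
    return groups


# Hand-written symmetric similarity table for four hypothetical items.
s = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.3, 0.1],
    [0.8, 0.3, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
]

print(greedy_groups(s, min_sim=0.5))  # [[0, 1], [2], [3]]
```

Item 2 is quite similar to item 0 (0.8) but not to item 1 (0.3), so it cannot join their group. That is the effect of requiring every pair within a group to meet the minimum, rather than only the nearest neighbor or a group average.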

Perspectives

I devised this method because I had a specific task that seemed impossible to accomplish with the existing methods I knew about. Because the method is so simple, I assumed that I was merely reinventing something that already existed. After a lengthier search and consultation with clustering experts, however, I concluded that it was probably new, which led to my decision to publish it.

Grant Petty
University of Wisconsin-Madison

Read the Original

This page is a summary of: The Pairwise Similarity Partitioning algorithm: a method for unsupervised partitioning of geoscientific and other datasets using arbitrary similarity metrics, Artificial Intelligence for the Earth Systems, August 2022, American Meteorological Society,
DOI: 10.1175/aies-d-22-0005.1.
