What is it about?
Graph Community Detection or Clustering is about finding regions of dense connections within a graph. Just by looking at a graph, one could perhaps figure out some dense regions or clusters. As it turns out, this problem is quite hard at least in theory, and identifying an arbitrary number of clusters within a reasonable time not only depends on the size of the graph, but the structure. As such, researchers have proposed different algorithms, heuristics and methods to optimize the graph community detection or clustering problem. We use one such popular iterative method for graph clustering, called the Louvain method, and devise a parallel approach targeting both performance and quality, implementing it on contemporary multi-GPU systems. An important criteria for the Louvain method is a global optimization metric that indicates the quality of clustering, called modularity (a value between 0 and 1); a rather loose analogy is using the heart rate as an indicator for the overall health. Parallel implementations must also pay attention to the trajectory of modularity over clustering iterations.
Featured Image
Photo by JJ Ying on Unsplash
Why is it important?
One way to deal with large graphs in parallel for solving the clustering problem is to engage different resources on distinct portions of a graph - such that these resources can work independently and reduce the overall time. However, a resource may still have to occasionally access data currently processed by another resource. This leads to a dichotomy: use outdated data (such as modularity) to make decisions OR force all the resources to stop work and synchronize, and then use the updated data. The first option can prolong the time as it might require more steps to achieve the expected resolution in terms of modularity, and may also impact the quality of clustering. The second option may lead to serial overheads, as in resources waiting indefinitely. We try to leverage both of the above options to devise GPU-enabled Louvain graph clustering - by introducing a user-defined metric, namely "batches" - to process a graph in batches on a GPU and synchronize at the end of every batch, such that the resources will have a more current view of the data - but within a batch can proceed with the outdated data. We demonstrate that this strategy can lead to interesting trade-offs in performance and quality, and studying this trade-off will be useful in designing efficient parallel clustering methods.
Read the Original
This page is a summary of: Batched Graph Community Detection on GPUs, October 2022, ACM (Association for Computing Machinery),
DOI: 10.1145/3559009.3569655.
You can read the full text:
Contributors
The following have contributed to this page