CUDA-DClust+: Revisiting Early GPU-Accelerated DBSCAN Clustering Designs

Madhav Poudel, Michael Gowanlock

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

Density-based clustering algorithms are widely used unsupervised data mining techniques that find clusters of points in dense regions separated by low-density regions. DBSCAN, the density-based algorithm considered here, is inherently sequential, which limits its parallel implementation. Several parallel algorithms for multi-core CPUs and many-core GPUs have been presented in the literature. One such algorithm for the GPU is CUDA-DClust. In this paper, we propose a new GPU-accelerated DBSCAN algorithm with several optimizations. In comparison to prior work, our algorithm, CUDA-DClust+: (i) computes the indexing structure on the GPU, (ii) uses kernel fusion to combine the index search and cluster expansion kernels, which reduces communication and synchronization overhead with the host, and (iii) gives primary control of seed list management to the GPU rather than the CPU, which further decreases CPU-GPU communication overhead. We compare our algorithm to three state-of-the-art parallel algorithms in the literature on six real-world datasets. We find that our algorithm achieves a speedup of up to 23x over the fastest GPU algorithm.
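
For readers unfamiliar with DBSCAN, the following is a minimal sequential sketch of the textbook algorithm. It is not the CUDA-DClust+ implementation described in the paper. The function names, the parameter names `eps` and `min_pts`, and the brute-force neighbor search (used here in place of an indexing structure) are assumptions made for illustration. The sketch shows the seed list that drives cluster expansion, which is the part the abstract describes as being managed primarily on the GPU.

```python
from collections import deque
from math import dist

NOISE = -1      # label for points that belong to no cluster
UNVISITED = 0   # label for points not yet processed


def region_query(points, i, eps):
    """Brute-force eps-neighborhood of point i (includes i itself).

    Assumption: a real implementation would use a spatial index here.
    """
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 1, 2, ... for clusters, -1 for noise."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] != UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        # Point i is a core point: start a new cluster and expand it.
        cluster_id += 1
        labels[i] = cluster_id
        seeds = deque(neighbors)  # seed list: points still to be examined
        while seeds:
            j = seeds.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point
    return labels
```

Each cluster grows by repeatedly popping a point from the seed list and, if that point is a core point, appending its neighbors. This dependence of each expansion step on the previous one is what makes the sequential formulation hard to parallelize directly.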

Original language: English (US)
Title of host publication: Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 354-363
Number of pages: 10
ISBN (Electronic): 9781665410168
State: Published - 2021
Externally published: Yes
Event: 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021 - Virtual, Bangalore, India
Duration: Dec 17 2021 to Dec 18 2021

Publication series

Name: Proceedings - 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics, HiPC 2021

Conference

Conference: 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021
Country/Territory: India
City: Virtual, Bangalore
Period: 12/17/21 to 12/18/21

Keywords

  • Clustering
  • DBSCAN
  • GPGPU
  • Graphics Processing Unit
  • Machine Learning
  • Outlier Detection

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Information Systems
