Hybrid CPU/GPU clustering in shared memory on the billion point scale

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

Many applications require clustering data using an unsupervised approach. One such clustering algorithm is DBSCAN, which is inherently sequential, thus limiting parallelization opportunities. Consequently, several recent works have proposed novel shared- and distributed-memory approaches for scaling DBSCAN. We propose BPS-HDbscan, a shared-memory CPU/GPU approach that clusters on the billion-point scale. The major pillars of BPS-HDbscan are as follows: (i) distance calculation avoidance in dense data regions; (ii) efficient merging of subclusters; (iii) obviating limited GPU memory capacity by both batching the result set and partitioning the input dataset; and (iv) computing data partitions in parallel, which effectively exploits both CPU and GPU resources. BPS-HDbscan is highly efficient and, to our knowledge, is the first shared-memory DBSCAN algorithm to cluster on the billion-point scale.

Original language: English (US)
Title of host publication: ICS 2019 - International Conference on Supercomputing
Publisher: Association for Computing Machinery
Pages: 35-45
Number of pages: 11
ISBN (Electronic): 9781450360791
DOIs
State: Published - Jun 26 2019
Event: 33rd ACM International Conference on Supercomputing, ICS 2019, held in conjunction with the Federated Computing Research Conference, FCRC 2019 - Phoenix, United States
Duration: Jun 26 2019 → …

Publication series

Name: Proceedings of the International Conference on Supercomputing

Conference

Conference: 33rd ACM International Conference on Supercomputing, ICS 2019, held in conjunction with the Federated Computing Research Conference, FCRC 2019
Country/Territory: United States
City: Phoenix
Period: 6/26/19 → …

Keywords

  • DBSCAN
  • GPU
  • Heterogeneous computing
  • Parallel clustering

ASJC Scopus subject areas

  • General Computer Science
