GDBOD: Density-Based Outlier Detection Exploiting Efficient Tree Traversals on the GPU

Revanth Reddy Munugala, Michael Gowanlock

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Outlier detection algorithms are employed across numerous application domains. In contrast to distance-based outlier detection algorithms that compute distances between points, hypercube-based algorithms reduce computational costs by evaluating the density of a point based on its enclosing hypercube. A major limitation of state-of-the-art hypercube-based algorithms is that they do not scale to large datasets. This paper proposes GPU Density-Based Outlier Detection (GDBOD), which is supported by efficient tree-based hypercube search methods. We propose two GPU-friendly n-ary tree data structures for efficient hypercube searches, which are optimized to obtain good locality and exploit the fine-grained parallelism afforded by the GPU. We also propose a data encoding method that compresses data to reduce the number of comparisons during distinct hypercube array construction, and we reorder the coordinates of the input dataset to enhance neighborhood search performance. Additionally, we design sequential and multi-core CPU algorithms that can be employed on systems not equipped with GPUs. Our sequential CPU algorithm achieves a mean speedup of 18.35× over the state-of-the-art, and our parallel GPU algorithm achieves a mean speedup of 3.29× over our multi-core CPU algorithm across 6 real-world datasets. With our proposed optimizations on the GPU, we achieve a peak compute throughput of 86.51%, along with 92.06% L1 cache hits and 92.94% L2 cache hits.
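The general idea of hypercube-based density estimation mentioned in the abstract can be illustrated with a minimal sequential sketch. This is not the GDBOD algorithm or its tree structures: the function name `hypercube_densities`, the `side` parameter, and the choice to count a point's own hypercube plus its immediately adjacent hypercubes are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import product
import math


def hypercube_densities(points, side):
    """For each point, count the points in its hypercube and adjacent hypercubes.

    Illustrative assumption: space is partitioned into axis-aligned hypercubes
    of edge length `side`, and a point's density is the number of points that
    fall in its own hypercube or in any directly neighboring hypercube.
    """
    if not points:
        return []
    # Map every point to the integer coordinates of its enclosing hypercube.
    cells = [tuple(math.floor(x / side) for x in p) for p in points]
    # Count points per distinct (non-empty) hypercube.
    counts = Counter(cells)
    dims = len(points[0])
    # Offsets to the hypercube itself and all of its immediate neighbors.
    offsets = list(product((-1, 0, 1), repeat=dims))
    densities = []
    for cell in cells:
        total = 0
        for off in offsets:
            neighbor = tuple(c + o for c, o in zip(cell, off))
            total += counts.get(neighbor, 0)
        densities.append(total)
    return densities


points = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (5.0, 5.0)]
densities = hypercube_densities(points, side=1.0)
```

In this sketch the isolated point at (5.0, 5.0) receives the lowest density, so it would be the first outlier candidate. No pairwise distances are computed; only the non-empty hypercubes are looked up, and it is this lookup step that the abstract says the proposed tree-based search methods accelerate.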

Original language: English (US)
Title of host publication: Proceedings - 2024 IEEE 31st International Conference on High Performance Computing, Data, and Analytics, HiPC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 111-121
Number of pages: 11
ISBN (Electronic): 9798331509095
DOIs
State: Published - 2024
Event: 31st Annual IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2024 - Bangalore, India
Duration: Dec 18 2024 → Dec 21 2024

Publication series

Name: Proceedings - 2024 IEEE 31st International Conference on High Performance Computing, Data, and Analytics, HiPC 2024

Conference

Conference: 31st Annual IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2024
Country/Territory: India
City: Bangalore
Period: 12/18/24 → 12/21/24

Keywords

  • Data Analytics
  • GPU
  • In-memory Databases
  • Outlier Detection

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Graphics and Computer-Aided Design
  • Computer Science Applications
  • Information Systems
  • Information Systems and Management
  • Modeling and Simulation
