Sorting large datasets with heterogeneous CPU/GPU architectures

Michael Gowanlock, Ben Karsin

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Scopus citations

Abstract

We examine heterogeneous sorting for input data that exceeds GPU global memory capacity. Applications that require significant communication between the host and GPU often need to obviate communication overheads to achieve performance gains over parallel CPU-only algorithms. We advance several optimizations to reduce the host-GPU communication bottleneck, and find that host-side bottlenecks also need to be mitigated to fully exploit heterogeneous architectures. We demonstrate this by comparing our work to end-to-end response time calculations from the literature. Our approaches mitigate several heterogeneous sorting bottlenecks, as demonstrated on single- and dual-GPU platforms. We achieve speedups up to 3.47x over the parallel reference implementation on the CPU. The current path to exascale requires heterogeneous architectures. As such, our work encourages future research in this direction for heterogeneous sorting in the multi-GPU NVLink era.

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 560-569
Number of pages: 10
ISBN (Print): 9781538655559
State: Published - Aug 3 2018
Externally published: Yes
Event: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 - Vancouver, Canada
Duration: May 21 2018 to May 25 2018

Publication series

Name: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018

Conference

Conference: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Country/Territory: Canada
City: Vancouver
Period: 5/21/18 to 5/25/18

Keywords

  • GPGPU
  • Heterogeneous architecture
  • Sorting

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems and Management
