TY - GEN
T1 - HEGJoin
T2 - 25th International Conference on Database Systems for Advanced Applications, DASFAA 2020
AU - Gallet, Benoit
AU - Gowanlock, Michael
N1 - Funding Information:
This material is based upon work supported by the National Science Foundation under Grant No. 1849559.
Publisher Copyright:
© 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
N2 - The distance similarity join operation joins two datasets (or tables), A and B, based on a search distance, ε, and returns the pairs of points (a, b), with a in A and b in B, such that the distance between a and b is at most ε. In the case where A = B, this operation is a similarity self-join (and therefore A is joined with itself). In contrast to the majority of the literature, which focuses on either the CPU or the GPU, we propose in this paper Heterogeneous CPU-GPU Epsilon Grids Join (HEGJoin), an efficient algorithm to process a distance similarity join using both the CPU and the GPU. We leverage two state-of-the-art algorithms: LBJoin for the GPU and Super-EGO for the CPU. We achieve good load balancing between architectures by assigning points with larger workloads to the GPU and those with lighter workloads to the CPU through the use of a shared work queue. We examine the performance of our heterogeneous algorithm against LBJoin, as well as Super-EGO, by comparing performance to the upper bound throughput. We observe that HEGJoin consistently achieves close to this upper bound.
AB - The distance similarity join operation joins two datasets (or tables), A and B, based on a search distance, ε, and returns the pairs of points (a, b), with a in A and b in B, such that the distance between a and b is at most ε. In the case where A = B, this operation is a similarity self-join (and therefore A is joined with itself). In contrast to the majority of the literature, which focuses on either the CPU or the GPU, we propose in this paper Heterogeneous CPU-GPU Epsilon Grids Join (HEGJoin), an efficient algorithm to process a distance similarity join using both the CPU and the GPU. We leverage two state-of-the-art algorithms: LBJoin for the GPU and Super-EGO for the CPU. We achieve good load balancing between architectures by assigning points with larger workloads to the GPU and those with lighter workloads to the CPU through the use of a shared work queue. We examine the performance of our heterogeneous algorithm against LBJoin, as well as Super-EGO, by comparing performance to the upper bound throughput. We observe that HEGJoin consistently achieves close to this upper bound.
KW - Heterogeneous CPU-GPU computing
KW - Range query
KW - Similarity join
KW - Super-EGO
UR - http://www.scopus.com/inward/record.url?scp=85092106622&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85092106622&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-59419-0_23
DO - 10.1007/978-3-030-59419-0_23
M3 - Conference contribution
AN - SCOPUS:85092106622
SN - 9783030594183
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 372
EP - 388
BT - Database Systems for Advanced Applications - 25th International Conference, DASFAA 2020, Proceedings
A2 - Nah, Yunmook
A2 - Cui, Bin
A2 - Lee, Sang-Won
A2 - Yu, Jeffrey Xu
A2 - Moon, Yang-Sae
A2 - Whang, Steven Euijong
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 24 September 2020 through 27 September 2020
ER -