The k-means clustering algorithm is widely employed for unsupervised learning. The algorithm takes as input a multidimensional dataset of points and number of clusters/centroids, k, where each point is assigned to one of the clusters. For exact k-means clustering, the algorithm must compute the same result as Lloyd's algorithm, which is well-known to be computationally expensive due to the large number of distance comparisons between each point and the k centroids. Several algorithms have been proposed for k-means clustering that avoid distance calculations but produce an exact result. However, these algorithms have all been designed for execution using the CPU, and no published works have examined using the GPU to accelerate k-means while simultaneously avoiding distance calculations. This paper examines the state-of-the-art Yinyang algorithm that avoids distance calculations as executed on the GPU. Since Lloyd's algorithm is well-suited to a GPU execution, it is not clear whether the Yinyang algorithm will obtain significant performance gains on GPU hardware. In this context, this paper: (i) proposes the first GPU-accelerated Yinyang algorithm in the literature; (ii) advances several optimizations to GPU kernels; (iii) contrasts and evaluates different degrees of distance calculation pruning; and, (iv) compares the performance of our GPU-accelerated Yinyang algorithm to four reference implementations. Our GPU algorithm achieves a speedup over the multi-core CPU Yinyang algorithm of up to 8× on real-world datasets.