The 17th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA) was held in Xiamen, Fujian Province, from December 16 to 18, 2019. The research team published the paper "Fast Schedule Computation on GPU for High Data Reuse and Device Utilization" (authors: Yuxiang Zhang and Yu Zhang) at the conference, where it was presented by Yuxiang Zhang.
This work proposes an algorithm that efficiently finds a promising schedule for exploiting the parallelism and locality of a computation on a GPU. In particular, an empirical model is designed to measure the quality of a candidate schedule; it jointly considers the locality, load balance, and parallelism sufficiency of the computation on the given GPU model. Empirical constraints are also introduced that significantly reduce the schedule search space, to a size polynomial in the computation dimension. Compared with the state-of-the-art tool Tensor Comprehensions, the algorithm finds a promising schedule 5-45x faster, and the resulting scheduled code runs 1.5-10x faster.
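The general idea of scoring candidate schedules under pruning constraints can be illustrated with a minimal Python sketch. It is not the paper's model or algorithm: the scoring formula, the power-of-two tile rule, the device constants, and the function names (candidate_tiles, score, best_schedule) are all assumptions made for illustration, and this simple enumeration does not reproduce the polynomial bound described above.

```python
from itertools import product
from math import ceil, prod

WARP_SIZE = 32      # threads per warp on NVIDIA GPUs
MAX_THREADS = 1024  # common per-block thread limit
NUM_SMS = 80        # hypothetical device: number of streaming multiprocessors


def candidate_tiles(extent):
    """Powers of two up to the loop extent (an illustrative pruning rule)."""
    tiles, t = [], 1
    while t <= extent:
        tiles.append(t)
        t *= 2
    return tiles


def score(extents, tiles):
    """Toy quality score, higher is better. An assumption, not the paper's model."""
    blocks = prod(ceil(e / t) for e, t in zip(extents, tiles))
    # Locality proxy: tile volume divided by tile surface.
    locality = 1.0 / sum(1.0 / t for t in tiles)
    # Load balance proxy: how full the last wave of blocks is.
    waves = ceil(blocks / NUM_SMS)
    balance = blocks / (waves * NUM_SMS)
    # Parallelism sufficiency proxy: at least one block per multiprocessor.
    parallelism = min(1.0, blocks / NUM_SMS)
    return locality * balance * parallelism


def best_schedule(extents):
    """Return the best-scoring tile shape, or None if no candidate is feasible."""
    feasible = (
        tiles
        for tiles in product(*(candidate_tiles(e) for e in extents))
        # Illustrative constraints: whole warps, within the block size limit.
        if prod(tiles) % WARP_SIZE == 0 and prod(tiles) <= MAX_THREADS
    )
    return max(feasible, key=lambda tiles: score(extents, tiles), default=None)


if __name__ == "__main__":
    print(best_schedule((1024, 1024)))
```

The constraints discard most tile shapes before any scoring happens, which is the point of the sketch: a cheap analytical score over a small feasible set avoids compiling and timing each candidate on the device.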