High-performance Graph Algorithms on GPUs
Speaker: Sreepathi Pai, University of Rochester
Abstract:
Massive parallelism and large memory bandwidths make GPUs very attractive for implementing graph algorithms that process large graphs. However, implementing high-performance graph algorithms for GPUs usually required manual coding in CUDA, a difficult process. To significantly lower the level of effort required, we created IrGL, a language and compiler specifically for generating high-performance graph algorithm implementations for GPUs. Powered by three throughput optimizations, the IrGL-generated code outperformed nearly all handwritten graph algorithms, achieving speedups of up to 6x.
By freeing us from the drudgery of writing low-level code, IrGL has allowed us to look at a number of problems revolving around graphs. We've used the high-performance implementations to identify key memory system bottlenecks that limit performance on current GPU architectures. We've also translated graph database queries to IrGL and executed them on GPUs. In the course of extending this to the general problem of subgraph isomorphism (a key primitive in graph databases), we were named GraphChallenge 2017 champions for our implementation of the triangle-counting and k-truss problems. Along the way, we also built Groute, a runtime for asynchronous multi-GPU graph analytics that has achieved order-of-magnitude improvements over existing synchronous implementations.
Many interesting questions remain unexplored, however, and I will summarize our current efforts in this area.
[Joint work with Keshav Pingali, Tal Ben-Nun, Michael Sutton, M. Amber Hassaan, Chad Voegele, Yi-Shan Lu, Ahmet Celik, Milos Gligoric, Sarfraz Khurshid, Tyler Sorensen and Alastair Donaldson]
Bio:
Sreepathi Pai is an Assistant Professor of Computer Science at the University of Rochester. His research interests are in compilers, programming languages and implementation, performance models and computer architecture. His most recent research has revolved around the IrGL compiler that produces high-performance GPU code for graph algorithms.
He earned his PhD at the Indian Institute of Science and his B.E. in Computer Engineering at the University of Mumbai. Prior to joining the Department of Computer Science at Rochester, he was a Postdoctoral Fellow at The University of Texas at Austin.