A framework for accelerating bottlenecks in GPU execution with assist warps
N. Vijaykumar¹; G. Pekhimenko¹; A. Jog²; S. Ghose¹; A. Bhowmick¹; R. Ausavarungnirun¹; C. Das²; M. Kandemir²; T.C. Mowry¹; O. Mutlu¹
¹ Carnegie Mellon University, Pittsburgh, PA, United States
² Pennsylvania State University, State College, PA, United States
Abstract
Modern graphics processing units (GPUs) are well provisioned to support the concurrent execution of thousands of threads. Unfortunately, different bottlenecks during execution and heterogeneous application requirements create imbalances in utilization of resources in the cores. For example, when a GPU is bottlenecked by the available off-chip memory bandwidth, its computational resources ...
Published in Advances in GPU Research and Practice.