Friday Nov. 13, 2015
Location: HH D210
The power wall has spurred a flurry of interest in heterogeneous systems with hardware accelerators. The open questions are which accelerators to design, how to design them, and what software support they require. Our accelerator design approach stems from the observation that many efficient and portable software implementations rely on high-performance software libraries with well-established application programming interfaces (APIs). We propose integrating hardware accelerators on 3D-stacked memory that explicitly target the memory-bound operations within high-performance libraries. The fixed APIs, with their limited configurability, simplify the design of the accelerators while ensuring that the accelerators have wide applicability. Because our software support automatically converts library API calls to accelerator invocations, library-based legacy code gains the benefit of memory-side accelerators without requiring reimplementation. Legacy code using our proposed hardware library improves the energy efficiency of individual operations in Intel's Math Kernel Library (MKL) by 75x. We also demonstrate that the proposed software/hardware library-based approach improves the energy efficiency of a real-world signal processing application, Space Time Adaptive Processing (STAP), by 10x.
Qi Guo is a postdoctoral researcher in the Electrical & Computer Engineering Department at Carnegie Mellon University, advised by Professor Franz Franchetti. His research interests lie in energy-efficient computer architecture and in performance modeling and evaluation.