Toward Accelerating Deep Learning at Scale Using Specialized Hardware in the Datacenter

Tuesday Oct. 20, 2015
Location: CIC Panther Hollow Room
Time: 4:30PM


Eric Chung
Microsoft Research

Abstract

Recent breakthroughs in the development of multi-layer convolutional neural networks have led to state-of-the-art improvements in the accuracy of non-trivial recognition tasks such as large-category image classification and automatic speech recognition. These many-layered neural networks are large and complex, and require substantial computing resources to train and evaluate. Unfortunately, these demands come at an inopportune moment due to the recent slowing of gains in commodity processor performance. Hardware specialization in the form of GPGPUs, FPGAs, and ASICs offers a promising path toward major leaps in processing capability while achieving high energy efficiency. At Microsoft, an effort is underway to accelerate Deep Convolutional Neural Networks (CNNs) using servers in the datacenter augmented with FPGAs. Initial efforts to implement a single-node CNN accelerator on a mid-range FPGA show significant promise, delivering respectable performance relative to prior FPGA designs, multithreaded CPU implementations, and high-end GPGPUs, at a fraction of the power. In the future, combining multiple FPGAs over a low-latency communication fabric offers a further opportunity to train and evaluate models of unprecedented size and quality.

Bio

Eric Chung is a Researcher at Microsoft. He is broadly interested in computer architecture, reconfigurable computing, high-level hardware design, datacenters, and machine learning at scale. Since 2012, Eric has been a core member of and contributor to the Catapult project at Microsoft, which uses a fabric of FPGAs to accelerate cloud services at scale in the datacenter. Eric received his PhD in electrical and computer engineering from Carnegie Mellon University in 2011 and a BS in EECS from UC Berkeley in 2004.
