The Fourth Workshop on the Intersections of Computer Architecture and Reconfigurable Logic (CARL 2015)

Portland, Oregon - Sunday, June 14, 2015

Oregon Convention Center, Room D-132

Co-located with ISCA 2015

http://www.ece.cmu.edu/calcm/carl

====Accelerating Deep Convolutional Neural Networks Using Specialized Hardware in the Datacenter====
({{carl15_chung.pdf |slides}})

Recent breakthroughs in the development of multi-layer convolutional neural networks have led to state-of-the-art improvements in the accuracy of non-trivial recognition tasks such as large-category image classification and automatic speech recognition. These many-layered neural networks are large and complex and require substantial computing resources to train and evaluate. Unfortunately, these demands come at an inopportune moment, as gains in commodity processor performance have recently slowed.

Hardware specialization in the form of GPGPUs, FPGAs, and ASICs offers a promising path toward major leaps in processing capability while achieving high energy efficiency. At Microsoft, an effort is underway to accelerate deep convolutional neural networks (CNNs) using datacenter servers augmented with FPGAs. Initial efforts to implement a single-node CNN accelerator on a mid-range FPGA show significant promise, delivering respectable performance relative to prior FPGA designs, multithreaded CPU implementations, and high-end GPGPUs, at a fraction of the power. In the future, combining multiple FPGAs over a low-latency communication fabric offers further opportunity to train and evaluate models of unprecedented size and quality.
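As illustrative context (not part of the talk or Microsoft's design), the core of CNN evaluation is a deep loop nest of multiply-accumulates per convolutional layer; a minimal NumPy sketch with hypothetical layer dimensions hints at why evaluation cost grows quickly with network size:

```python
import numpy as np

def conv_layer(inputs, weights):
    """Naive 'valid' convolutional layer, for illustration only.

    inputs:  (C_in, H, W)        input feature maps
    weights: (C_out, C_in, K, K) filter bank
    returns: (C_out, H-K+1, W-K+1) output feature maps
    """
    c_in, h, w = inputs.shape
    c_out, _, k, _ = weights.shape
    out = np.zeros((c_out, h - k + 1, w - k + 1))
    for o in range(c_out):                    # each output feature map
        for y in range(h - k + 1):
            for x in range(w - k + 1):
                # K*K*C_in multiply-accumulates per output pixel
                patch = inputs[:, y:y + k, x:x + k]
                out[o, y, x] = np.sum(patch * weights[o])
    return out

# Hypothetical sizes: 3 input maps, 8 output maps, 5x5 kernels
fmaps = np.random.rand(3, 32, 32)
filt = np.random.rand(8, 3, 5, 5)
result = conv_layer(fmaps, filt)
print(result.shape)  # (8, 28, 28)
```

Even this small layer performs 8 x 28 x 28 x (5 x 5 x 3) multiply-accumulates; it is this regular, highly parallel arithmetic that specialized hardware such as FPGAs can exploit.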
==== Speaker Bio ====