Tuesday, May 5, 2009
Location: HH D-210
Carnegie Mellon University, Electrical & Computer Engineering
The growing speed disparity between the CPU and off-chip memory makes memory latency an inevitable bottleneck in computer performance. The problem is more severe for highly data-intensive applications, which are becoming prevalent. Although access/execution decoupled architectures from prior research allow the CPU to tolerate long access latencies, their implementation and development have stalled over the past years, mainly because of a lack of software compatibility. Memory latency is now a more critical problem than ever, however, and access decoupling should be reconsidered as one possible remedy.
This talk introduces a decoupled parameterizable architecture (DPA), an algorithm-specific architecture that supports data-intensive, memory-bound algorithms. DPA lets data transportation and computation be programmed separately. Software-controlled on-chip and off-chip data management prefetches data for the vector units, achieves higher data bandwidth, and serves multiple compute cores. The talk also demonstrates a programming flow from an algorithm-level description to machine code. Several data management approaches in DPA are proposed and evaluated on the BEE2 FPGA board and in RTL simulation.
Qian Yu is a post-doctoral researcher in the ECE department of Carnegie Mellon University (CMU). She earned her Ph.D. in Electronics Engineering from the Chinese Academy of Sciences in 2006. She was a post-doctoral researcher at the University of Illinois at Urbana-Champaign in 2007 and joined CMU in 2008. Her research interests include high-performance digital signal processing and decoupled architectures for data-intensive applications.