MMM for a DSP
Tuesday September 6, 2005
Hamerschlag Hall D-210
Carnegie Mellon University
Digital signal processors (DSPs) have microarchitectures that differ
markedly from those of general-purpose CPUs. Yet many computer architects
know little about DSP architectures, even though DSPs are more pervasive.
I will discuss one of the most popular DSP architectures, the Texas
Instruments C6000 series, in the context of optimizing
matrix-matrix multiply (MMM) performance. Conventional MMM optimizations
for general-purpose CPUs are not always beneficial on DSPs, and
DSP-specific features must be leveraged to achieve peak performance.
I will show how a software-managed memory hierarchy and a direct
memory access (DMA) engine can enable significant MMM speedups.
Roland Wunderlich is a PhD candidate in Electrical and Computer
Engineering at Carnegie Mellon University. Roland works on computer
architecture and automated optimization of DSP software with the
Spiral project. He is advised by Professor James Hoe. Roland received
his BS in Computer Engineering from Rutgers University and his
MS from Carnegie Mellon.