Automatic Library Generation for Signal Transforms
Tuesday November 6, 2007
Hamerschlag Hall D-210
4:30 pm
Yevgen Voronenko
Carnegie Mellon University
In the fast-changing world of computing platforms, the quest for the best
performance begins with vendor performance libraries, which implement
common compute-intensive tasks. Implementing such libraries is usually
expensive and time-consuming, since many low-level aspects of the hardware
must be taken into account. We have previously demonstrated that parts of
the library implementation and performance tuning process in the domain of
linear signal transforms (such as the discrete Fourier transform, FIR
filters, and others) can be automated using the code generator Spiral. The
Spiral system automatically generates high-performance vectorized,
parallelized, and cache-aware implementations of many signal transforms
from simple high-level declarative algorithm descriptions. The main
limitation of Spiral is that it can generate code only for fixed-size
transforms.
In many applications the transform size is fixed, and the ability to
generate code for a single specific size is very convenient. For example,
JPEG compression requires an 8x8 2D DCT.
However, a high-performance *reusable* library typically needs an
implementation that can compute any size. We present a formal framework
that enables Spiral to generate such "general size" code. Our method is
based on domain-specific representations of transform problem
specifications and an iterative method to compute the so-called "recursion
step closure", the minimal set of mutually recursive functions sufficient
to compute the given problem. Our method is compatible with the automatic
vectorization and parallelization in Spiral, and enables complete
automation of library implementation for signal transforms.
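The idea of iterating until a set of mutually recursive functions is
closed can be illustrated with a small sketch. This is only a toy model
under stated assumptions, not Spiral's actual representation or algorithm:
the step names and the TOY_CALLS table are hypothetical, and the sketch
simply computes which functions are reachable from a starting problem by
repeatedly expanding the functions each one calls until nothing new
appears.

```python
from collections import deque

# Hypothetical toy table (an assumption for illustration only): each
# function variant maps to the variants its implementation would call.
TOY_CALLS = {
    "dft": ["dft_strided_read", "dft_scaled_strided_write"],
    "dft_strided_read": ["dft_strided_read", "dft_scaled_strided"],
    "dft_scaled_strided_write": ["dft_strided_read", "dft_scaled_strided"],
    "dft_scaled_strided": ["dft_strided_read", "dft_scaled_strided"],
}


def recursion_step_closure(root, calls):
    """Return every function needed to implement `root`, in discovery order.

    Iterates with a worklist until no unseen function is produced, that is,
    until the set is closed under the "calls" relation.
    """
    closure = [root]
    seen = {root}
    worklist = deque([root])
    while worklist:
        step = worklist.popleft()
        for child in calls.get(step, []):
            if child not in seen:
                seen.add(child)
                closure.append(child)
                worklist.append(child)
    return closure


closure = recursion_step_closure("dft", TOY_CALLS)
print(closure)
```

In this toy table the iteration terminates because the variants start
calling each other again after a few expansions, so a finite set of
functions suffices for any input size.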
While earlier work on Spiral concentrated on lower-level performance
optimizations, we contribute the high-level analyses that make the next
generation of Spiral a vertically integrated library implementation and
performance tuning environment.
Yevgen Voronenko is a Ph.D. candidate in ECE. He holds a B.S. degree in
computer science from Drexel University. His interests include code
generation, compiler optimizations, and software architecture.