
Girish Venkataramani
Carnegie Mellon University

Stalls, due to mismatches in communication rates, are a major performance obstacle in pipelined circuits. If the rate of data production is faster than the rate of consumption, the resulting design performs slower than when the communication rate is matched. This can be remedied by inserting pipeline buffers (to temporarily hold data), allowing the producer to proceed if the consumer is not ready to accept data. The problem of deciding which channels need these buffers (and how many) for an arbitrary communication profile is called the slack matching problem; the optimal solution to this problem has been shown to be NPcomplete.
In this talk, I will present a heuristic that uses knowledge of the communication protocol to explicitly model these bottlenecks, and an iterative algorithm to progressively remove these bottlenecks by inserting buffers. I will show how this approach naturally handles large designs with arbitrarily cyclic and acyclic topologies, and which exhibit various types of control choice. The heuristic is efficient, achieving linear time complexity in practice, and produces solutions that (a) achieve up to 60% performance speedup on large media processing kernels, and (b) can either be verified to be optimal, or the approximation margin can be bounded. I will be presenting this paper at ICCAD 2006.
As a PhD candidate, Girish has been spending the last few years deconstructing the nature of asynchronous circuits, in order to construct highly optimized ones, within the automated CASH synthesis system. He is interested in hardware compilation, timing analysis of asynchronous circuits, and system architecture of specialized reconfigurable fabrics. He is part of the Phoenix team supervised by his advisor, Seth Goldstein.
