the Store Queue
Tuesday April 3, 2007
Hamerschlag Hall D-210
Carnegie Mellon University
Modern processors use out-of-order execution to improve performance.
However, supporting structures limit the size of the out-of-order
instruction window. The store queue, responsible for forwarding values
from stores to later loads, is one such structure. Current store queue
implementations use content-addressable memory (CAM) that does not scale
to the size required for large instruction windows. Recent research has
proposed mechanisms to improve the scalability of store queues, based on
the observation that store-to-load forwarding typically recurs between
the same pairs of store and load instructions.
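As a rough illustration of the conventional forwarding described above, the sketch below models the associative search in software: a load takes its value from the youngest older store to the same address. The names StoreEntry, seq, and forward_from_store_queue are hypothetical, and the sketch assumes whole-word accesses with fully resolved addresses. It is not drawn from either MICRO 2006 proposal.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreEntry:
    # Hypothetical fields for illustration only.
    seq: int    # program-order sequence number; smaller means older
    addr: int   # resolved store address (whole-word access assumed)
    value: int  # data waiting to be written to memory


def forward_from_store_queue(
    store_queue: List[StoreEntry], load_seq: int, load_addr: int
) -> Optional[int]:
    """Return the value of the youngest store that is older than the load
    and writes the same address, or None if the load must read the cache."""
    best: Optional[StoreEntry] = None
    # A hardware CAM compares every entry in parallel; this loop stands in
    # for that search, which is the part that scales poorly with queue size.
    for entry in store_queue:
        if entry.seq < load_seq and entry.addr == load_addr:
            if best is None or entry.seq > best.seq:
                best = entry
    return best.value if best is not None else None


# Two older stores to 0x100 and one store to 0x200.
sq = [StoreEntry(1, 0x100, 7), StoreEntry(3, 0x100, 9), StoreEntry(5, 0x200, 4)]
```

Under these assumptions, a load with sequence number 4 to address 0x100 receives the value from the store with sequence number 3, while a load to an address with no matching older store gets None and would read from the cache instead.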
In this talk, I will present two concurrent proposals from MICRO 2006 that
advocate completely eliminating the store queue. Fire-and-Forget, from the
Georgia Institute of Technology, predicts which stores will forward a
value to a load within the instruction window, and writes the value
directly into the correct load queue entry. NoSQ, from the University of
Pennsylvania, predicts which loads will consume a forwarded value, and
performs forwarding through the register file. Both techniques
successfully solve the store queue scalability problem without degrading
performance relative to a conventional fully associative store queue.
Stephen Somogyi is a Ph.D. candidate in Electrical and Computer
Engineering at Carnegie Mellon University, working with Prof. Babak
Falsafi. His research interests focus on memory streaming techniques to
improve the performance of future computer systems.