
Eliminating the Store Queue

Tuesday April 3, 2007
Hamerschlag Hall D-210
4:30 pm

Stephen Somogyi
Carnegie Mellon University

Modern processors use out-of-order execution to improve performance. However, supporting structures limit the size of the out-of-order instruction window. The store queue, responsible for forwarding values from stores to later loads, is one such structure. Current store queue implementations use content-addressable memory (CAM) that does not scale to the size required for large instruction windows. Recent research has proposed mechanisms to improve the scalability of store queues, based on the observation that store-to-load forwarding typically occurs between repetitive pairs of instructions.
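For readers unfamiliar with the structure, the associative search performed by a conventional store queue can be sketched as a toy software model. This is an illustration only, not taken from the talk or from either paper: the names `StoreEntry`, `StoreQueue`, `insert`, and `forward` are hypothetical, sequence numbers stand in for program order, and access sizes and partial overlaps are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoreEntry:
    seq: int    # program-order sequence number of the store (assumed model)
    addr: int   # memory address the store writes
    value: int  # value waiting to be written to memory


class StoreQueue:
    """Toy model of a fully associative store queue (hypothetical names)."""

    def __init__(self) -> None:
        self.entries: List[StoreEntry] = []

    def insert(self, seq: int, addr: int, value: int) -> None:
        # Record an in-flight store that has not yet been written to memory.
        self.entries.append(StoreEntry(seq, addr, value))

    def forward(self, load_seq: int, addr: int) -> Optional[int]:
        # Hardware compares the load address against every entry in parallel
        # (the CAM lookup). This loop models that search and selects the
        # youngest matching store that is older than the load.
        match: Optional[StoreEntry] = None
        for entry in self.entries:
            if entry.seq < load_seq and entry.addr == addr:
                if match is None or entry.seq > match.seq:
                    match = entry
        # None means no forwarding: the load reads from the cache instead.
        return match.value if match is not None else None
```

Every load must be checked against every in-flight store, so the cost of the search grows with the number of entries. That is the scaling concern described above.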

In this talk, I will present two concurrent proposals from MICRO 2006 that advocate completely eliminating the store queue. Fire-and-Forget, from the Georgia Institute of Technology, predicts which stores will forward a value to a load within the instruction window, and writes the value directly into the correct load queue entry. NoSQ, from the University of Pennsylvania, predicts which loads will consume a forwarded value, and performs forwarding through the register file. Both techniques successfully solve the store queue scalability problem, without degrading performance relative to a conventional fully-associative store queue.

Stephen Somogyi is a Ph.D. candidate in Electrical and Computer Engineering at Carnegie Mellon University, working with Prof. Babak Falsafi. His research interests focus on memory streaming techniques to improve the performance of future computer systems.

