Streaming: Cooperative Mechanisms to Eliminate Coherent Read Misses
in a DSM
Tuesday April 1, 2003
Hamerschlag Hall D-210
Carnegie Mellon University
Modern shared-memory multiprocessor systems employ many techniques to
improve the performance of memory coherence activities. Much as branch
predictors in modern processors rely on repetitive branch outcomes in
programs to speculatively execute instructions past branches, coherence
predictors rely on repetitive sharing patterns in applications to predict
coherence activity. Using accurate and timely predictors, a DSM can speculatively
trigger protocol operations in advance to hide coherence overhead.
Store-Ordered Streaming (SORDS) relies on the key observation that cache blocks
are often consumed in the same order they are produced in distributed
shared-memory systems. SORDS comprises a set of four cooperative mechanisms
that target repetitive and predictable shared misses in such a DSM: Downgrade
Predictors, Memory Read Predictors, Forward Streams, and Forward Queues.
Together, these accurately predict "when" a cache block has been
produced and "who" will subsequently consume it, and provide a method
for efficiently streaming blocks in the correct order from producer
to consumer. In this presentation, I will review Babak Falsafi and Chris
Gniady's paper of the same name.
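The key observation above (blocks tend to be consumed in the order they were produced) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the SORDS hardware mechanisms from the paper: the function name `count_read_misses`, the `stream_depth` parameter, and the assumption that each block is written exactly once are all hypothetical and chosen for illustration.

```python
def count_read_misses(store_order, read_order, stream_depth):
    """Count consumer read misses in a toy producer/consumer model.

    store_order:  block addresses in the order the producer wrote them
                  (assumed: each block is written exactly once).
    read_order:   block addresses in the order the consumer reads them.
    stream_depth: number of blocks that follow a missed block in store
                  order and are forwarded to the consumer along with it;
                  0 disables streaming.
    """
    # Index of each block in the producer's store order.
    position = {block: i for i, block in enumerate(store_order)}
    consumer_cache = set()
    misses = 0

    for block in read_order:
        if block in consumer_cache:
            continue  # block was already fetched or streamed: no miss
        misses += 1
        consumer_cache.add(block)

        start = position.get(block)
        if start is None:
            continue  # block was never produced; nothing to stream
        # Forward the blocks that were stored right after the missed one.
        for nxt in store_order[start + 1 : start + 1 + stream_depth]:
            consumer_cache.add(nxt)

    return misses


if __name__ == "__main__":
    stores = [0x40, 0x10, 0x80, 0x20, 0x60, 0x30]
    print(count_read_misses(stores, stores, 0))        # no streaming
    print(count_read_misses(stores, stores, 2))        # same order, depth 2
    print(count_read_misses(stores, stores[::-1], 2))  # reversed read order
```

In this toy model, streaming removes misses only when the read order follows the store order. If the consumer reads the blocks in reverse, every forwarded block has already been read and nothing is gained.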
Stephen Somogyi is a first-year graduate student in Electrical & Computer
Engineering at Carnegie Mellon University, advised by Babak Falsafi.
His research interests revolve around computer architecture, covering
both uni- and multiprocessor systems, and techniques for overcoming the
limitations imposed by traditional memory hierarchies.