Goals

PUMA2 Talk @ UW, Seattle on 6/6/2002

Current contributions:

Relaxing Memory Order Using Transactional Execution
This work proposes wait-free implementations of memory consistency models in
which efficient hardware for checkpointing and recovery allows the memory order
to be relaxed dynamically. This obviates waiting for store acknowledgements or
memory fence/barrier instructions in the common case where processors do not
race to enter a critical section. A wait-free sequentially-consistent system
can improve performance over a conventional release-consistent system.

Dead-Block Predictors & Dead-Block Correlating Prefetchers
This work
proposes instruction-trace-based predictors that track repetitive instruction
sequences from a cache fill to an eviction to: (1) predict the eviction early
and replace the current block, and (2) subsequently fetch a to-be-referenced
block. These predictors replace and fetch data well in advance of a processor
reference, by orders of magnitude in latency, to hide virtually all of the
memory access latency.

Self-Invalidation Using Last-Touch Prediction
This work
proposes instruction-trace-based predictors that track repetitive instruction
sequences in a shared-memory multiprocessor to predict and self-invalidate a
shared cache block early. Early self-invalidation improves performance by
turning a three-hop coherence protocol transaction between a producer and a
consumer into a two-hop transaction.

Memory Sharing Predictors
This work
proposes sharing-signature-based predictors that track repetitive memory
sharing patterns in a shared-memory multiprocessor.
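
As a rough illustration of how a predictor might learn a repetitive sharing pattern, the sketch below keeps a short per-block history of coherence events and remembers which event followed each history last time. This is a toy software model under stated assumptions, not the hardware design from the work above: the text does not define a sharing signature, so the sketch assumes it is the last few (operation, processor) events seen on a block. The names `SharingPredictor`, `history_depth`, and `count_correct` are hypothetical.

```python
from collections import defaultdict, deque


class SharingPredictor:
    """Toy two-level predictor (illustrative assumption, not the published design).

    Level 1: per-block history of the most recent coherence events.
    Level 2: table mapping (block, history signature) -> event that followed it.
    """

    def __init__(self, history_depth=2):
        self.history_depth = history_depth
        # block address -> bounded queue of recent (op, cpu) events
        self.history = defaultdict(lambda: deque(maxlen=history_depth))
        # (block address, signature) -> event observed after that signature
        self.pattern = {}

    def predict(self, block):
        """Return the event expected next on this block, or None if unknown."""
        signature = tuple(self.history[block])
        return self.pattern.get((block, signature))

    def observe(self, block, event):
        """Train on the actual event, then shift it into the block's history."""
        hist = self.history[block]
        self.pattern[(block, tuple(hist))] = event
        hist.append(event)


def count_correct(trace, predictor):
    """Replay a trace of (block, event) pairs and count correct predictions."""
    correct = 0
    for block, event in trace:
        if predictor.predict(block) == event:
            correct += 1
        predictor.observe(block, event)
    return correct


# Hypothetical producer-consumer pattern on one block:
# processor 0 writes, then processors 1 and 2 read, repeated four times.
one_round = [(0x40, ("write", 0)), (0x40, ("read", 1)), (0x40, ("read", 2))]
trace = one_round * 4

predictor = SharingPredictor(history_depth=2)
hits = count_correct(trace, predictor)
```

In this toy trace the predictor misses during warm-up and then predicts every event once each history signature has been seen, which is the kind of repetition such predictors rely on. A deeper `history_depth` would distinguish longer patterns at the cost of a longer warm-up and a larger table.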