18-548/15-548 Fall 1998

Homework 6:
Cache Memory Size/Speed Tradeoff

Due Wednesday October 14, 1998

Multilevel Caches

Problem 1:

You have a computer with two levels of cache memory and the following specifications:

CMDLINE: dinero -b32 -i8K -d8K -a1 -ww -An -W8 -B8
CACHE (bytes): blocksize=32, sub-blocksize=0, wordsize=8, Usize=0, Dsize=8192, Isize=8192, bus-width=8.
POLICIES: assoc=1-way, replacement=l, fetch=d(1,0), write=w, allocate=n.
CTRL: debug=0, output=0, skipcount=0, maxcount=10000000, Q=0.

Metrics               Access Type:
(totals,fraction)     Total    Instrn   Data    Read   Write    Misc
-----------------     ------   ------  ------  ------  ------  ------
Demand Fetches        10000000 7362210 2637790 1870945 766845       0
                      1.0000   0.7362  0.2638  0.1871  0.0767  0.0000
Demand Misses          52206     8466   43740   36764    6976       0
                      0.0052   0.0011  0.0166  0.0196  0.0091  0.0000

Words From Memory      180920
( / Demand Fetches)    0.0181
Words Copied-Back      766845
( / Demand Writes)     1.0000
Total Traffic (words)  947765
( / Demand Fetches)    0.0948
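
As a reading aid, the ratio rows in the report above can be re-derived from the raw counts. The following is a minimal sketch, not part of the original assignment. It assumes that each fraction under "Demand Misses" is the miss count for that access type divided by the fetch count for the same type. The variable names are illustrative only.

```python
# Counts copied from the dinero report above.
fetches = {
    "total": 10_000_000,
    "instrn": 7_362_210,
    "data": 2_637_790,
    "read": 1_870_945,
    "write": 766_845,
}
misses = {
    "total": 52_206,
    "instrn": 8_466,
    "data": 43_740,
    "read": 36_764,
    "write": 6_976,
}

# Assumption: per-type miss ratio = misses of that type / fetches of that type.
miss_ratio = {kind: misses[kind] / fetches[kind] for kind in fetches}

# Traffic figures, in words, as reported.
words_from_memory = 180_920
words_copied_back = 766_845

total_traffic = words_from_memory + words_copied_back
traffic_per_fetch = total_traffic / fetches["total"]
copied_back_per_write = words_copied_back / fetches["write"]

for kind, ratio in miss_ratio.items():
    print(f"{kind:7s} miss ratio = {ratio:.4f}")
print(f"total traffic = {total_traffic} words")
print(f"traffic per demand fetch = {traffic_per_fetch:.4f}")
print(f"words copied back per demand write = {copied_back_per_write:.4f}")
```

The copied-back ratio of 1.0000 matches the write-through policy selected by -ww: every demand write sends one word to the next level.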

1) What is the available (as opposed to used) sustained bandwidth:

2) How long does an average instruction take to execute (in ns), assuming 1 clock cycle per instruction in the absence of memory hierarchy stalls, no write buffering at the L1 cache level, and 0% L2 miss rate?

3) A design study is performed to examine replacing the L2 cache with a victim cache. Compute a measure of speed for each alternative and indicate which is the faster solution. Assume the performance statistics are:

System Level Effects

Problem 2:

1) A Ph.D. student has snuck onto the course machines to run a long simulation. That task is suspended while a '548 student runs a cache-wiping homework problem, causing all data from the simulation to be expelled from cache. What is the approximate time penalty, in clocks, associated with refilling the caches when the simulation resumes execution? Restated: assuming that the simulation runs to completion after it is restarted, how much longer (in clocks charged to that particular task) will it take to run than if it had not been interrupted?

                    L1 Cache                  L2 Cache      L3 Cache
-----------------   -----------------------   -----------   -------------
Organization        split                     unified       unified
Size                8 KB data + 8 KB instr.   96 KB         8 MB
Associativity       direct mapped             3-way set     direct mapped
Blocks per sector   2                         2             2
Words per block     4                         4             4
Write policy        write through             write back    write back
Write allocation    no                        yes           yes
Hit time            1 clock                   4 clocks      12 clocks
Total miss time     4 clocks                  12 clocks     90 clocks
Local miss ratio    0.13 (same for D & I)     0.04          0.02
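
The "Local miss ratio" row can be turned into global miss ratios by multiplying down the hierarchy. The following is a minimal sketch, not a solution to the problem and not part of the original assignment. It assumes the usual definition: a local miss ratio is the number of misses at a level divided by the number of accesses that actually reach that level. The names are illustrative.

```python
# Local miss ratios copied from the table above.
local_miss = {"L1": 0.13, "L2": 0.04, "L3": 0.02}

# Assumption: an access reaches a level only if it missed at every level above it.
# The global miss ratio of a level is then the fraction of ALL accesses
# that miss at that level and at every level above it.
global_miss = {}
reaching = 1.0  # fraction of all accesses that reach the current level
for level in ("L1", "L2", "L3"):
    global_miss[level] = reaching * local_miss[level]
    reaching = global_miss[level]  # only these misses continue downward

for level, ratio in global_miss.items():
    print(f"{level}: local = {local_miss[level]:.2f}, global = {ratio:.6f}")
```

Under this reading, 13% of accesses go past L1, about 0.52% go past L2, and about 0.0104% go all the way to main memory.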
