Differences

This shows you the differences between two versions of the page.

--- buzzwords [2010/11/11 05:13] lsubrama
+++ buzzwords [2010/12/01 23:06] (current) lsubrama
@@ Line 311: / Line 311: @@
 ===== Lecture 19 ====
-Main memory system
+  Main memory system
   * Memory hierarchy
@@ Line 330: / Line 329: @@
     - Memory controller placement
-==== Lecture 20 =====
+===== Lecture 20 =====
   * DRAM controller functions
@@ Line 355: / Line 354: @@
 ===== Lecture 21 ====
-Super scalar processing (I)
+  Superscalar processing I
   * Types of parallelism
@@ Line 374: / Line 373: @@
 ===== Lecture 22 ====
-Super scalar processing (II)
+  Superscalar processing II
   * Trace Caches
@@ Line 400: / Line 399: @@
       - Micro op sequencer
     - Instruction buffering fetch and decode
+===== Lecture 23 =====
+  Superscalar Processing III
+  * Renaming multiple instructions
+    - dependency check logic (n^2 comparators)
+    - help from compiler
+      * ensure instructions are independent (difficult for wide fetches)
+      * hardware-software co-design to simplify dependency logic
+  * Dispatching multiple instructions
+    - wakeup logic (compare all tags in reservation station with all the tags that are broadcast)
+    - select logic (hierarchical tree based selection)
+  * Execute
+    - enough execution units
+    - enough forwarding paths (broadcast tag/value to all functional units)
+  * Reducing dispatch+bypass delays
+    - clustering (divide window into multiple clusters)
+    - intra-cluster bypass is fast
+    - inter-cluster bypass can be slow
+  * Register file
+    - need multiple reads/writes per cycle
+    - Replicate or partition the register files
+    - using block-structured ISA
+  * Retirement
+    - updating architectural register map
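The Lecture 23 item on renaming multiple instructions (dependency check logic with n^2 comparators) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the lecture's own design: the `(dest, src1, src2)` instruction tuple, the `rename_group` function, the dictionary used as a register alias table, and the free list are all hypothetical names chosen for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction format: (dest, src1, src2) architectural register numbers.
Instr = Tuple[int, int, int]


def rename_group(group: List[Instr], rat: Dict[int, int], free_list: List[int]):
    """Rename one fetch group as if in a single cycle.

    rat       -- register alias table: architectural reg -> physical reg (updated in place)
    free_list -- free physical registers, allocated from the front
    Returns (renamed instructions, number of register comparisons performed).
    """
    snapshot = dict(rat)   # every instruction reads the table as of the start of the cycle
    new_dests: List[int] = []
    renamed = []
    comparisons = 0

    for i, (_dest, src1, src2) in enumerate(group):
        phys_srcs = []
        for src in (src1, src2):
            phys = snapshot[src]
            # Dependency check: compare this source against the destination of
            # every older instruction in the group; the youngest match wins.
            for j in range(i):
                comparisons += 1
                if group[j][0] == src:
                    phys = new_dests[j]
            phys_srcs.append(phys)
        pdest = free_list.pop(0)
        new_dests.append(pdest)
        renamed.append((pdest, phys_srcs[0], phys_srcs[1]))

    # Commit new mappings in program order so the youngest writer of a register wins.
    for (dest, _, _), pdest in zip(group, new_dests):
        rat[dest] = pdest
    return renamed, comparisons


group = [(1, 2, 3), (4, 1, 2), (1, 4, 4), (5, 1, 4)]
rat = {r: r for r in range(8)}
renamed, comparisons = rename_group(group, rat, [10, 11, 12, 13])
print(renamed, comparisons)
```

For this 4-wide group the sketch performs 12 comparisons (two sources, each checked against 0 + 1 + 2 + 3 older instructions), which grows as n(n-1) with group width n.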
+===== Lecture 24 =====
+  Control Flow
+  * Problem of branches
+  * Types
+    * conditional, unconditional, call, return, indirect branches
+  * Handling conditional branches
+  * Predicate combining
+    * condition codes vs condition registers
+  * Delayed branching
+  * Fine-grained multi-threading
+  * Branch prediction
+    * predicting if an instruction is a branch (predecoding)
+    * predicting the direction of the branch
+    * predicting the target address of a branch
+  * Static branch prediction
+    * always taken/not taken
+    * backward taken, forward not taken
+    * by compiler based on profiling
+  * Dynamic branch prediction
+    * last time predictor
+    * history based predictors
+    * two-level predictors
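Two of the Lecture 24 items, the static "backward taken, forward not taken" rule and the dynamic last time predictor, are simple enough to sketch directly. The code below is an illustrative sketch only: the function and class names, the per-PC dictionary, and the default prediction of not taken for an unseen branch are assumptions made for the example.

```python
def btfn_predict(branch_pc: int, target_pc: int) -> bool:
    """Static BTFN heuristic: predict taken only if the target is at a lower address."""
    return target_pc < branch_pc


class LastTimePredictor:
    """Dynamic predictor that repeats the last outcome seen for each branch."""

    def __init__(self):
        self.last = {}  # branch PC -> last observed outcome (True = taken)

    def predict(self, pc: int) -> bool:
        # Assumption: a branch that has never been seen is predicted not taken.
        return self.last.get(pc, False)

    def update(self, pc: int, taken: bool) -> None:
        self.last[pc] = taken


def count_mispredictions(predictor, pc, outcomes):
    """Run the predictor over a sequence of outcomes for one branch."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong


# A loop branch: taken nine times, then not taken, for three executions of the loop.
loop_outcomes = ([True] * 9 + [False]) * 3
print(count_mispredictions(LastTimePredictor(), 0x40, loop_outcomes))
```

On this loop pattern the last time predictor mispredicts 6 times (once on each loop exit and once on each re-entry), whereas BTFN applied to the same backward branch would mispredict only on the 3 exits.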
+===== Lecture 25 =====
+  Control Flow - II
+  * 2-bit counter based prediction
+  * Global branch prediction
+  * Global branch correlation
+  * Global two-level prediction
+    - Global history register
+  * Local two-level prediction
+    - Pattern history table
+    - Interference in the pattern history table
+      - Randomizing the index into the pattern history table
+      - Agree prediction
+  * Alpha 21264 Tournament Predictor
+  * Perceptron branch predictor
+    - Perceptron - learns a target boolean function of N inputs
+  * Call and Return Prediction
+  * Indirect branch prediction
+    - Virtual Conditional Branch prediction
+  * Branch prediction issues
+    - Need to know a branch as soon as it is fetched
+    - Latency
+    - State recovery upon misprediction
+  * Predicated execution
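The Lecture 25 items on 2-bit counter based prediction and global two-level prediction (a global history register indexing a pattern history table) can be illustrated with the sketch below. It is a simplified model, not a description of any particular processor: the class names, the 4-bit history length, and the initial "weakly not taken" counter state are assumptions for the example.

```python
class TwoBitCounter:
    """Saturating 2-bit counter: states 0-1 predict not taken, 2-3 predict taken."""

    def __init__(self, state: int = 1):  # assumption: start weakly not taken
        self.state = state

    def predict(self) -> bool:
        return self.state >= 2

    def update(self, taken: bool) -> None:
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


class GlobalTwoLevelPredictor:
    """Global history register (first level) indexing a table of 2-bit counters."""

    def __init__(self, history_bits: int = 4):
        self.history_bits = history_bits
        self.ghr = 0  # global history register, newest outcome in the low bit
        self.pht = [TwoBitCounter() for _ in range(1 << history_bits)]

    def predict(self) -> bool:
        return self.pht[self.ghr].predict()

    def update(self, taken: bool) -> None:
        self.pht[self.ghr].update(taken)
        mask = (1 << self.history_bits) - 1
        self.ghr = ((self.ghr << 1) | int(taken)) & mask


def mispredictions(predictor, outcomes) -> int:
    """Count wrong predictions over a sequence of branch outcomes."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            wrong += 1
        predictor.update(taken)
    return wrong


alternating = [True, False] * 20
print(mispredictions(TwoBitCounter(), alternating))
print(mispredictions(GlobalTwoLevelPredictor(4), alternating))
```

With these starting states, a lone 2-bit counter mispredicts all 40 branches of the alternating pattern, while the two-level predictor mispredicts 3 times during warm-up and is then always correct, because each history value ends up with its own counter.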
+===== Lecture 26 =====
+  Control Flow - III & Concurrency
+  * Predicated Execution
+    - Predication decisions at the compiler
+    - Rename stage modifications
+  * Limitations of predication
+    - Adaptivity
+    - Complex Control Flow Graphs
+    - ISA support
+  * Wish branches
+    - Wish jump/join
+    - Wish loop
+  * Wish branches vs Predicated Execution
+  * Wish branches vs Branch prediction
+  * Diverge-Merge Processor
+  * Dynamic-Hammock
+  * Multi-path Execution
+  * Research issues in control flow handling
+    - Hardware/software cooperation
+    - Fetch gating
+    - Recycling useful work done on wrong path
+  Concurrency
+  * Classification of machines
+    - SISD
+    - SIMD
+    - MIMD
+  * Decoupled Access/Execute
+  * Astronautics ZS-1
+  * Loop unrolling
+===== Lecture 27 =====
+  VLIW
+  * Each VLIW instruction - a bundle of independent instructions (identified by compiler)
+  * Each instruction bundle executed by hardware in lockstep
+  * Commercial VLIW machines
+     - TI C6000, TriMedia, STMicro
+  * Intel IA-64 - Partially VLIW
+  * Encoding VLIW NOPs
+  * Static Instruction Scheduling for VLIW
+  * Code motion - Safety & Legality
+  * Trace scheduling
+  * List scheduling
+  * Superblock scheduling
+  * Hyperblock scheduling
+  * The Intel IA-64 architecture
+     - No lock step execution of a bundle
+     - Specify dependencies between instructions within a bundle
+     - Template bits
+  * What hinders static code motion?
+     - Exceptions
+     - Loads/Stores
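The Lecture 27 items on static instruction scheduling and list scheduling for VLIW can be sketched as follows. This is one common formulation (greedy, cycle by cycle, prioritized by critical-path length), offered as an assumption rather than as the lecture's exact algorithm; the `list_schedule` function, the dependency and latency dictionaries, and the operation names are hypothetical. The dependency graph is assumed to be acyclic.

```python
def list_schedule(deps, latency, width):
    """Greedy list scheduling of a dependency DAG into fixed-width bundles.

    deps    -- dict: op -> set of ops it depends on (must be acyclic)
    latency -- dict: op -> cycles until its result is available
    width   -- number of operation slots per VLIW bundle
    Returns one list of ops per cycle; unused slots are implicit NOPs.
    """
    succs = {op: [] for op in deps}
    for op, preds in deps.items():
        for p in preds:
            succs[p].append(op)

    # Priority = latency-weighted length of the longest path from the op to a sink.
    priority = {}

    def height(op):
        if op not in priority:
            priority[op] = latency[op] + max((height(s) for s in succs[op]), default=0)
        return priority[op]

    for op in deps:
        height(op)

    finish = {}              # op -> cycle at which its result becomes available
    unscheduled = set(deps)
    bundles = []
    cycle = 0
    while unscheduled:
        # An op is ready once all of its predecessors have produced their results.
        ready = [op for op in unscheduled
                 if all(p in finish and finish[p] <= cycle for p in deps[op])]
        ready.sort(key=lambda op: (-priority[op], op))  # critical ops first, name as tie-break
        bundle = ready[:width]
        for op in bundle:
            finish[op] = cycle + latency[op]
            unscheduled.remove(op)
        bundles.append(bundle)
        cycle += 1
    return bundles


deps = {"ld1": set(), "ld2": set(), "add": {"ld1", "ld2"},
        "mul": {"add"}, "st": {"mul"}, "inc": set()}
latency = {"ld1": 2, "ld2": 2, "add": 1, "mul": 3, "st": 1, "inc": 1}
for cycle, bundle in enumerate(list_schedule(deps, latency, width=2)):
    print(cycle, bundle)
```

The empty bundles produced while waiting on the multiply correspond to the NOP slots that the "Encoding VLIW NOPs" item is concerned with.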
