Advanced Computer Architecture, Comparative Architectures Past Paper
Textbook: Hennessy, J. and Patterson, D. (2012). Computer architecture: a quantitative approach. Elsevier (3rd/4th/5th ed.)
Early computers exploit Bit-Level Parallelism.
Two-level branch predictor
Branch Target Buffer (BTB)
Multiple / Diversified pipelines
Improvements
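The two-level branch predictor listed above can be illustrated with a small sketch. It assumes one particular variant: a single global history register indexing a table of 2-bit saturating counters. The class name, the counter initial value, and the helper count_mispredictions are hypothetical choices for illustration, not taken from the course material.

```python
class TwoLevelPredictor:
    """Global-history two-level predictor (one assumed variant).

    Level 1: a shift register holding the last `history_bits` outcomes.
    Level 2: a pattern history table (PHT) of 2-bit saturating counters,
             indexed by the level-1 history.
    """

    def __init__(self, history_bits=2):
        self.history_bits = history_bits
        self.history = 0                        # global history register
        self.pht = [1] * (1 << history_bits)    # 1 = weakly not-taken (assumed init)

    def predict(self):
        # Counter values 2 and 3 mean "predict taken".
        return self.pht[self.history] >= 2

    def update(self, taken):
        # Train the counter selected by the current history, then shift in the outcome.
        counter = self.pht[self.history]
        self.pht[self.history] = min(3, counter + 1) if taken else max(0, counter - 1)
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


def count_mispredictions(predictor, outcomes):
    """Run the predictor over a sequence of branch outcomes (True = taken)."""
    misses = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            misses += 1
        predictor.update(taken)
    return misses


# An alternating branch defeats a lone 2-bit counter, but the history
# lets the two-level scheme learn the pattern after a short warm-up.
alternating = [True, False] * 50
warmup_misses = count_mispredictions(TwoLevelPredictor(history_bits=2), alternating)
```

In this sketch the only mispredictions on the alternating pattern occur during warm-up, while the counters for each history value are still being trained.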
Exploits Instruction-Level Parallelism.
Instruction Fetch, DEcode, [Rename], [Dispatch: static/dynamic scheduling], [Issue], EXecute, Memory, Write Back.
In-order dispatch: stall for data hazards; In-order issue: stall for structural hazards.
In-order vs out-of-order dispatch & execution
Rename: map each architectural (or logical) register to the physical register of the last destination targeted.
Hardware-based speculation
Load bypassing
Store queue / buffer
ReOrder Buffer (ROB) [precise exceptions]
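The Rename step in the list above can be sketched as a map table plus a free list. This is a minimal illustration under stated assumptions: instructions are (dest, src1, src2) triples, the register counts are arbitrary, and physical registers are never returned to the free list (in a real design that would happen at commit). The function name rename and the register naming are hypothetical.

```python
def rename(instructions, num_arch=4, num_phys=8):
    """Rename (dest, src1, src2) triples from architectural to physical registers.

    Sources read the current mapping, i.e. the physical register of the
    last destination that targeted that architectural register. Each
    destination is given a fresh physical register, which removes WAR
    and WAW dependences.
    """
    map_table = {f"r{i}": f"p{i}" for i in range(num_arch)}
    free_list = [f"p{i}" for i in range(num_arch, num_phys)]
    renamed = []
    for dest, src1, src2 in instructions:
        # Look up sources BEFORE remapping the destination.
        phys_src1, phys_src2 = map_table[src1], map_table[src2]
        if not free_list:
            raise RuntimeError("no free physical register: rename would stall")
        phys_dest = free_list.pop(0)
        map_table[dest] = phys_dest
        renamed.append((phys_dest, phys_src1, phys_src2))
    return renamed


program = [
    ("r1", "r2", "r3"),   # r1 = r2 op r3
    ("r2", "r1", "r3"),   # RAW on r1, WAR on r2
    ("r1", "r0", "r0"),   # WAW on r1
]
renamed_program = rename(program)
# [('p4', 'p2', 'p3'), ('p5', 'p4', 'p3'), ('p6', 'p0', 'p0')]
```

After renaming, only the true (RAW) dependence through p4 remains; the two writes to r1 land in different physical registers.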
Exploits Instruction-Level Parallelism via static (compile-time) scheduling: the compiler reorders instructions.
Memory reference speculation
Variable-length bundles of independent instructions
Exploits Thread-Level Parallelism.
Coarse-grained MT
Fine-grained MT
SMT
cache
Exploits Data-Level Parallelism.
Exploits Chip Multiprocessing.
Cache coherence: snoopy protocol
Cache coherence: directory protocol
Memory consistency: SC vs TSO, store atomicity
On-chip interconnection network
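The difference between SC and TSO noted above shows up in the classic store-buffering litmus test. The sketch below enumerates every interleaving of two threads (T0: x = 1; r0 = y and T1: y = 1; r1 = x). It assumes a simplified TSO model in which each thread has a FIFO store buffer that drains to memory at arbitrary times and forwards to the thread's own loads. All names (PROGRAMS, outcomes, replace) are hypothetical.

```python
def replace(tup, i, value):
    """Return a copy of tuple `tup` with position i set to `value`."""
    return tup[:i] + (value,) + tup[i + 1:]


# Store-buffering litmus test; every store writes the value 1.
PROGRAMS = (
    (("st", "x"), ("ld", "y")),   # thread 0
    (("st", "y"), ("ld", "x")),   # thread 1
)


def outcomes(model):
    """Return all reachable (r0, r1) results under model 'SC' or 'TSO'."""
    results = set()

    def explore(pcs, mem, bufs, regs):
        finished = all(pc == len(prog) for pc, prog in zip(pcs, PROGRAMS))
        if finished and not any(bufs):
            results.add(regs)
            return
        for t in range(len(PROGRAMS)):
            if bufs[t]:
                # Drain the oldest buffered store of thread t to memory.
                addr, val = bufs[t][0]
                explore(pcs, {**mem, addr: val}, replace(bufs, t, bufs[t][1:]), regs)
            if pcs[t] < len(PROGRAMS[t]):
                op, addr = PROGRAMS[t][pcs[t]]
                next_pcs = replace(pcs, t, pcs[t] + 1)
                if op == "st" and model == "SC":
                    # SC: the store is visible to everyone immediately.
                    explore(next_pcs, {**mem, addr: 1}, bufs, regs)
                elif op == "st":
                    # TSO: the store enters the thread's FIFO store buffer.
                    explore(next_pcs, mem, replace(bufs, t, bufs[t] + ((addr, 1),)), regs)
                else:
                    # Load: forward the newest matching buffered store, else read memory.
                    val = mem[addr]
                    for buffered_addr, buffered_val in bufs[t]:
                        if buffered_addr == addr:
                            val = buffered_val
                    explore(next_pcs, mem, bufs, replace(regs, t, val))

    explore((0, 0), {"x": 0, "y": 0}, ((), ()), (None, None))
    return results


sc_results = outcomes("SC")
tso_results = outcomes("TSO")
```

In this model the result r0 = r1 = 0 is unreachable under SC but reachable under TSO, because each load can execute while the other thread's store is still sitting in its store buffer.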
Exploits Accelerator-Level Parallelism.
Domain-Specific Accelerators (DSA)