Advanced Computer Architecture, Comparative Architectures Past Paper

Textbook: Hennessy, J. and Patterson, D. (2012). Computer architecture: a quantitative approach. Elsevier (3rd/4th/5th ed.)

Early computers exploit Bit-Level Parallelism.

Scalar pipeline

Two-level branch predictor
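
As a rough illustration of the idea, a two-level predictor can be sketched as a history register (level 1) that selects one of several 2-bit saturating counters (level 2). The notes do not specify a particular organisation, so the global-history variant below, the class name `TwoLevelPredictor`, the helper `accuracy`, and the 4-bit history length are all assumptions made for this sketch.

```python
class TwoLevelPredictor:
    """Global-history two-level predictor (one possible organisation)."""

    def __init__(self, history_bits=4):
        self.history_bits = history_bits
        self.history = 0  # level 1: outcomes of the last N branches
        # Level 2: one 2-bit counter per history pattern, initially weakly not-taken.
        self.counters = [1] * (1 << history_bits)

    def predict(self):
        # Counter values 2 and 3 mean "predict taken".
        return self.counters[self.history] >= 2

    def update(self, taken):
        c = self.counters[self.history]
        self.counters[self.history] = min(3, c + 1) if taken else max(0, c - 1)
        # Shift the real outcome into the history register.
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask


def accuracy(predictor, outcomes):
    """Fraction of branch outcomes predicted correctly, training as we go."""
    correct = 0
    for taken in outcomes:
        correct += predictor.predict() == taken
        predictor.update(taken)
    return correct / len(outcomes)
```

On a repeating taken, taken, not-taken pattern, each history value is always followed by the same outcome, so the sketch mispredicts only during warm-up. A single 2-bit counter without history would keep mispredicting the not-taken branch.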

Branch Target Buffer (BTB)

Multiple / Diversified pipelines

Improvements

Superscalar pipeline

Exploits Instruction-Level Parallelism.

Instruction Fetch, DEcode, [Rename], [Dispatch: static/dynamic scheduling], [Issue], EXecute, Memory, Write Back.
In-order dispatch stalls on data hazards; in-order issue stalls on structural hazards.

In-order vs out-of-order dispatch & execution

Rename: map each architectural (logical) register to the physical register allocated to the most recent instruction that targeted it as a destination.
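
The mapping rule above can be sketched as follows. This is a minimal model, assuming an unlimited pool of free physical registers and instructions given as hypothetical `(dest, src1, src2)` triples of architectural register numbers; the function name `rename` is illustrative only.

```python
def rename(instructions, num_arch_regs=4):
    """Rename (dest, src1, src2) triples of architectural register numbers."""
    # Initially architectural register i lives in physical register i.
    map_table = list(range(num_arch_regs))
    next_free = num_arch_regs  # assumption: free physical registers never run out
    renamed = []
    for dest, src1, src2 in instructions:
        # Sources read the physical register of the last instruction that wrote them.
        p1, p2 = map_table[src1], map_table[src2]
        # The destination always gets a fresh physical register.
        map_table[dest] = next_free
        renamed.append((next_free, p1, p2))
        next_free += 1
    return renamed
```

For the sequence `r1 = r2 + r3; r2 = r1 + r1; r1 = r3 + r3`, the second write to `r1` and the write to `r2` receive fresh physical registers, so the WAW and WAR hazards disappear and only the true (RAW) dependence on the first instruction remains.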

Hardware-based speculation

Load bypassing

Store queue / buffer

ReOrder Buffer (ROB) [precise exceptions]
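
One way to see how a ROB gives precise exceptions: instructions may finish in any order, but they retire only from the head, and an exception is acted on only when the faulting instruction reaches the head. The sketch below is a simplified model under those assumptions; the class `ReorderBuffer` and its dictionary-based entries are hypothetical, not a description of any particular processor.

```python
from collections import deque


class ReorderBuffer:
    def __init__(self):
        self.entries = deque()  # oldest instruction at the left (head)
        self.committed = []     # names of instructions retired so far

    def allocate(self, name):
        """Add an instruction at the tail, in program order."""
        entry = {"name": name, "done": False, "exception": False}
        self.entries.append(entry)
        return entry

    def commit(self):
        """Retire finished instructions from the head, in program order.

        Returns the name of a faulting instruction, or None.
        """
        while self.entries and self.entries[0]["done"]:
            head = self.entries.popleft()
            if head["exception"]:
                self.entries.clear()  # squash every younger instruction
                return head["name"]
            self.committed.append(head["name"])
        return None
```

If instructions `a`, `b`, `c` are allocated and `c` finishes first, nothing commits until `a` is done. If `b` raised an exception, `a` commits, `b` is reported, and `c` is squashed even though it had already executed, so the architectural state is exactly that of the program up to `b`.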

Software ILP (VLIW)

Exploits Instruction-Level Parallelism via static scheduling: the compiler reorders independent instructions out of program order, and the hardware issues them in order.

Memory reference speculation

Variable-length bundles of independent instructions

Multi-threaded processors

Exploits Thread-Level Parallelism.

Coarse-grained MT

Fine-grained MT

SMT

Memory hierarchy

Cache
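
As a small worked example of cache indexing, an address can be split into tag, set index, and block offset. The notes give no cache parameters, so the 64-byte blocks and 128 sets below are assumptions, as is the function name `split_address`; both sizes are assumed to be powers of two.

```python
def split_address(addr, block_bytes=64, num_sets=128):
    """Split a byte address into (tag, set index, block offset)."""
    # log2 of each size; valid because both are assumed to be powers of two.
    offset_bits = block_bytes.bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    offset = addr & (block_bytes - 1)
    index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset
```

With these parameters, address `0x12345` falls at byte 5 of a block in set 13 with tag 9. Two addresses conflict in a direct-mapped cache exactly when they share an index but differ in tag.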

Vector processors / SIMD

Exploits Data-Level Parallelism.
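
To make the idea concrete, the sketch below models strip-mining: a long element-wise add is split into chunks of at most the maximum vector length, and each chunk stands for one vector instruction. The function name `strip_mined_add` and the maximum vector length of 4 are assumptions for illustration; the Python loop only models the instruction count, it is not itself vectorised.

```python
def strip_mined_add(a, b, max_vector_length=4):
    """Add two equal-length lists in chunks, one chunk per modelled vector instruction."""
    result = []
    vector_ops = 0
    for start in range(0, len(a), max_vector_length):
        end = min(start + max_vector_length, len(a))
        # One vector instruction: the same add applied to every element of the chunk.
        result.extend(x + y for x, y in zip(a[start:end], b[start:end]))
        vector_ops += 1
    return result, vector_ops
```

Ten element additions need only three modelled vector instructions here (chunks of 4, 4, and 2), compared with ten scalar add instructions.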

Multi-core processors

Exploits Thread-Level Parallelism via Chip Multiprocessing (CMP).

Cache coherence: snoopy protocol
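
The notes do not name a specific snoopy protocol. As one common example, the sketch below assumes a basic MSI (Modified, Shared, Invalid) protocol for a single cache line, written as a transition table. The event names (`PrRd`, `PrWr`, `BusRd`, `BusRdX`) follow common textbook usage, and the helper `access` is hypothetical.

```python
# (current state, event) -> (next state, bus action generated)
MSI = {
    # Processor-side events in the requesting cache.
    ("I", "PrRd"): ("S", "BusRd"),
    ("I", "PrWr"): ("M", "BusRdX"),
    ("S", "PrRd"): ("S", None),
    ("S", "PrWr"): ("M", "BusRdX"),  # some variants use a separate upgrade request
    ("M", "PrRd"): ("M", None),
    ("M", "PrWr"): ("M", None),
    # Snooped bus events seen by every other cache.
    ("I", "BusRd"): ("I", None),
    ("I", "BusRdX"): ("I", None),
    ("S", "BusRd"): ("S", None),
    ("S", "BusRdX"): ("I", None),
    ("M", "BusRd"): ("S", "Flush"),   # supply the dirty data, keep a shared copy
    ("M", "BusRdX"): ("I", "Flush"),  # supply the dirty data, then invalidate
}


def access(states, cpu, op):
    """Apply a 'read' or 'write' by `cpu` to the per-cache states of one line."""
    event = "PrRd" if op == "read" else "PrWr"
    states[cpu], bus = MSI[(states[cpu], event)]
    if bus is not None:
        # Every other cache snoops the bus transaction and reacts.
        for other in range(len(states)):
            if other != cpu:
                states[other], _ = MSI[(states[other], bus)]
    return states
```

Starting from two invalid copies, a read by each CPU leaves both in S, a write by CPU 0 moves it to M and invalidates CPU 1, and a later read by CPU 1 forces CPU 0 back to S.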

Cache coherence: directory protocol

Memory consistency: SC vs TSO, store atomicity
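
The difference between SC and TSO shows up in the classic store-buffering litmus test: thread 0 runs `x = 1; r0 = y` and thread 1 runs `y = 1; r1 = x`, with `x` and `y` initially 0. The sketch below enumerates every legal event order under a simplified model. Under SC a store writes memory immediately; under TSO it first sits in a private store buffer and reaches memory at a later "drain" event. The function name `sb_outcomes` and the event encoding are assumptions of this sketch, and store-to-load forwarding is omitted because neither thread loads its own variable.

```python
from itertools import permutations


def sb_outcomes(tso):
    """Return the set of possible (r0, r1) results for the store-buffering test."""
    events = [(0, "store"), (0, "load"), (1, "store"), (1, "load")]
    if tso:
        # Each buffered store reaches memory at a separate, later drain event.
        events += [(0, "drain"), (1, "drain")]
    own, other = ("x", "y"), ("y", "x")  # thread t stores own[t], loads other[t]
    results = set()
    for order in permutations(events):
        pos = {e: i for i, e in enumerate(order)}
        # Program order: each thread's store precedes its load.
        if any(pos[(t, "store")] > pos[(t, "load")] for t in (0, 1)):
            continue
        # A store can only be drained after it has entered the buffer.
        if tso and any(pos[(t, "store")] > pos[(t, "drain")] for t in (0, 1)):
            continue
        memory = {"x": 0, "y": 0}
        regs = [None, None]
        for t, kind in order:
            if kind == "store" and not tso:
                memory[own[t]] = 1       # SC: store is visible immediately
            elif kind == "drain":
                memory[own[t]] = 1       # TSO: store becomes visible when drained
            elif kind == "load":
                regs[t] = memory[other[t]]
        results.add(tuple(regs))
    return results
```

In this model SC never produces `(0, 0)`, because that would need each load to precede the other thread's store. TSO allows it, since both loads can run while both stores are still buffered.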

On-chip interconnection network

Specialised processors

Exploits Accelerator-Level Parallelism.

Domain-Specific Accelerators (DSA)