Advanced Computer Architecture

Last Updated on by Prince Pudasaini

Course Title: Advanced Computer Architecture
Full Marks: 45 + 30
Course No: C.Sc. 546
Pass Marks: 22.5 + 15
Nature of the Course: Theory + Lab
Credit Hrs: 3

Course objectives:

This course covers state-of-the-art high-performance computer architectures. Topics include performance, instruction set architecture (ISA), instruction-level parallelism (ILP), data-level parallelism (DLP), thread-level parallelism (TLP), dynamic scheduling, out-of-order execution, register renaming, static scheduling (VLIW/EPIC), cache and memory hierarchy design, speculation techniques, advanced branch predictor design, multiprocessors, coherency issues, multicore processors, popular design case studies, and trends in architecture and microarchitecture development in the face of physical design limits.

Course Prerequisite:

Computer Organization and Operating System or the equivalent.

Course Contents:

Unit 1: Fundamentals of Computer Design (10 Hrs)

Unit 2: Memory Hierarchy Design (10 Hrs)

Classification of cache organization, cache hierarchy design, quantifying cache performance, measuring average memory access time, cache optimization techniques (basic to advanced techniques to reduce miss penalty, miss rate, and hit time), trace caches, memory technology and optimizations, main memory organization and optimization, virtual memory design. Case study: Memory Hierarchies in the ARM Cortex-A8 and Intel Core i7.
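As a small illustration of the "measuring average memory access time" topic in this unit, the sketch below applies the standard textbook formula AMAT = hit time + miss rate × miss penalty, and extends it to a two-level cache by treating the L2 access time as the L1 miss penalty. The function names and the cycle counts are illustrative assumptions, not values taken from the syllabus.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time for one cache level.

    All times must share one unit (here: clock cycles);
    miss_rate is a fraction between 0 and 1.
    """
    return hit_time + miss_rate * miss_penalty


def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_local_miss_rate, mem_penalty):
    """AMAT for an L1 + L2 hierarchy.

    The L1 miss penalty is itself the average time to be served by L2,
    so the single-level formula is applied twice.
    l2_local_miss_rate is measured against accesses that reach L2.
    """
    l1_miss_penalty = amat(l2_hit, l2_local_miss_rate, mem_penalty)
    return amat(l1_hit, l1_miss_rate, l1_miss_penalty)


if __name__ == "__main__":
    # Hypothetical numbers chosen only to make the arithmetic easy to follow.
    single = amat(hit_time=1.0, miss_rate=0.05, miss_penalty=100.0)
    two = amat_two_level(l1_hit=1.0, l1_miss_rate=0.05,
                         l2_hit=10.0, l2_local_miss_rate=0.2,
                         mem_penalty=100.0)
    print(f"L1 only : {single:.2f} cycles")
    print(f"L1 + L2 : {two:.2f} cycles")
```

With these assumed numbers, adding an L2 cache lowers the L1 miss penalty from 100 cycles to an average of 30, which is what reduces the overall access time.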

Unit 3: Instruction Level Parallelism (12 Hrs)

Instruction Level Parallelism (ILP) overview, types of hazards, Data dependencies, Name dependencies, Control dependencies, instruction scheduling, Branch instruction costs, register renaming, basic compiler techniques for exposing ILP: instruction scheduling and loop unrolling, Dynamic hardware prediction, Dynamic instruction scheduling, Hardware based Speculation, Multiple Issue and static scheduling, Superscalar and VLIW or EPIC architectures, Limitations of Instruction Level Parallelism, Multithreading: thread-level parallelism to improve uniprocessor throughput. Case Study: The Intel Core i7 and ARM Cortex-A8.

Unit 4: Data Level Parallelism (5 Hrs)

Data Level Parallelism (DLP) overview, vector processor architecture, SIMD Instruction Set Extensions for Multimedia, Graphics Processing Units, Detecting and Enhancing Loop-Level Parallelism. Case study: Mobile versus Server GPUs and Tesla versus Core i7.

Unit 5: Multiprocessor and Thread Level Parallelism (8 Hrs)

Multicore processor architectures: SIMD, MIMD, Centralized Shared Memory, Distributed Shared Memory architecture, Cache Coherence, directory-based coherence, Synchronization, Memory Consistency models. Case Study: Multicore Processors and Their Performance.

Text Book:

Hennessy J.L., Patterson D.A., Computer Architecture: A Quantitative Approach, Morgan Kaufmann, 5th edition.

References:

  1. Hennessy J.L., Patterson D.A., Computer Architecture: A Quantitative Approach, Morgan Kaufmann, 4th edition.
  2. Patterson D.A., Hennessy J.L., Computer Organization and Design: The Hardware/Software Interface.
  3. Computer architecture and embedded processor research articles and technical papers.
