HBM 513E - Parallel & Distributed Computing
Course Objectives
Parallel computing is a broad field of computer science concerned with the architecture, HW/SW systems, languages, programming paradigms, algorithms, and theoretical models that make it possible to compute in parallel. While parallel programming and execution can be studied in the abstract, performance is parallelism's raison d'être. Indeed, parallel computing is the name of the game in high-performance computing. Large-scale parallelism (>10000 processors) lies at the heart of the fastest, most powerful computing systems in the world today. At the same time, small-scale, low-end parallelism is the driving force behind affordable scientific computing and the growing success of computational science.
This class will survey recent advances in parallel algorithms for scientific computing and will examine their impact on the performance of scientific applications. Since students come from different backgrounds, lectures will aim for an even spread of topics across parallel algorithm design, numerical methods, computer architecture, and software and runtime environments. Students will spend time on particular parallel algorithms for scientific problems and on their implementations on parallel and distributed machines.
We will concentrate on message-passing methods of parallel computing and use standard tools such as MPI (Message Passing Interface) and OpenMP (directive-based threading) for SMP and distributed shared-memory architectures. Basic MPI-based parallel algorithms will be practiced through point-to-point and collective communication. Parallelization of numerical algorithms will be studied and implemented with both MPI and OpenMP. Finally, students will gain hands-on experience with different approaches to creating and managing parallelism in order to exploit the benefits of state-of-the-art computing technologies.
Course Description
TENTATIVE CONTENT
Week 1: Introduction
- Introduction to computational science
- What is parallelism?
- Why do we need powerful computers?
- Units of measure in HPC
- Why are powerful computers parallel?
- Principles of parallel computing
- Locality and parallelism
- Flynn's Taxonomy
- Cluster Server Systems
- I/O Needs on Parallel Computers, RAID Concept
- Parallel File Systems
- Driving issues and trends in CPUs
- Introduction to Eclipse
Week 2: Parallel Computing Architectures
- Parallel Programming Models
- Performance
- Speedup factor and efficiency, Amdahl's and Gustafson's Laws (see the formulas after this week's outline)
- Types of Parallel Computers: Shared Memory Multiprocessor
- Programming Shared Memory Multiprocessors
- Multiprocessor Interconnection Networks
- Distributed Shared Memory
- Network: Latency and Bandwidth
LAB 1: Introduction to HPC Systems: Login and Environmental Setup
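As a quick reference for the performance topics above, the standard definitions can be written as follows. This is a minimal summary in the usual textbook notation (T_1 sequential time, T_p parallel time on p processors), not a course-specific convention:

    % Speedup and efficiency on p processors
    S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p}

    % Amdahl's Law: f is the serial fraction of the total work
    S(p) \le \frac{1}{f + (1 - f)/p}

    % Gustafson's Law: \alpha is the serial fraction of the parallel run time
    S(p) = p - \alpha\,(p - 1)

Note that the serial fraction in Amdahl's Law is measured against the sequential execution, while Gustafson's is measured on the scaled parallel run, which is why the two laws give different pictures of scalability.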
Week 3: OS and Memory Hierarchy
- Operating Systems and Distributed OS
- Linux system processes
- Memory hierarchies
- Cache basics
- Application: Matrix multiplication (see the loop-ordering sketch after this week's outline)
- Latency
- Impact of memory bandwidth
- Multilevel caches
- Introduction to cache coherence
LAB 2: Introduction to HPC Systems: Unix/Linux Development Tools for HPC
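As a concrete illustration of how the memory hierarchy affects the matrix-multiplication application above, the sketch below compares the naive i-j-k loop order with the cache-friendlier i-k-j order. The matrix size and the timing approach are arbitrary choices for the example, not part of the course material.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 512  /* arbitrary size for the illustration */

    static double A[N][N], B[N][N], C1[N][N], C2[N][N];

    int main(void)
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) {
                A[i][j] = (double)rand() / RAND_MAX;
                B[i][j] = (double)rand() / RAND_MAX;
            }

        /* i-j-k order: B is accessed column-wise, striding across cache lines */
        clock_t t0 = clock();
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                for (int k = 0; k < N; k++)
                    C1[i][j] += A[i][k] * B[k][j];

        /* i-k-j order: the inner loop walks B and C row-wise, reusing cache lines */
        clock_t t1 = clock();
        for (int i = 0; i < N; i++)
            for (int k = 0; k < N; k++)
                for (int j = 0; j < N; j++)
                    C2[i][j] += A[i][k] * B[k][j];
        clock_t t2 = clock();

        printf("ijk: %.2f s   ikj: %.2f s\n",
               (double)(t1 - t0) / CLOCKS_PER_SEC,
               (double)(t2 - t1) / CLOCKS_PER_SEC);
        return 0;
    }

Both versions perform the same arithmetic; only the memory access pattern differs, which is exactly the point made in the lecture.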
Week 4: Caches, Virtual Memory and Programming Performance
- Cache mapping (see the address-breakdown sketch after this week's outline)
- Direct-mapped cache
- Set-associative cache
- Fully associative cache
- Replacement algorithms
- Cache performance metrics
- Virtual memory
- Address translation
- Accelerating address translation with TLBs
- How architecture impacts your programs
- How (and how not) to tune your code
LAB 3: Eclipse I: Software Development Environment
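To make the cache-mapping discussion concrete, the small sketch below splits a memory address into tag, set index, and block offset for a direct-mapped cache. The cache geometry (32 KB, 64-byte lines) and the example address are assumptions for illustration, not a description of any particular machine.

    #include <stdint.h>
    #include <stdio.h>

    /* Assumed example geometry: 32 KB direct-mapped cache, 64-byte lines */
    #define LINE_SIZE   64u
    #define CACHE_SIZE  (32u * 1024u)
    #define NUM_SETS    (CACHE_SIZE / LINE_SIZE)   /* 512 sets, one line each */

    int main(void)
    {
        uint64_t addr = 0x7ffe12345678ull;                /* arbitrary address */

        uint64_t offset = addr % LINE_SIZE;               /* byte within the line  */
        uint64_t index  = (addr / LINE_SIZE) % NUM_SETS;  /* which cache line slot */
        uint64_t tag    = addr / (LINE_SIZE * NUM_SETS);  /* identifies the block  */

        printf("addr   = 0x%llx\n", (unsigned long long)addr);
        printf("tag    = 0x%llx\n", (unsigned long long)tag);
        printf("index  = %llu\n",   (unsigned long long)index);
        printf("offset = %llu\n",   (unsigned long long)offset);
        return 0;
    }

For a set-associative cache the same index selects a set of several lines instead of a single line; the tag comparison then decides which way, if any, holds the block.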
Week 5: Message Passing Computing I
- SIMD and MIMD models
- Point-to-point send and receive communications
- Synchronous and asynchronous message passing
- Simple MPI examples (see the send/receive sketch after this week's outline)
- Basics of collective communications
- Evaluating parallel programs
LAB 4: Eclipse II: Parallel Tool Platform
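A minimal point-to-point sketch in the spirit of the simple MPI examples above: rank 0 sends one integer to rank 1 with blocking MPI_Send/MPI_Recv. The payload value and message tag are arbitrary.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size, value = 0;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size < 2) {
            if (rank == 0) fprintf(stderr, "Run with at least 2 processes.\n");
            MPI_Finalize();
            return 1;
        }

        if (rank == 0) {
            value = 42;                                   /* arbitrary payload */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            printf("rank 0 sent %d\n", value);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }

Such a program is typically compiled with mpicc and launched with something like mpirun -np 2 ./a.out; the exact commands depend on the MPI installation on the lab system.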
Week 6: Message Passing Computing II
- Buffered and non-buffered point-to-point communication
- Collective communications (see the broadcast/reduce sketch after this week's outline)
LAB 5: Intel Software Tools I: Compiler, debugger, profiler and analyzer
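A minimal collective-communication sketch: rank 0 broadcasts a parameter to all ranks and a global sum is formed with MPI_Reduce. The problem size and the per-rank work are placeholders for the example.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int n = 0;
        if (rank == 0) n = 1000;             /* arbitrary size chosen by the root */
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

        /* each rank sums its own stride of 1..n */
        long local = 0;
        for (int i = rank + 1; i <= n; i += size)
            local += i;

        long total = 0;
        MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) printf("sum 1..%d = %ld\n", n, total);

        MPI_Finalize();
        return 0;
    }

Every process calls the collective; the root argument only determines where the broadcast data originates and where the reduced result lands.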
Week 7: Parallel Techniques I
- Performance issues
- Embarrassingly Parallel Computations
- Ideal Parallel Computation
- Embarrassingly Parallel Examples (see the Monte Carlo sketch after this week's outline)
LAB 6: MPI I: Introduction, simple algorithms, send & receive
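As an embarrassingly parallel example, the sketch below estimates pi by Monte Carlo: every rank samples points independently (using the POSIX rand_r generator), and only a single reduction is needed at the end. The sample count and seeding scheme are illustrative choices.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const long samples_per_rank = 1000000;           /* illustrative count */
        unsigned int seed = 1234u + (unsigned int)rank;  /* per-rank seed      */

        long local_hits = 0;
        for (long i = 0; i < samples_per_rank; i++) {
            double x = (double)rand_r(&seed) / RAND_MAX;
            double y = (double)rand_r(&seed) / RAND_MAX;
            if (x * x + y * y <= 1.0)
                local_hits++;            /* point fell inside the quarter circle */
        }

        long total_hits = 0;
        MPI_Reduce(&local_hits, &total_hits, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            double pi = 4.0 * (double)total_hits / (double)(samples_per_rank * size);
            printf("pi estimate: %f\n", pi);
        }

        MPI_Finalize();
        return 0;
    }

No communication occurs during the sampling phase, which is what makes the computation embarrassingly parallel and lets speedup scale almost linearly with the number of ranks.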
Week 8: Parallel Techniques II
- Partitioning and Divide-and-Conquer Strategies (see the scatter/reduce sketch after this week's outline)
- Pipelined Computations
LAB 7: MPI II: Introduction, simple algorithms, send & receive
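A minimal partitioning sketch: the root scatters equal-sized blocks of an array, each rank sums its block, and the partial sums are reduced. It assumes the array length is divisible by the number of processes, a simplification made only for the example.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int n = 1 << 20;              /* illustrative total length        */
        const int chunk = n / size;         /* assumes n is divisible by size   */

        double *data = NULL;
        if (rank == 0) {                    /* root owns the full array         */
            data = malloc((size_t)n * sizeof(double));
            for (int i = 0; i < n; i++) data[i] = 1.0;
        }

        double *part = malloc((size_t)chunk * sizeof(double));
        MPI_Scatter(data, chunk, MPI_DOUBLE, part, chunk, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        double local = 0.0;                 /* each rank works on its block only */
        for (int i = 0; i < chunk; i++) local += part[i];

        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) printf("total = %.1f\n", total);

        free(part);
        free(data);
        MPI_Finalize();
        return 0;
    }

Uneven partitions would use MPI_Scatterv instead, which is one of the refinements discussed under partitioning strategies.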
Week 9: Parallel Techniques III
- Synchronous Computations
- Synchronization
- Data Parallel Computations
- Synchronous Iteration Program Examples
- Solving a System of Linear Equations by Iteration
- Heat Distribution Problem (see the Jacobi iteration sketch after this week's outline)
LAB 8: MPI III: One-sided communication
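A serial sketch of the heat-distribution (Laplace) problem solved by Jacobi iteration, as referenced above. The grid size, boundary values, and fixed sweep count are arbitrary; the parallel versions discussed in class would distribute these sweeps over processes or threads with a synchronization point per iteration.

    #include <stdio.h>
    #include <string.h>

    #define N      64        /* illustrative grid size        */
    #define STEPS  1000      /* fixed number of Jacobi sweeps */

    int main(void)
    {
        static double u[N][N], unew[N][N];

        /* boundary condition: top edge held at 100 degrees, the rest at 0 */
        for (int j = 0; j < N; j++) u[0][j] = 100.0;

        for (int step = 0; step < STEPS; step++) {
            /* each interior point becomes the average of its four neighbours */
            for (int i = 1; i < N - 1; i++)
                for (int j = 1; j < N - 1; j++)
                    unew[i][j] = 0.25 * (u[i-1][j] + u[i+1][j]
                                       + u[i][j-1] + u[i][j+1]);

            /* copy the interior back; boundary rows and columns stay fixed */
            for (int i = 1; i < N - 1; i++)
                memcpy(&u[i][1], &unew[i][1], (N - 2) * sizeof(double));
        }

        printf("centre temperature after %d sweeps: %f\n", STEPS, u[N/2][N/2]);
        return 0;
    }

In an MPI version each process holds a strip of the grid plus ghost rows, and the per-sweep copy is replaced by a halo exchange, which is exactly the synchronous-iteration pattern of this week.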
Week 10: Parallel Techniques IV
- Load Balancing and Termination Detection
- Dynamic Load Balancing (DLB)
- Centralized DLB (see the master/worker sketch after this week's outline)
- Decentralized DLB
- Load Balancing Using a Line Structure
- Distributed Termination Detection Algorithms
LAB 9: MPI IV: One-sided communication, collective communication
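A minimal centralized dynamic load-balancing (master/worker) sketch: rank 0 hands out task indices on demand and sends a termination tag when the pool is empty. The tags, the task count, and the "work" itself are placeholders for the example.

    #include <mpi.h>
    #include <stdio.h>

    #define TAG_WORK 1
    #define TAG_STOP 2

    int main(int argc, char *argv[])
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        const int ntasks = 100;                  /* illustrative task pool size */

        if (rank == 0) {                         /* master: serve task requests */
            int next = 0, active = size - 1, dummy;
            MPI_Status st;
            while (active > 0) {
                MPI_Recv(&dummy, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                         MPI_COMM_WORLD, &st);
                if (next < ntasks) {
                    MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                    next++;
                } else {                         /* pool empty: stop this worker */
                    MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP, MPI_COMM_WORLD);
                    active--;
                }
            }
        } else {                                 /* worker: request, compute, repeat */
            int task, request = 0;
            MPI_Status st;
            for (;;) {
                MPI_Send(&request, 1, MPI_INT, 0, TAG_WORK, MPI_COMM_WORLD);
                MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
                if (st.MPI_TAG == TAG_STOP) break;
                printf("rank %d handled task %d\n", rank, task);  /* placeholder work */
            }
        }

        MPI_Finalize();
        return 0;
    }

Because workers pull tasks as they finish, fast workers naturally receive more tasks; termination here is trivial since only the master decides when the pool is empty, unlike the distributed detection algorithms covered later this week.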
Week 11: Programming with shared memory I
- Basic shared memory architecture
- Differences between processes and threads
- Accessing shared data (see the mutex sketch after this week's outline)
- Shared data in systems with caches
- The cache coherence problem
LAB 10: MPI V: Collective communication
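A minimal shared-data sketch with POSIX threads: several threads increment a shared counter, and a mutex serializes the updates so the result is deterministic. The thread and iteration counts are arbitrary.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define NITERS   100000

    static long counter = 0;                          /* shared data           */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < NITERS; i++) {
            pthread_mutex_lock(&lock);                /* protect the update    */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid[NTHREADS];

        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&tid[i], NULL, worker, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tid[i], NULL);

        /* without the mutex this total would vary from run to run */
        printf("counter = %ld (expected %d)\n", counter, NTHREADS * NITERS);
        return 0;
    }

Threads share the process address space, so counter is visible to all of them; separate processes would need explicit shared-memory segments or message passing to achieve the same thing.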
Week 12: Programming with shared memory II
- Introduction to cache coherency
- Snoopy cache coherence (see the false-sharing sketch after this week's outline)
- Directory-based cache coherence
- Distributed directory-based cache coherence
LAB 11: Intermediate Compiler, debugger, profiler and analyzer tools
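Cache coherence is enforced in hardware, but its cost is visible from software. The sketch below illustrates false sharing: threads update adjacent counters that land on the same cache line, so the coherence protocol bounces the line between cores; padding each counter to its own line (64 bytes is an assumed line size) removes the effect. The volatile qualifier only keeps the stores in the loop for the demonstration.

    #include <omp.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define NITERS   10000000L

    /* unpadded: all counters share a cache line or two -> false sharing */
    static volatile long shared_counters[NTHREADS];

    /* padded: each counter sits on its own (assumed) 64-byte line */
    static struct { volatile long value; char pad[64 - sizeof(long)]; } padded[NTHREADS];

    int main(void)
    {
        double t0 = omp_get_wtime();
        #pragma omp parallel num_threads(NTHREADS)
        {
            int id = omp_get_thread_num();
            for (long i = 0; i < NITERS; i++)
                shared_counters[id]++;        /* coherence traffic between cores */
        }
        double t1 = omp_get_wtime();
        #pragma omp parallel num_threads(NTHREADS)
        {
            int id = omp_get_thread_num();
            for (long i = 0; i < NITERS; i++)
                padded[id].value++;           /* no line is shared between threads */
        }
        double t2 = omp_get_wtime();

        printf("unpadded: %.3f s   padded: %.3f s\n", t1 - t0, t2 - t1);
        return 0;
    }

The measured gap depends on the machine and the coherence protocol (snoopy or directory based), which ties this small experiment back to the lecture topics.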
Week 13: Programming with shared memory III
- Introduction to OpenMP
- Directives and variable sharing (see the parallel-for sketch after this week's outline)
- Programming samples
- Multi-core systems and programming
LAB 12: OpenMP I: Introduction
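A minimal OpenMP sketch for the directive and variable-sharing topics above: a parallel for with explicit shared and reduction clauses. The array size and the scaling factor are arbitrary.

    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    int main(void)
    {
        static double a[N];                 /* static to keep it off the stack   */
        double sum = 0.0;
        double scale = 2.0;                 /* read-only inside the loop         */

        /* the loop index is private by default; sum is combined via reduction */
        #pragma omp parallel for default(none) shared(a, scale) reduction(+:sum)
        for (int i = 0; i < N; i++) {
            a[i] = scale * (double)i;
            sum += a[i];
        }

        printf("max threads available: %d, sum = %.1f\n",
               omp_get_max_threads(), sum);
        return 0;
    }

Using default(none) forces every referenced variable to be classified explicitly, which is a useful habit when reasoning about data sharing.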
Week 14: Programming with shared memory IV
- Advanced OpenMP
- OpenMP and Machine Architecture
- Performance: Vectorization (see the simd/offload sketch after this week's outline)
- OpenMP for Heterogeneous Computing
LAB 13: OpenMP II: Introduction & Applications
LAB 14: MPI and OpenMP: Applications, Performance and Tuning
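A brief sketch touching the advanced topics above: omp simd asks the compiler to vectorize the axpy kernel on the host, and omp target offloads the same kernel to an accelerator if one is present (falling back to the host otherwise). Whether offload actually happens depends on compiler flags and available devices, so this is illustrative only.

    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    static float x[N], y[N];

    int main(void)
    {
        for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }
        const float a = 3.0f;

        /* host loop: request SIMD vectorization of the axpy kernel */
        #pragma omp parallel for simd
        for (int i = 0; i < N; i++)
            y[i] = a * x[i] + y[i];

        /* offload the same kernel to a device, mapping the arrays explicitly */
        #pragma omp target teams distribute parallel for map(to: x) map(tofrom: y)
        for (int i = 0; i < N; i++)
            y[i] = a * x[i] + y[i];

        printf("y[0] = %.1f, devices visible: %d\n", y[0], omp_get_num_devices());
        return 0;
    }

The same directive-based kernel can thus target SIMD units, multicore CPUs, and accelerators, which is the point of the heterogeneous-computing discussion in this final week.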
Course Coordinator
Mustafa Serdar Çelebi
Course Language
English