Contents

Slide 2


Copyright © 2010, Elsevier Inc. All rights Reserved

Roadmap

Why we need ever-increasing performance.
Why we’re building parallel systems.
Why we need to write parallel programs.
How do we write parallel programs?
What we’ll be doing.
Concurrent, parallel, distributed!


Slide 3


Changing times


From 1986 to 2002, microprocessors were speeding like a rocket, increasing in performance by an average of 50% per year.
Since then, the increase has dropped to about 20% per year.
Slide 4


An intelligent solution


Instead of designing and building faster microprocessors, put multiple processors on a single integrated circuit.
Slide 5


Now it’s up to the programmers

Adding more processors doesn’t help much if programmers aren’t aware of them…
… or don’t know how to use them.
Serial programs don’t benefit from this approach (in most cases).


Slide 6


Why we need ever-increasing performance

Computational power is increasing, but so are our computation problems and needs.
Problems we never dreamed of have been solved because of past increases, such as decoding the human genome.
More complex problems are still waiting to be solved.


Slide 7


Climate modeling


Slide 8


Protein folding


Slide 9


Drug discovery


Slide 10


Energy research


Slide 11


Data analysis


Slide 12


Why we’re building parallel systems

Up to now, performance increases have been attributable to increasing density of transistors.
But there are inherent problems.


Slide 13


A little physics lesson

Smaller transistors = faster processors.
Faster processors = increased power consumption.
Increased power consumption = increased heat.
Increased heat = unreliable processors.


Slide 14


Solution

Move away from single-core systems to multicore processors.
“core” = central processing unit (CPU)


Introducing parallelism!!!

Slide 15


Why we need to write parallel programs

Running multiple instances of a serial program often isn’t very useful.
Think of running multiple instances of your favorite game.
What you really want is for it to run faster.


Slide 16


Approaches to the serial problem

Rewrite serial programs so that they’re parallel.
Write translation programs that automatically convert serial programs into parallel programs.
This is very difficult to do.
Success has been limited.


Slide 17


More problems

Some coding constructs can be recognized by an automatic program generator, and converted to a parallel construct.
However, it’s likely that the result will be a very inefficient program.
Sometimes the best parallel solution is to step back and devise an entirely new algorithm.


Slide 18


Example

Compute n values and add them together.
Serial solution:
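The serial code itself appears only as a figure in the original slides. The minimal C sketch below shows the idea; Compute_next_value is left as a stub returning small pseudo-random integers, since the slides never say what it actually computes, and n = 24 is chosen here only to match the later example.

    #include <stdio.h>
    #include <stdlib.h>

    /* Stand-in for the slides' Compute_next_value(...); the real
       computation is unspecified, so a stub is used here. */
    int Compute_next_value(void) {
        return rand() % 10;
    }

    int main(void) {
        int n = 24;        /* number of values to compute and add */
        int sum = 0;

        for (int i = 0; i < n; i++) {
            int x = Compute_next_value();
            sum += x;
        }

        printf("sum = %d\n", sum);
        return 0;
    }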

Slide 19


Example (cont.)

We have p cores, p much smaller than n.
Each core performs a partial sum of approximately n/p values.


Each core uses its own private variables
and executes this block of code independently of the other cores.
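The per-core block is likewise shown only as a figure in the slides. The sketch below is one way it might look in C, reusing Compute_next_value from the serial sketch above; the index arithmetic (my_first_i, my_last_i) is an assumption for the simple case where p divides n evenly, and my_rank is assumed to identify the core.

    /* Sketch: the block each core executes independently.
       my_rank is this core's id (0, 1, ..., p-1); assumes p divides n. */
    int partial_sum(int my_rank, int p, int n) {
        int my_n       = n / p;              /* values handled by this core */
        int my_first_i = my_rank * my_n;     /* first index for this core   */
        int my_last_i  = my_first_i + my_n;  /* one past this core's range  */
        int my_sum     = 0;

        for (int my_i = my_first_i; my_i < my_last_i; my_i++) {
            int my_x = Compute_next_value();
            my_sum += my_x;
        }
        return my_sum;
    }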

Slide 20


Example (cont.)

After each core completes execution of the code, its private variable my_sum contains the sum of the values computed by its calls to Compute_next_value.
E.g., with 8 cores and n = 24, the calls to Compute_next_value return:


1,4,3, 9,2,8, 5,1,1, 6,2,7, 2,5,0, 4,1,8, 6,5,1, 2,3,9

Slide 21


Example (cont.)

Once all the cores are done computing their private my_sum, they form a global sum by sending their results to a designated “master” core, which adds them to produce the final result.
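As a rough illustration (not the slides’ own code), here is how this naive scheme might look using MPI, the message-passing library introduced later in the course; each core’s my_sum is faked with a placeholder value so the example stands alone.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char* argv[]) {
        int my_rank, p;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &p);

        int my_sum = my_rank + 1;   /* placeholder for this core's partial sum */

        if (my_rank == 0) {
            /* The master does p-1 receives and p-1 additions. */
            int sum = my_sum;
            for (int source = 1; source < p; source++) {
                int value;
                MPI_Recv(&value, 1, MPI_INT, source, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                sum += value;
            }
            printf("global sum = %d\n", sum);
        } else {
            MPI_Send(&my_sum, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }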


Slide 22


Example (cont.)


Slide 23


Example (cont.)


Global sum
8 + 19 + 7 + 15 + 7 + 13 + 12 + 14 = 95
Slide 24



But wait!
There’s a much better way to compute the global sum.
Slide 25


Better parallel algorithm

Don’t make the master core do all the work.
Share it among the other cores.
Pair the cores so that core 0 adds its result with core 1’s result.
Core 2 adds its result with core 3’s result, etc.
Work with odd and even numbered pairs of cores.


Slide 26


Better parallel algorithm (cont.)

Repeat the process now with only the evenly ranked cores.
Core 0 adds result from core 2.
Core 4 adds the result from core 6, etc.
Now cores divisible by 4 repeat the process, and so forth, until core 0 has the final result.
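A small plain-C sketch (not from the slides) of this tree-structured sum, simulated over the eight partial sums from the earlier example; in a real parallel program each iteration of the outer loop would involve sends and receives between the paired cores.

    #include <stdio.h>

    int main(void) {
        /* Partial sums of the 8 cores from the earlier slide. */
        int sums[8] = {8, 19, 7, 15, 7, 13, 12, 14};
        int p = 8;

        /* step = 1: cores 0,2,4,6 add the value from the core one position away;
           step = 2: cores 0 and 4 add; step = 4: core 0 adds; then done. */
        for (int step = 1; step < p; step *= 2) {
            for (int core = 0; core + step < p; core += 2 * step) {
                sums[core] += sums[core + step];
            }
        }

        printf("global sum = %d\n", sums[0]);   /* core 0 holds 95 */
        return 0;
    }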


Slide 27


Multiple cores forming a global sum

Slide 28


Analysis

In the first example, the master core performs 7 receives and 7 additions.
In the second example, the master core performs 3 receives and 3 additions.
The improvement is more than a factor of 2!


Slide 29


Analysis (cont.)

The difference is more dramatic with a larger number of cores.
If we have 1000 cores:
The first example would require the master to perform 999 receives and 999 additions.
The second example would only require 10 receives and 10 additions.
That’s an improvement of almost a factor of 100!
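In general (a count implied by the two schemes rather than stated on this slide): with p cores, the naive scheme makes the master perform p − 1 receives and p − 1 additions, while the tree-structured scheme needs only about ⌈log₂ p⌉ of each. For p = 8 that is 7 versus 3, and for p = 1000 it is 999 versus 10, matching the figures above.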


Slide 30


How do we write parallel programs?

Task parallelism
Partition the various tasks carried out in solving the problem among the cores.
Data parallelism
Partition the data used in solving the problem among the cores.
Each core carries out similar operations on its part of the data.


Slide 31


Professor P


15 questions
300 exams

Slide 32


Professor P’s grading assistants


TA#1

TA#2

TA#3

Slide 33


Division of work – data parallelism


TA#1

TA#2

TA#3

100 exams

100 exams

100 exams

Slide 34


Division of work – task parallelism


TA#1

TA#2

TA#3

Questions 1 - 5

Questions 6 - 10

Questions 11 - 15

Slide 35


Division of work – data parallelism

Slide 36


Division of work – task parallelism


Tasks
Receiving
Addition

Slide 37


Coordination

Cores usually need to coordinate their work.
Communication – one or more cores send their current partial sums to another core.
Load balancing – share the work evenly among the cores so that one is not heavily loaded.
Synchronization – because each core works at its own pace, make sure cores do not get too far ahead of the rest.


Slide 38


What we’ll be doing

Learning to write programs that are explicitly parallel.
Using the C language.
Using three different extensions to C.
Message-Passing Interface (MPI)
POSIX Threads (Pthreads)
OpenMP


Slide 39


Types of parallel systems

Shared-memory
The cores can share access to the computer’s memory.
Coordinate the cores by having them examine and update shared memory locations.
Distributed-memory
Each core has its own, private memory.
The cores must communicate explicitly by sending messages across a network.


Slide 40


Types of parallel systems


Shared-memory

Distributed-memory

Slide 41


Terminology

Concurrent computing – a program is one in which multiple tasks can be in progress at any instant.
Parallel computing – a program is one in which multiple tasks cooperate closely to solve a problem.
Distributed computing – a program may need to cooperate with other programs to solve a problem.


Slide 42


Concluding Remarks (1)

The laws of physics have brought us to the doorstep of multicore technology.
Serial programs typically don’t benefit from multiple cores.
Automatic parallel program generation from serial program code isn’t the most efficient approach to get high performance from multicore computers.
