Monday 7 June 2010

The basics of parallel computing


Basics of parallelism:
To understand parallelism, first you must understand serial execution. A serial process starts at the beginning and follows its instructions, one after another, until the end. A parallel process breaks up the instructions and spreads them across multiple processing units, and thereby speeds up execution at a cost: the cost of the extra equipment and the parallel overhead (the extra control functions that manage the process). A parallel process will often (most of the time) still include serial sections of instructions.

A simple way of understanding a serial process is to think of a librarian putting away 100 books that have just been returned to the library. If the time it takes to shelve one book is called one unit of time, then it will take one librarian 100 units of time to shelve all the books.

To turn this into a parallel process, we add 9 additional librarians and split the 100 books between the 10 librarians: each librarian is required to shelve 10 books, taking 10 units of time each (concurrently, so only 10 units of real time pass). But there is some overhead: the additional salaries, the time it takes to split the books between the librarians, and the time it takes for the librarians to report that the task is done. So in practice it takes a little more than 10 units of time for the 10 librarians to perform the same task as the single librarian.
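To make the librarian example concrete, here is a minimal sketch in Python; the shelve_book function and the 0.05-second "unit of time" are invented for illustration. The measured speed up will come out somewhat less than 10x, because starting the worker processes and splitting up the work is exactly the parallel overhead described above.

    from multiprocessing import Pool
    import time

    def shelve_book(book):
        # Pretend shelving one book takes one "unit of time".
        time.sleep(0.05)
        return book

    if __name__ == "__main__":
        books = list(range(100))

        # One librarian: shelve all 100 books, one after another.
        start = time.perf_counter()
        for book in books:
            shelve_book(book)
        serial = time.perf_counter() - start

        # Ten librarians: a pool of 10 workers splits the 100 books.
        start = time.perf_counter()
        with Pool(processes=10) as librarians:
            librarians.map(shelve_book, books)
        parallel = time.perf_counter() - start

        print(f"serial: {serial:.2f}s, parallel: {parallel:.2f}s")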

What is speed up and how do we measure it?
Speed up is the factor by which a process is sped up by making it parallel. The equation is fairly simple: (serial time)/(parallel time).

So for our librarian problem the serial time is 100 units and the parallel time is 10 units (ignoring parallel overhead):

100/10 = 10, so the speed up from adding 9 librarians is 10.
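The same arithmetic as a tiny Python function (the function name is mine, not a standard API):

    def speedup(serial_time, parallel_time):
        # Speed up = serial time / parallel time.
        return serial_time / parallel_time

    print(speedup(100, 10))  # 10.0 for the 10-librarian case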

Are there limits to what the speed up can be?
Yes. There are hardware limits on how far a process can be spread, and there is a point beyond which adding more parallel capability is redundant.

If we increase the number of librarians to 100 we still see a speed up, as the total time taken by the librarians drops to 1 unit:

100/1 = 100 speed up

However, if we have 1000 librarians we will see no additional speed up, as the time taken will still be 1 unit:

100/1 = 100 speed up

So in this instance the additional 900 librarians will be sitting idle. However, an additional 900 books could be added to the processing queue without the process taking any additional time.
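A small sketch of this plateau, assuming each book takes one unit of time and the finish time is set by the busiest librarian (the parallel_time function is my own, for illustration):

    import math

    def parallel_time(n_books, n_librarians):
        # The process finishes when the busiest librarian does.
        return math.ceil(n_books / n_librarians)

    for n in (1, 10, 100, 1000):
        t = parallel_time(100, n)
        print(f"{n:4d} librarians: {t:3d} units, speed up {100 / t:.0f}")

Past 100 librarians the time stays at 1 unit, so the speed up is stuck at 100 no matter how many more librarians are added.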

What kinds of parallel computing are there?
There are 2 variables, each with 2 values, to consider according to Flynn's Classical Taxonomy:
Number of instruction streams: single or multiple
Number of data streams: single or multiple

So the four classifications are:
Single Instruction, Single Data (SISD): In this scenario a single stream of instructions is processed on a single stream of data. This is an ordinary serial (non-parallel) computer.

Single Instruction, Multiple Data (SIMD): This scenario has the same set of instructions working on more than one set of data at the same time.

Multiple Instruction, Single Data (MISD): This scenario has multiple sets of instructions working on a single data set.

Multiple Instruction, Multiple Data (MIMD): This scenario has multiple processors executing their own instructions on different data sets. This is the most likely form of parallel processing for this project.
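A rough illustration of the two classifications that matter most in practice, using NumPy's vectorised arithmetic as a stand-in for SIMD and a process pool as a stand-in for MIMD (the data and the double function are invented for illustration):

    import numpy as np
    from multiprocessing import Pool

    def double(x):
        return x * 2

    if __name__ == "__main__":
        # SIMD-style: one instruction (multiply) applied to many data
        # elements at once.
        data = np.arange(8)
        print(data * 2)  # [ 0  2  4  6  8 10 12 14]

        # MIMD-style: independent worker processes, each running its own
        # instruction stream on its own piece of the data.
        with Pool(processes=4) as pool:
            print(pool.map(double, range(8)))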

Important terms:
Task: A series of instructions

Parallel Task: A series of instructions that are safe to run in parallel.

Serial Execution: A set of instructions that must be run one after another.

Parallel Execution: A set of instructions that can be run on more than one processor at a time.

Pipelining: The process by which a set of instructions is broken up into discrete stages so that the stages can be processed in parallel (see the sketch after these terms).

Shared Memory: Memory shared by all parallel processes.

Symmetric Multiprocessing (SMP): A form of parallelism in which multiple processors share the same address space and memory and work on the same problem at the same time (using shared memory).

Distributed Memory: Memory spread across a network. A process must use the network to see any non-local memory.

Communications: Typically a process will have to communicate with other processes; this is also known as Inter Process Communication (IPC).

Synchronization: The process by which some threads are delayed in order for others to catch up (at the end of a process or for communication).

Granularity: The ratio of computation to communication: how much processing happens between communication events.

Scalability: How the process responds to additional processor resources.
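To make a few of these terms concrete, here is a minimal two-stage pipeline in Python; the stage functions and items are invented for illustration. Each stage runs in parallel with the other, the queues carry the inter-thread communication, and the None sentinel plus the final joins provide the synchronization:

    import threading
    import queue

    def stage1(inbox, outbox):
        # First pipeline stage: "sort" each book.
        while True:
            item = inbox.get()
            if item is None:           # sentinel: no more work is coming
                outbox.put(None)
                break
            outbox.put(f"book {item}")

    def stage2(inbox):
        # Second pipeline stage: "shelve" each sorted book.
        while True:
            item = inbox.get()
            if item is None:
                break
            print(f"shelved {item}")

    q1, q2 = queue.Queue(), queue.Queue()
    t1 = threading.Thread(target=stage1, args=(q1, q2))
    t2 = threading.Thread(target=stage2, args=(q2,))
    t1.start(); t2.start()

    for book in range(5):
        q1.put(book)                   # communication: hand work to stage 1
    q1.put(None)                       # signal that no more books are coming

    t1.join(); t2.join()               # synchronization: wait for both stages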
