Pattern Name: SPMD |
SupportingStructures Design space |
This document is under construction.
Some parallel algorithms can be effectively implemented in terms of a number of units of execution all executing the same program, each on its "own" data -- hence the name, "Single Program, Multiple Data". The SPMD pattern describes this situation.
As noted above, it is fairly common to structure parallel algorithms in terms of a number of UEs that are identical with respect to code but different with respect to data. This pattern describes how to set up such a structure in terms of the constructs of the underlying computation space.
Use the SPMD pattern when the situation described above holds. This pattern is particularly suited to problems in which the key decomposition is data decomposition.
A key issue in applying this pattern is partitioning the original problem's data into "common data" and N sets of "process data", but this issue must usually be resolved in the context in which the pattern is to be applied and so is not discussed here but rather in the problem-space patterns that use this pattern.
Another key issue in applying this pattern is specifying the way in which the processes interact (communication, synchronization, access to common data, etc.), but again this issue must usually be resolved in context.
The key elements of this pattern are:
Using a C-style functional notation, we could denote the "code to execute" part of this pattern as f(c, p), where c is the common data, and p is a set of process data. The pattern is then to consist of N processes executing f(c, p1), f(c, p2), ... f(c, pN). This pattern makes no restrictions on how the N processes can interact (making sure they interact in a way that accomplishes the desired end is a problem to be addressed at a higher level). The pattern as a whole is finished when all N of the processes finish.
This pattern is used exclusively to structure a parallel program at the top level; unlike ForkJoin, it does not occur as an element in a sequential composition.
For some parallel environments (e.g., some multi-address-space/message-passing environments) this pattern is the only way of structuring a parallel program. For others (e.g., a multiprocessing environment) it is one of several alternatives.
This pattern is obviously best suited to problems in which the key decomposition is a data decomposition, though it can be made to work for task decompositions as well (by structuring the code-to-execute-in-each-process as a big case-statement block).
It is also most suitable when the cost of process creation is large (and hence one wishes to avoid repeatedly creating and destroying processes).
To be added later.
To be added later.
This pattern has many uses, for example in implementations of the EmbarrassinglyParallel and GeometricDecomposition patterns.
To be added later.