Pipelined Parallelism and Independent Parallelism / Types of Interoperation Parallelism / What is Pipelined Parallelism? / What is Independent Parallelism?
Interoperation Parallelism
Interoperation parallelism executes the different operations of a single query in parallel. A query often involves several operations, and running them concurrently can improve the performance of such queries. Consider the following example query:
SELECT AVG(Salary) FROM Employee GROUP BY Dept_Id;
It involves two operations: grouping and aggregation. To execute this query, we first group all the employee records on the attribute Dept_Id. Then, for every group, we apply the AVG aggregate function to get the final result. We can use interoperation parallelism to run these two operations in parallel.
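The two operations of the example query can be sketched in Python as below. This is only an illustration, not how a database engine implements GROUP BY; the function name avg_salary_by_dept and the sample rows are assumptions introduced here.

```python
from collections import defaultdict


def avg_salary_by_dept(employees):
    """Group (dept_id, salary) rows by dept_id, then average each group.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    # Operation 1: grouping on Dept_Id
    for dept_id, salary in employees:
        groups[dept_id].append(salary)
    # Operation 2: aggregation (AVG) over every group
    return {dept_id: sum(s) / len(s) for dept_id, s in groups.items()}


# Made-up sample rows: (Dept_Id, Salary)
employees = [(10, 4000), (20, 5000), (10, 6000), (20, 7000)]
print(avg_salary_by_dept(employees))  # {10: 5000.0, 20: 6000.0}
```

Here the two operations run one after the other on a single processor; interoperation parallelism would place them on different processors.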
[Note: Intraoperation parallelism, by contrast, executes a single operation of a query on multiple processors in parallel.]
Interoperation parallelism can be achieved in the following two ways:
1. Pipelined Parallelism
2. Independent Parallelism
1. Pipelined Parallelism
In pipelined parallelism, the output of one operation is consumed by the next operation in the pipeline as soon as it is produced. For example, consider the following expression:
r1 ⋈ r2 ⋈ r3 ⋈ r4
The above expression is a natural join of four tables. It can be pipelined as follows:
Processor P1 computes temp1 ← r1 ⋈ r2 and sends the tuples of temp1 to processor P2. P2 computes temp2 ← temp1 ⋈ r3 and sends the tuples of temp2 to processor P3. P3 computes result ← temp2 ⋈ r4.
The advantage is that the intermediate results need not be stored; the tuples produced at one processor are consumed directly by the next. Hence, P2, P3, and ultimately the user start receiving tuples well before P1 completes the join assigned to it.
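A minimal sketch of this pipeline is given below, assuming each relation is a small list of Python dicts and each processor is modelled by a thread connected to its neighbours by queues. The names merge, join_stage, pipelined_join and the sample relations are hypothetical. In CPython the threads only stand in for processors; they do not give true CPU parallelism.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a tuple stream


def merge(a, b):
    """Natural-join two tuples (dicts); return None if shared attributes differ."""
    if all(a[k] == b[k] for k in a.keys() & b.keys()):
        return {**a, **b}
    return None


def join_stage(inbox, relation, outbox):
    """One pipeline stage: join every incoming tuple with a locally held relation."""
    while (t := inbox.get()) is not DONE:
        for r in relation:
            joined = merge(t, r)
            if joined is not None:
                outbox.put(joined)  # forwarded at once, never stored as temp1/temp2
    outbox.put(DONE)


def pipelined_join(r1, r2, r3, r4):
    """Compute r1 join r2 join r3 join r4 with three stages standing in for P1, P2, P3."""
    q = [queue.Queue() for _ in range(4)]
    stages = [
        threading.Thread(target=join_stage, args=(q[i], rel, q[i + 1]))
        for i, rel in enumerate((r2, r3, r4))
    ]
    for s in stages:
        s.start()
    for t in r1:          # stream r1 into the first stage
        q[0].put(t)
    q[0].put(DONE)
    result = []
    while (t := q[3].get()) is not DONE:
        result.append(t)
    for s in stages:
        s.join()
    return result


# Made-up sample relations
r1 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
r2 = [{"b": 10, "c": 100}, {"b": 20, "c": 200}]
r3 = [{"c": 100, "d": 7}]
r4 = [{"d": 7, "e": "x"}]
print(pipelined_join(r1, r2, r3, r4))
```

Each stage starts working on the first tuple it receives, so later stages are busy while earlier ones are still producing output.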
Disadvantages:
1. Pipelined parallelism does not scale well. A pipeline is only as long as the chain of operations in the query, so it cannot provide a high degree of parallelism.
2. Consequently, it is useful mainly with a small number of processors.
3. Not all operations can be pipelined. For example, consider the query given in the first section. The grouping must be complete for at least one department before the aggregate operation at the next processor can produce output for that department.
4. Full speedup cannot be expected, because the pipeline runs only as fast as its slowest operation.
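The third point can be illustrated with a small sketch. It assumes the rows reach the aggregation step already sorted by Dept_Id, so that a department's average can be emitted as soon as that department's group is complete. The name stream_avg and the sample rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def stream_avg(sorted_rows):
    """Yield (dept_id, avg_salary) once each department's group is complete.

    Assumes rows arrive sorted by dept_id; nothing can be emitted for a
    department until all of its rows have been read.
    """
    for dept_id, group in groupby(sorted_rows, key=itemgetter(0)):
        salaries = [salary for _, salary in group]
        yield dept_id, sum(salaries) / len(salaries)


# Made-up sample rows: (Dept_Id, Salary), sorted on Dept_Id first
rows = sorted([(10, 4000), (20, 5000), (10, 6000), (20, 7000)], key=itemgetter(0))
for dept_id, avg in stream_avg(rows):
    print(dept_id, avg)
```

If the input were not sorted, the aggregation could not emit any average until it had read every row, so it would not pipeline at all.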
2. Independent Parallelism
Operations that do not depend on each other can be executed in parallel on different processors. This is called independent parallelism.
For example, in the expression r1 ⋈ r2 ⋈ r3 ⋈ r4, the portion r1 ⋈ r2 can be computed on one processor while r3 ⋈ r4 is computed on another. Both results can then be pipelined into a third processor, which joins them to produce the final result.
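The example can be sketched as follows, assuming each relation is a small list of Python dicts and using two worker threads to stand in for the two processors. The names natural_join and independent_join and the sample relations are hypothetical. For simplicity, this sketch waits for both partial results in full before the final join rather than pipelining them.

```python
from concurrent.futures import ThreadPoolExecutor


def natural_join(left, right):
    """Nested-loop natural join of two lists of dict-tuples."""
    return [
        {**a, **b}
        for a in left
        for b in right
        if all(a[k] == b[k] for k in a.keys() & b.keys())
    ]


def independent_join(r1, r2, r3, r4):
    """Run r1 join r2 and r3 join r4 independently, then join the two results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(natural_join, r1, r2)    # first "processor"
        right = pool.submit(natural_join, r3, r4)   # second "processor"
        # Final join, done here by the caller in place of a third processor
        return natural_join(left.result(), right.result())


# Made-up sample relations
r1 = [{"a": 1, "b": 10}, {"a": 2, "b": 20}]
r2 = [{"b": 10, "c": 100}, {"b": 20, "c": 200}]
r3 = [{"c": 100, "d": 7}]
r4 = [{"d": 7, "e": "x"}]
print(independent_join(r1, r2, r3, r4))
```

The two submitted joins share no input and no output, which is what makes them independent of each other.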
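Disadvantages:
Independent parallelism does not provide a high degree of parallelism either, since a query usually contains only a few operations that are independent of each other. It is therefore less useful in a highly parallel system.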