Memory required for external sorting in DBMS
Question:
Let us suppose that you have a file
with 150000 pages and 45 available buffer pages. If you use external sorting to
sort the above said file, answer the following questions;
a) How many runs will you produce in
the first pass?
b) To sort the entire file, how many
passes do you need?
c) How much is involved in terms of I/O
cost to sort the file?
d) If you like to sort the file in
just two passes, then what is the number of buffer pages you need?
Solution:
a) If Nf is the number of file pages
and Nb is the number of available buffer pages then the first pass requires ⌈Nf/Nb⌉ [ceiling(Nf/Nb)]
sorted runs.
For our case,
Number of file pages, Nf = 150000
Number of available buffer pages, Nb
= 45
The number of sorted runs in the first
pass = ⌈150000/45⌉
= 3334 runs
b) The number of passes required to
sort the entire file can be calculated using the equation ⌈logNb-1⌈Nf/Nb⌉⌉+1.
Number of passes to sort complete file
= ⌈log2⌈150000/45⌉⌉+1
=
⌈log23334⌉+1 = 13 passes
c)
Since each page is read and written once per pass, the total number of page
I/Os for sorting the file is 2 * Nf * (#passes):
Total
I/O cost for sorting the file = 2 * 150000
* 13 = 39,00,000
d) In Pass 0, ⌈Nf/Nb⌉ runs
are produced. In Pass 1, we must be able to merge this many runs; i.e., Nb − 1 ≥ ⌈Nf/Nb⌉.
This implies that Nb must
at least be large enough to satisfy Nb
* (Nb − 1) ≥ Nf; this can be used to
guess at Nb, and the
guess must be validated by checking the first inequality. Thus;
With 150000 pages in the file, Nb = 388 satisfies both
inequalities, but Nb = 387
does not. So we need 388 buffer pages to sort the entire file in just two
passes.
[Note:
to find the value of Nb, factorize the equation Nb * (Nb-1)
= Nf which can be written as a quadratic equation Nb2-Nb-Nf
= 0. The equation to be solved is Nb2 - Nb -
8500 = 0]
*********
Go to Solved Exercises in DBMS page
buffer pages required for external sorting
number of passes required to complete sorting of a file using external sorting
external sorting technique in database
number of runs required for each pass in external sorting
I/O cost involved in sorting the entire file
No comments:
Post a Comment