Query Processing in DBMS / Steps involved in Query Processing in DBMS / How is a query gets processed in a Database Management System? / Query Processing overview / Database Query Processing
Query Processing
Query Processing would mean the
entire process or activity which involves query translation into low level
instructions, query optimization to save resources, cost estimation or
evaluation of query, and extraction of data from the database.
Goal: To find an efficient Query
Execution Plan for a given SQL query which would minimize the cost
considerably, especially time.
Cost Factors: Disk accesses
[which typically consumes time], read/write operations [which typically needs
resources such as memory/RAM].
The major steps involved in query
processing are depicted in the figure below;
Figure 1 - Steps in Database Query Processing |
Let us discuss the whole process
with an example. Let us consider the following two relations as the example
tables for our discussion;
Employee(Eno,
Ename, Phone)
Proj_Assigned(Eno,
Proj_No, Role, DOP)
where,
Eno is
Employee number,
Ename is
Employee name,
Proj_No is
Project Number in which an employee is assigned,
Role is the
role of an employee in a project,
DOP is
duration of the project in months.
With this information, let us
write a query to find the list of all employees who are working in a project
which is more than 10 months old.
SELECT
Ename
FROM
Employee, Proj_Assigned
WHERE
Employee.Eno = Proj_Assigned.Eno AND DOP > 10;
Input:
A query written in SQL is given
as input to the query processor. For our case, let us consider the SQL query
written above.
Step 1: Parsing
In this step, the parser of the
query processor module checks the syntax of the query, the user’s privileges to
execute the query, the table names and attribute names, etc. The correct table
names, attribute names and the privilege of the users can be taken from the
system catalog (data dictionary).
Step 2: Translation
If we have written a valid query,
then it is converted from high level language SQL to low level instruction in
Relational Algebra.
For example, our SQL query can be
converted into a Relational Algebra equivalent as follows;
πEname(σDOP>10
Λ Employee.Eno=Proj_Assigned.Eno(Employee X Prof_Assigned))
Step 3: Optimizer
Optimizer uses the statistical
data stored as part of data dictionary. The statistical data are information
about the size of the table, the length of records, the indexes created on the
table, etc. Optimizer also checks for the conditions and conditional attributes
which are parts of the query.
Step 4: Execution Plan
A query can be expressed in many
ways. The query processor module, at this stage, using the information
collected in step 3 to find different relational algebra expressions that are equivalent
and return the result of the one which we have written already.
For our example, the query
written in Relational algebra can also be written as the one given below;
πEname(Employee
⋈Eno (σDOP>10 (Prof_Assigned)))
So far, we have got two execution
plans. Only condition is that both plans should give the same result.
Step 5: Evaluation
Though we got many execution
plans constructed through statistical data, though they return same result
(obvious), they differ in terms of Time consumption to execute the query, or
the Space required executing the query. Hence, it is mandatory choose one plan
which obviously consumes less cost.
At this stage, we choose one
execution plan of the several we have developed. This Execution plan accesses
data from the database to give the final result.
In our example, the second plan
may be good. In the first plan, we join two relations (costly operation) then
apply the condition (conditions are considered as filters) on the joined
relation. This consumes more time as well as space.
In the second plan, we filter one
of the tables (Proj_Assigned) and the result is joined with the Employee table.
This join may need to compare less number of records. Hence, the second plan is
the best (with the information known, not always).
Output:
The final result is shown to the
user.
The overall information discussed above are depicted in Figure 2 below;
Figure 2 - Query Processing [Note: in Step 4, NJ means Natural Join] |
Simple & straight. Thank you for article
ReplyDeletethis a nice article
DeleteThanks
ReplyDeleteThanks a lot
ReplyDeleteThank you so much I got a 2/10 in a cat and a ten in my exam...kudos
ReplyDeleteGood answer
ReplyDeletegreat article.
ReplyDeleteinfix to postfix conversion
best one keep going
ReplyDeleteGood
ReplyDeleteGood
ReplyDeleteReally helpful
ReplyDelete