First normal form, how to normalize an unnormalized table into 1NF table, problem with non-atomic attributes, convert an UNF relation into 1NF relation
First Normal Form (1NF) with example
Normal Forms
The idea is to organize the
attributes in any tables to have reduced redundancy and dependency. In the
process of Normalization we have the normal forms, First Normal Form (1NF),
Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form
(BCNF). Here 1NF, 2NF, and 3NF were defined by Edgar F.Codd and BCNF was
defined by Raymond F.Boyce along with Codd. There are other normal forms like
4NF, 5NF, DKNF, and 6NF. Here, we are about to discuss first four normal forms.
In the normalization process of a
table, we need to check whether a table satisfies the conditions of every
normal form. If yes, we could say the given table is in that normal form. That
is, if a table satisfies the conditions of 1NF, then we say that the table is
in 1NF, and so on.
First Normal Form (1NF)
In Relational Database Management
System (RDBMS), 1NF is one of the properties a relation (table) must satisfy.
1NF requires the following properties to be satisfied.
- Any attribute (column) of a table must have been designed to accept only atomic values.
- Any value stored in any column must have single value (no repeating groups).
Let us take the following
relation (table) STUDENT1 to explain the property 1NF. In STUDENT1, PR is
Program Registered, and CName_Regd is Name of the course registered by the
student.
RegNo
|
SName
|
Gen
|
PR
|
CName_Regd
|
R1
|
Sundar
|
M
|
BTech
|
Database, Data Structures
|
R2
|
Ram
|
M
|
MS
|
Database
|
R3
|
Kathik
|
M
|
MCA
|
Data Structures, Multimedia
|
R4
|
John
|
M
|
BSc
|
Multimedia
|
Table 1 – STUDENT1
Rule 1 says that the columns can have atomic
domains only. That is, the permitted/accepted values for the column should
be indivisible. Here indivisible means that if we try to divide a value stored
in the column in any possible way, it should not give a meaningful string. Let
us take the table STUDENT1. In this table, the CName_Regd attribute is accepts
more than one course if registered by single student. And the value is inserted
using comma as the separator for every individual course. Student “Sundar”, registered
for two courses, namely, “Database”, and “Data Structures”. And they are stored
as a single value (string) “Database, Data Structures”. If we divide the value
at the comma, we will get two meaningful names “Database”, and “Data Structures”.
That is, both are valid course names. Hence, CName_Regd is divisible.
The domain of Gen (Gender) is
indivisible, i.e., atomic. We cannot divide the value stored in the column
further. RegNo values cannot be divided further; PR values cannot be divided
further, and so on.
Rule 2 insists that the column must not have
multiple values. In our example, CName_Regd column accepts set of courses
(i.e., group of one or more values) registered by a student for every record.
And, if more than one course is registered, the course names are stored as
single value separated by a comma (or any separators as decided by the
designer). Like said above, the value is divisible.
Our table STUDENT1 does not
satisfy both the conditions. Hence, it is not in 1NF. (If any one of the rules were
not satisfied, then also the table not in 1NF). To further normalize the table
into 2NF, 3NF, and so on, we need to convert this design into 1NF.
Solution to convert into 1NF:
To convert non-1NF design into
1NF, “unnest” (i.e., flatten the relation) the given relation.
In STUDENT1, CName_Regd is the
only attribute which violates 1NF. So, it can be unnested. That is, for every
individual course of any student, repeat the content of all the other column values
for that student as given in the table STUDENT_1NF below.
RegNo
|
SName
|
Gen
|
PR
|
CName_Regd
|
R1
|
Sundar
|
M
|
BTech
|
Database
|
R1
|
Sundar
|
M
|
BTech
|
Data Structures
|
R2
|
Ram
|
M
|
MS
|
Database
|
R3
|
Kathik
|
M
|
MCA
|
Data Structures
|
R3
|
Kathik
|
M
|
MCA
|
Multimedia
|
R4
|
John
|
M
|
BSc
|
Multimedia
|
Table 2 – STUDENT_1NF
Note that, the course “Data Structures” has two substrings. If we try
to divide “Data Structures” further like “Data” and “Structures”, this
division, even though give meaningful strings “Data” and “Structures”, these
are not courses offered.
Hence, this kind of division does not mean ‘divisible attribute’.
Hence, this kind of division does not mean ‘divisible attribute’.
What would happen if any attribute is divisible?
Let us table Table1 to explain
this. For queries which require list of courses for a given register number is
straight forward. That is, we get the list of course names just by using the
register number in the WHERE clause of the query. But, when the requirement is,
“Find the list of student register
numbers whoever registered for the course Database”, it is a complex
scanning of the attribute for the result. Because, it involves parsing of every
value present in the CName_Regd column and it is not simple.
When we normalize the table, both
the requirements can be handled easily without any string analysis.
For other simple definition on First Normal Form (1NF), click here.
Prev - Normalization Next - Functional Dependency (FD)
*********
Related Links
- Go to Normal Form - Home page
- Go to Normalization - Home page
- Go to Normalization Solved Exercises - Home page
- Go to Comparison of All Normal Forms page
I really enjoyed this! Thanks, also see this one: first normal form example
ReplyDeleteThanks , Great Tutorials
ReplyDelete