Sunday, February 2, 2014

First Normal Form (1NF)

First normal form, how to normalize an unnormalized table into 1NF table, problem with non-atomic attributes, convert an UNF relation into 1NF relation


Prev - Normalization                                                                     Next - Functional Dependency (FD) 

First Normal Form (1NF) with example

Normal Forms

The idea is to organize the attributes in any tables to have reduced redundancy and dependency. In the process of Normalization we have the normal forms, First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form (BCNF). Here 1NF, 2NF, and 3NF were defined by Edgar F.Codd and BCNF was defined by Raymond F.Boyce along with Codd. There are other normal forms like 4NF, 5NF, DKNF, and 6NF. Here, we are about to discuss first four normal forms.

In the normalization process of a table, we need to check whether a table satisfies the conditions of every normal form. If yes, we could say the given table is in that normal form. That is, if a table satisfies the conditions of 1NF, then we say that the table is in 1NF, and so on.

First Normal Form (1NF)


In Relational Database Management System (RDBMS), 1NF is one of the properties a relation (table) must satisfy. 1NF requires the following properties to be satisfied.

  •     Any attribute (column) of a table must have been designed to accept only atomic values.
  • Any value stored in any column must have single value (no repeating groups).


Let us take the following relation (table) STUDENT1 to explain the property 1NF. In STUDENT1, PR is Program Registered, and CName_Regd is Name of the course registered by the student.


RegNo
SName
Gen
PR
CName_Regd
R1
Sundar
M
BTech
Database, Data Structures
R2
Ram
M
MS
Database
R3
Kathik
M
MCA
Data Structures, Multimedia
R4
John
M
BSc
Multimedia
Table 1 – STUDENT1


Rule 1 says that the columns can have atomic domains only. That is, the permitted/accepted values for the column should be indivisible. Here indivisible means that if we try to divide a value stored in the column in any possible way, it should not give a meaningful string. Let us take the table STUDENT1. In this table, the CName_Regd attribute is accepts more than one course if registered by single student. And the value is inserted using comma as the separator for every individual course. Student “Sundar”, registered for two courses, namely, “Database”, and “Data Structures”. And they are stored as a single value (string) “Database, Data Structures”. If we divide the value at the comma, we will get two meaningful names “Database”, and “Data Structures”. That is, both are valid course names. Hence, CName_Regd is divisible.

The domain of Gen (Gender) is indivisible, i.e., atomic. We cannot divide the value stored in the column further. RegNo values cannot be divided further; PR values cannot be divided further, and so on.

Rule 2 insists that the column must not have multiple values. In our example, CName_Regd column accepts set of courses (i.e., group of one or more values) registered by a student for every record. And, if more than one course is registered, the course names are stored as single value separated by a comma (or any separators as decided by the designer). Like said above, the value is divisible.
Our table STUDENT1 does not satisfy both the conditions. Hence, it is not in 1NF. (If any one of the rules were not satisfied, then also the table not in 1NF). To further normalize the table into 2NF, 3NF, and so on, we need to convert this design into 1NF.

Solution to convert into 1NF:


To convert non-1NF design into 1NF, “unnest” (i.e., flatten the relation) the given relation.

In STUDENT1, CName_Regd is the only attribute which violates 1NF. So, it can be unnested. That is, for every individual course of any student, repeat the content of all the other column values for that student as given in the table STUDENT_1NF below.


RegNo
SName
Gen
PR
CName_Regd
R1
Sundar
M
BTech
Database
R1
Sundar
M
BTech
Data Structures
R2
Ram
M
MS
Database
R3
Kathik
M
MCA
Data Structures
R3
Kathik
M
MCA
Multimedia
R4
John
M
BSc
Multimedia
Table 2 – STUDENT_1NF


Note that, the course “Data Structures” has two substrings. If we try to divide “Data Structures” further like “Data” and “Structures”, this division, even though give meaningful strings “Data” and “Structures”, these are not courses offered.
Hence, this kind of division does not mean ‘divisible attribute’.


What would happen if any attribute is divisible?

Let us table Table1 to explain this. For queries which require list of courses for a given register number is straight forward. That is, we get the list of course names just by using the register number in the WHERE clause of the query. But, when the requirement is, “Find the list of student register numbers whoever registered for the course Database”, it is a complex scanning of the attribute for the result. Because, it involves parsing of every value present in the CName_Regd column and it is not simple.

When we normalize the table, both the requirements can be handled easily without any string analysis.

For other simple definition on First Normal Form (1NF), click here.

Prev - Normalization                                                                     Next - Functional Dependency (FD)


*********

Related Links




First normal form 1NF in DBMS with example



 

2 comments:

Featured Content

Multiple choice questions in Natural Language Processing Home

MCQ in Natural Language Processing, Quiz questions with answers in NLP, Top interview questions in NLP with answers Multiple Choice Que...

All time most popular contents