Tuesday, February 11, 2014

Super Key, Candidate Key, and Primary Key

Define Super key, Candidate key, Primary key and Foreign key / Type of keys in database / Purpose of Super key, Candidate key, Primary key and Foreign key in database design / Super key, Candidate key, Primary key and Foreign key Examples / How does a Foreign key ensure consistency?

Super key, Candidate key, and Primary key - Definitions and Discussions


Emp_ID
Emp_Name
DOB
Gender
Dept_No
E101
Ramkumar
15-JUL-1986
M
2
E103
Ramesh
04-MAY-1989
M
1
E104
Stephen
29-OCT-1987
M
1
E102
Nirmal
23-JAN-1980
M
3
E105
Laxmi
20-MAY-1988
F
4
E107
Rani
23-JAN-1980
F
4
E106
Ramesh
12-MAR-1979
M
2


Table 1 - Employee

Keys

 

The ability to locate data uniquely is one of the main properties which must be included while designing any databases. That is, using any single or set of attribute values one must be able to locate data in the database uniquely. The attribute(s) is called as key for a relation (table).
To retrieve unique(single) record (entity) from table Employee, we can write the following SQL query;
SELECT * FROM employee WHERE emp_id = ‘E102’;
This query can get you information about employee ‘Nirmal’, because he is the only person with the Emp_ID value ‘E102’. And, if you look at the sample data stored, it is very evident that no two employees have same Emp_ID. Hence, Emp_ID attribute (column) is one of the keys for the table Employee.

Types of Keys

1. Superkey


Set of one or more attributes together can uniquely identify rest of the attributes of a relation uniquely is called Superkey.

In our example, Emp_ID can uniquely identify any records in Employee. Likewise, any of the combinations, (Emp_ID, Emp_Name), (Emp_ID, DOB), (Emp_ID, Gender), (Emp_ID, Dept_No), (Emp_Name, DOB), (Emp_Name, DOB, Gender), and so on [you form the other combinations which can uniquely identify records] can also identify the records uniquely.

For any table, we have at least one super key which is the combination of all the attributes of the table. This is trivial superkey. In RDBMS, no two records are allowed to be the same. For example, at least the complete single record (E101, Ramkumar,         15-JUL-1986, M, 2) is different from (E105, Laxmi, 20-MAY-1988, F, 4). Hence, we have at least one super key.

Note: Redundant attributes are permitted in Super Keys. In (Emp_ID, Emp_Name), Emp_Name is unnecessary and redundant.

2. Candidate Key

Minimal Super key is a key which is a super key without any redundant attribute (unnecessary combination). In other words, if you remove any attribute from a key combination, it should not be able to uniquely identify data. That is we need a minimal combination of attributes. For example, (Emp_ID, Emp_Name) is a super key with an unnecessary attribute Emp_Name. Here, without Emp_Name attribute, only Emp_ID attribute can uniquely identify records. Hence, Emp_ID is itself a minimal super key.

Candidate key is such a minimal super key with another condition as follows;
Minimal super key with no component in its proper subset is a super key.
[Proper subset – if A is subset of B and A is not equal to B then A is the proper subset of B]

For example, {(Emp_ID), (Emp_Name)} is proper subset of set (Emp_ID, Emp_Name). In this, (Emp_ID) is itself a super key. Hence, (Emp_ID, Emp_Name) is not a candidate key but a super key.
As another example, (Emp_Name, DOB) is a super key and it is minimal. That is, the proper subset {(Emp_Name), (DOB)} does not contain any super keys. So, (Emp_Name, DOB) is a candidate key.

For our table Employee’s instance (information stored at a particular moment in a table is called instance), we have the following candidate keys (please check them);

(Emp_ID)
(Emp_Name, DOB)
(Emp_Name, Dept_No)
(DOB, Dept_No)
(DOB, Gender)

All the listed candidate keys are minimal. If you divide them further, they lose the property of a key. That is, for example, Dept_No alone cannot uniquely identify.



Why do we need to identify set of candidate keys?


They are helping in normalizing a table. That is, they are helping in eliminating unwanted anomalies. More on Boyce Codd Normal Form (BCNF).

3. Primary Key


Any relation (table) in Relational Database Management Systems (RDBMS) must have a primary key, which is one of the candidate keys which are minimal. That is, we choose one key among the list of candidate keys as primary key for a table. While choosing the primary key we are looking for simple candidate key among all the candidate keys. For example, in the above list of candidate keys, we choose (Emp_ID) as primary key because it is simple and single attribute key.

What is the purpose of Primary Key?


As I said earlier, it helps us in identifying data uniquely in the database. Also, it helps in normalizing your table and helps in performance boosting especially when you access your data using primary key.



4. Foreign Key


Foreign key is an attribute (or set of attributes) of one table (relation) which refers its value from another table’s Primary key (preferably) to establish a connection between them. In other words, it can be stated as “It is a constraint which helps in maintaining consistency on an attribute (or set of attributes) of two tables”.

Example of Foreign Key


Consider the following two relational schemas (Primary keys underlined).

STUDENT (Regno, SName, Program, DOJ, DOB)
STU_PHONE (Regno, Phone)

Here, Regno is the key for STUDENT and (Regno, Phone) is the key for STU_PHONE. Construction of separate table for storing Phone numbers is due to the intent to store multiple phone numbers for single student. That is, this design enables us to store zero or more phone numbers. It is clear that, if we would like to store any phone numbers, the condition is that the phone number must belong to some student. In other words, if you like to enter any phone number, then there must be a student registered for any program. Based on this condition, we designate Regno of STU_PHONE as the Foreign Key which refers the value of any existing STUDENT records’ Regno.

How does a Foreign Key ensure consistency?


Foreign key shows Parent-Child relationship. That is, there must be a parent (tuples in STUDENT) record for every child record (tuples in STU_PHONE). Hence, it acts as a condition.

And, it ensures consistency through cross verifying the values entered in the child table against the values stored in the parent table, thereby gives security from mistyping some values.

In its implementation of Foreign Key, if anybody entering a record in STU_PHONE, the Regno column value will be verified with the Regno column of STUDENT table for the existence of such values.

For example, if you like to enter a record (‘12MKV001’, 9897909411) in STU_PHONE, then you must have a record in STUDENT with Regno value ‘12MKV001’

Conclusion:


A table can have at most one Primary key.

Every Primary key is one of the candidate keys.
Every Primary key is a super key. Not all the super keys are primary key.
All the candidate keys except Primary key are called Alternate keys.
Primary key which consists of more than one attribute is called Composite key.
In any table, only one Primary key is permitted.


Foreign keys need not always refer to primary keys. it needs some attributes which are UNIQUE.

Featured Content

Multiple choice questions in Natural Language Processing Home

MCQ in Natural Language Processing, Quiz questions with answers in NLP, Top interview questions in NLP with answers Multiple Choice Que...

All time most popular contents