Do you think IIT Guwahati certified course can help you in your career?
No
Introduction
We have already learned about the Second Normal Form in great detail. The next step in database normalization is 3NF. Two conditions for a table to be in the third normal form are:
First, be in the second normal form(2NF).
Second, it does not contain any transitive dependency, which we will learn about in detail in this article.
The First Normal Form eliminates repeating groups, and the Second Normal Form reduces redundancy by eliminating the partial dependency. The Third Normal Form reduces data duplication to the least while achieving data integrity. First, let us learn about the terms used while studying the Third Normal Form.
Key terms required to understand 3NF
Superkey: A super key is an attribute or set that uniquely identifies data items in a table. Superkey attributes can contain null values too, which is not the case for a primary key.
Prime-attributes: We could say that the primary key is a minimal super key as it is used to uniquely identify attributes and any of the minimal candidate keys can be chosen as a primary key.
Trivial functional dependency: A dependent is a subset of the determinant. Eg: X → Y, meaning that Y is a subset of X.
Non-trivial functional dependency: Here, the dependent is strictly not a subset of the determinant. This means that if X → Y, and Y isn’t a subset of X, then it will be a non-trivial functional dependency. We will take a table of Employeeswith the attributes empID, empName and experience to illustrate non-trivial functional dependency with an example:
empID
empName
experience
107
Emmy
10
219
Robert
7
302
Jeffry
14
167
Angelina
21
The dependent here is empName and the determinant is empID. In the above relation, empID→ empName will have a non-trivial functional dependency because the dependent empName is strictly not a subset of the determinant empID. In the same way, {empID, empName} → experience is also a non-trivial functional dependency, since experience is not a subset of {empID, empName}.
Transitive functional dependency: When a dependent is indirectly dependent on the determinant, it is a transitive functional dependency. This means that if a → b and b → c, then a → c.
We primarily need to learn about transitive dependency for the Third Normal Form. Let us discuss it in detail with an example:
Understanding Transitive Dependency
When in a relation, a non-prime attribute is dependent on another non-prime attribute instead of the primary key, the relation has a transitive dependency. Let us try to understand this better with an illustrative example.
Take an example for a relation containing examination data with the columns seatNo, name, totalMarks and pass_fail:
seatNo
name
totalMarks
pass_fail
107
Emmy
83
pass
219
Robert
30
fail
302
Jeffry
46
pass
167
Angelina
79
pass
For the above table:
The candidate key would be seatlNo.
All other attributes in the table are non-prime attributes.
Super key: {seatNo}, {seatNo, name}, {seatNo, name, totalMarks} and so on.
We can see that pass_fail depends on totalMarks which depends on seatNo. For the former case, totalMarks is a non-prime attribute that pass_fail depends on. This means that pass_fail transitively depends on seatNo.
This violates the third normal form. We will need to follow a few steps to prevent transitive dependency. Before, let's discuss how to Identify a table whether it is in 3NF or not:
Identifying tables in Third Normal Form (3NF)
For a table to be in 3NF, they must have
1. No partial dependency and
2. No transitive dependency.
A simpler method to confirm the third normal form would be either ensuring that for A→B, either A is super key or B is a prime attribute.
Let us assume there’s a relation R with the attributes A, B, C, D, E.
There are three dependencies in R(A, B, C, D, E) that are A→B, B→E and C→D.
The first step towards identifying a relation in the third normal form is identifying the candidate keys. The attributes not dependent on any columns will be considered as the candidate key as a whole and these attributes are A and C.
A can help us determine B, C helps determine D and B determines E. therefore, (AC)=ABCDE Owing to this, we can confirm that AC is our candidate key that helps identify all other non-prime attributes.
Once we identify the candidate keys, we need to see if the table is in 3NF. In a relation where the functional dependency A→B exists, either A must be a super key or B must be a prime attribute for the table to be in the third normal form.
Let’s consider the first dependency A→B. Here, neither A is super key, nor is B a prime attribute. Therefore, this relation is definitely not in the third normal form.
Considering another example with the table given below, our primary key is (CustID, Item) because all other transitive dependencies would be eliminated as they help identify all non-prime attributes in the table. The relation must have no transitive dependency, meaning there must be only one primary key that could help determine all the other attributes in the table.
In the following example, all non-primary columns are dependent on the primary key.
Let us look at our very first example now, the highlighted columns are primary keys:
seatNo
name
totalMarks
pass_fail
107
Emmy
83
pass
219
Robert
30
fail
302
Jeffry
46
pass
167
Angelina
79
pass
To convert the above-shown table to Third Normal Form, we will need to rearrange data in separate tables to remove the Transitive Dependency: So, we will first create a different table for seatNo, name and totalMarks where the seatNo will be the primary key, whereas, in a second table, we will have pass_fail with totalMarks as the primary key.
seatNo
name
totalMarks
03
metropolis
Brooke
12
suburb
Spectre
07
country
Paul
02
metro
Danny
totalMarks
pass_fail
83
pass
30
fail
46
pass
79
pass
We can now link the tables suitably and ensure that our relation has no transitive dependency, meaning we have finally accomplished the final step of normalization-the third normal form.
Therefore, the third normal form would require decomposing if the table was previously in 1NF or 2NF. Tables in the third normal form eliminate almost all redundancy in a table. Tables in 3NF are free of anomalies in insertion, deletion or updation. Following data integrity, relations also provide the most flexibility and efficiency possible within a relation in the DBMS.
Frequently Asked Questions
What is meant by DBMS normalization? Normalization in databases refers to organising data and involves creating relations and establishing links between them, and keeping in mind efficiency, protection and flexibility in the relations.
Why is the Third Normal Form the best normalization? Almost every database designer tries to reach 3NF in database normalization because relations in the third normal form have the least redundancy, the most flexibility and efficiency while keeping the data integrity and ensuring the data is protected.
What is meant by transitive dependency? When a non-prime attribute depends on another non-prime attribute instead of the primary key, or in other words when a dependent is indirectly dependent on the determinant, the relation is said to have a transitive functional dependency. We can write this as follows: If a → b and b → c, then a → c.
What are the conditions for 3NF? For a table or a relation to be in the third normal form, it must first be in the second normal form and must not have any transitive dependencies.
Key takeaways
In this article, we learned about the third normal form and how it is implemented to make tables or relations ideally efficient. We also used examples to understand the concept better and learn to use our knowledge of the same. Keeping in mind the understanding of 3NF, we will further study the Boyce-Codd normal form.