Leveraging ChatGPT - GenAI as a Microsoft Data Expert

Speaker

Prerita Agarwal

Data Specialist @

23 Jul, 2024 @ 01:30 PM

Introduction

We have already learned about the Second Normal Form in great detail. The next step in database normalization is 3NF. Two conditions for a table to be in the third normal form are:

First, be in the second normal form(2NF).

Second, it does not contain any transitive dependency, which we will learn about in detail in this article.

The First Normal Form eliminates repeating groups, and the Second Normal Form reduces redundancy by eliminating the partial dependency. The Third Normal Form reduces data duplication to the least while achieving data integrity. First, let us learn about the terms used while studying the Third Normal Form.

Key terms required to understand 3NF

Superkey: A super key is an attribute or set that uniquely identifies data items in a table. Superkey attributes can contain null values too, which is not the case for a primary key.

Prime-attributes: We could say that the primary key is a minimal super key as it is used to uniquely identify attributes and any of the minimal candidate keys can be chosen as a primary key.

Trivial functional dependency: A dependent is a subset of the determinant. Eg: X â†’ Y, meaning that Y is a subset of X.

Non-trivial functional dependency: Here, the dependent is strictly not a subset of the determinant. This means that if X â†’ Y, and Y isnâ€™t a subset of X, then it will be a non-trivial functional dependency. We will take a table of Employeeswith the attributes empID, empName and experience to illustrate non-trivial functional dependency with an example:

empID

empName

experience

107

Emmy

10

219

Robert

7

302

Jeffry

14

167

Angelina

21

The dependent here is empName and the determinant is empID. In the above relation, empIDâ†’ empName will have a non-trivial functional dependency because the dependent empName is strictly not a subset of the determinant empID. In the same way, {empID, empName} â†’ experience is also a non-trivial functional dependency, since experience is not a subset of {empID, empName}.

Transitive functional dependency: When a dependent is indirectly dependent on the determinant, it is a transitive functional dependency. This means that if a â†’ b and b â†’ c, then a â†’ c.

We primarily need to learn about transitive dependency for the Third Normal Form. Let us discuss it in detail with an example:

Get the tech career you deserve, faster!

Connect with our expert counsellors to understand how to hack your way to success

User rating 4.7/5

1:1 doubt support

95% placement record

Akash Pal

Senior Software Engineer

326% Hike After Job Bootcamp

Himanshu Gusain

Programmer Analyst

32 LPA After Job Bootcamp

After Job Bootcamp

Understanding Transitive Dependency

When in a relation, a non-prime attribute is dependent on another non-prime attribute instead of the primary key, the relation has a transitive dependency. Let us try to understand this better with an illustrative example.

Take an example for a relation containing examination data with the columns seatNo, name, totalMarks and pass_fail:

seatNo

name

totalMarks

pass_fail

107

Emmy

83

pass

219

Robert

30

fail

302

Jeffry

46

pass

167

Angelina

79

pass

For the above table:

The candidate key would be seatlNo.

All other attributes in the table are non-prime attributes.

Super key: {seatNo}, {seatNo, name}, {seatNo, name, totalMarks} and so on.

We can see that pass_fail depends on totalMarks which depends on seatNo. For the former case, totalMarks is a non-prime attribute that pass_fail depends on. This means that pass_fail transitively depends on seatNo.

This violates the third normal form. We will need to follow a few steps to prevent transitive dependency. Before, let's discuss how to Identify a table whether it is in 3NF or not:

Identifying tables in Third Normal Form (3NF)

For a table to be in 3NF, they must have

1. No partial dependency and

2. No transitive dependency.

A simpler method to confirm the third normal form would be either ensuring that for Aâ†’B, either A is super key or B is a prime attribute.

Let us assume thereâ€™s a relation R with the attributes A, B, C, D, E.

There are three dependencies in R(A, B, C, D, E) that are Aâ†’B, Bâ†’E and Câ†’D.

The first step towards identifying a relation in the third normal form is identifying the candidate keys. The attributes not dependent on any columns will be considered as the candidate key as a whole and these attributes are A and C.

A can help us determine B, C helps determine D and B determines E. therefore, (AC)=ABCDE Owing to this, we can confirm that AC is our candidate key that helps identify all other non-prime attributes.

Once we identify the candidate keys, we need to see if the table is in 3NF. In a relation where the functional dependency Aâ†’B exists, either A must be a super key or B must be a prime attribute for the table to be in the third normal form.

Letâ€™s consider the first dependency Aâ†’B. Here, neither A is super key, nor is B a prime attribute. Therefore, this relation is definitely not in the third normal form.

Considering another example with the table given below, our primary key is (CustID, Item) because all other transitive dependencies would be eliminated as they help identify all non-prime attributes in the table. The relation must have no transitive dependency, meaning there must be only one primary key that could help determine all the other attributes in the table.

In the following example, all non-primary columns are dependent on the primary key.

Let us look at our very first example now, the highlighted columns are primary keys:

seatNo

name

totalMarks

pass_fail

107

Emmy

83

pass

219

Robert

30

fail

302

Jeffry

46

pass

167

Angelina

79

pass

To convert the above-shown table to Third Normal Form, we will need to rearrange data in separate tables to remove the Transitive Dependency: So, we will first create a different table for seatNo, name and totalMarks where the seatNo will be the primary key, whereas, in a second table, we will have pass_fail with totalMarks as the primary key.

seatNo

name

totalMarks

03

metropolis

Brooke

12

suburb

Spectre

07

country

Paul

02

metro

Danny

totalMarks

pass_fail

83

pass

30

fail

46

pass

79

pass

We can now link the tables suitably and ensure that our relation has no transitive dependency, meaning we have finally accomplished the final step of normalization-the third normal form.

Therefore, the third normal form would require decomposing if the table was previously in 1NF or 2NF. Tables in the third normal form eliminate almost all redundancy in a table. Tables in 3NF are free of anomalies in insertion, deletion or updation. Following data integrity, relations also provide the most flexibility and efficiency possible within a relation in the DBMS.

Frequently Asked Questions

What is meant by DBMS normalization? Normalization in databases refers to organising data and involves creating relations and establishing links between them, and keeping in mind efficiency, protection and flexibility in the relations.

Why is the Third Normal Form the best normalization? Almost every database designer tries to reach 3NF in database normalization because relations in the third normal form have the least redundancy, the most flexibility and efficiency while keeping the data integrity and ensuring the data is protected.

What is meant by transitive dependency? When a non-prime attribute depends on another non-prime attribute instead of the primary key, or in other words when a dependent is indirectly dependent on the determinant, the relation is said to have a transitive functional dependency. We can write this as follows: If a â†’ b and b â†’ c, then a â†’ c.

What are the conditions for 3NF? For a table or a relation to be in the third normal form, it must first be in the second normal form and must not have any transitive dependencies.

Key takeaways

In this article, we learned about the third normal form and how it is implemented to make tables or relations ideally efficient. We also used examples to understand the concept better and learn to use our knowledge of the same. Keeping in mind the understanding of 3NF, we will further study the Boyce-Codd normal form.