Table of contents
1.
Introduction
2.
Types of Anomalies
3.
Normalisation techniques
3.1.
1NF 
3.1.1.
Example of 1NF: 
3.2.
2NF
3.2.1.
Example for 2NF:
3.3.
3NF
3.3.1.
Example of 3NF:
3.4.
BCNF
3.4.1.
Example of BCNF:
4.
FAQs
5.
Key Takeaways
Last Updated: Mar 27, 2024

Example of Normalisation Conversions

Author Ankit Kumar
0 upvote
Career growth poll
Do you think IIT Guwahati certified course can help you in your career?

Introduction

Redundancy is a big issue in DBMS. It causes a lot of problems; one of those is Data anomalies. Data redundancy is a phenomenon that occurs inside a database or data storage system when the identical piece of data is stored in two different locations. This can refer to two distinct fields inside a single database or two distinct locations across numerous software environments or platforms. For example, When a name and address are both contained in distinct columns inside a database, If the relationship between these data points is specified in each and every new database entry, it will result in unnecessary duplication. And hence Data Anomalies. To counter this big issue, Normalisation was created to reduce the negative effects of creating tables that will introduce anomalies into the database.

Normalisation is the process of structuring data in a database in order to eliminate data redundancy, insertion anomalies, update anomalies, and deletion anomalies. Before moving forward, what do anomalies mean. Theoretically, an anomaly is something unexpected, an abnormality. Due to anomalies, the overall integrity of the data recorded in the database will deteriorate over time. These can happen naturally in DBMS and lead to unwanted results. Let's explore different types of anomalies and then go on to normal forms using examples. 

Types of Anomalies

We are now moving to the types of anomalies. There are three types of anomalies Update, Insertion, Deletion. Let's understand different types of anomalies with the following Table.

Student_ID Student_Name Student_Address Student_Grades Student_Courses
001 Aditya Hajipur  A Biology
002 Shahid Kushinagar A+ Biology
003 Abhay Hathwa B Biology
004 Ankit Hajipur A+ C.Sc
004 Ankit Chandigarh A+ C. Sc
005 Uttkarsh Dartbhanga   C.Sc
  1. Update Anomalies - Assume we have five columns in a table, two of which are student Name and student Address. Now, whenever one of the students changes locations, we must update the Table. For example, Ankit moves from Hajipur to Chandigarh, and the entry wasn't made. However, if the database is not normalised, one student may have many entries, and one of them may be missed while updating all of those entries.
     
  2. Insertion Anomalies - Assume we have a table with five columns. Student_ID, Student_Name, Student_Address, Student_Grades and Student_Courses are all required. When a new student enrols in school, the first three properties and last column can be filled in, but the fourth attribute will have a NULL value because he has no marks yet since Uttkarsh has joined the school but is yet to give a test, so his grades are bound to be empty. But Student_Grades does not allow NULL values.
     
  3. Deletion Anomalies - This anomaly suggests that crucial information was removed from the Table that was not necessary. Assume we have the following information about students and the courses they have taken: (ID, Name, Course, address). If a student departs the school, the entry for that student will be removed. However, that deletion will also erase the course information, despite the fact that the course is determined by the institution, not the student.

So, care to think about how Normalisation works here to solve the anomalies. Well, it tries to break tables into numerous tables and uses keys to describe relationships between them. In the next section, we will discuss normalisation techniques.

Normalisation techniques

Following are the techniques for Normalisation.

1NF 

The first normal form indicates that each cell in the Table must contain just one value. As a result, every intersection of rows and columns must include atomic values. 

Example of 1NF: 

Suppose a school hostel wants to store the names and contact details of its students. It creates a table that is following:

Student_Roll Student_Name Student_Address Student_Mobile
001 Aditya Hajipur  9431444444
002 Shahid Kushinagar

9944442222

9977332222

003 Abhay Hathwa 8823234411
004 Ankit Hajipur

8945456767

9123234545

005 Uttkarsh Darbhanga 8969690000

We can see that two students have two mobile numbers, so the hostel staff saved that detail. But now we have a problem as 1NF says, "each attribute of a table must have atomic (single) values". So, here Student_Mobile violates that rule. Now, the Table that follows 1NF will be like the following Table.

Student_ID Student_Name Student_Address Student_Mobile
001 Aditya Hajipur 9431444444
002 Shahid Kushinagar 9944442222
002 Shahid Kushinagar 9977332222
003 Abhay Hathwa 8823234411
004 Ankit Hajipur 8945456767
004 Ankit Hajipur 9123234545
005 Uttkarsh Darbhanga 8969690000

2NF

We discussed candidate key before, and now this is where it comes into play. The 2NF rule indicates that none of the Table's non-prime characteristics is reliant on any of the candidate keys. In other words, it must follow two conditions. i.e., Table is in 1NF, and No non-prime attribute is dependent on the proper subset of any table candidate key. An attribute that is not part of any candidate key is known as a non-prime attribute.

Example for 2NF:

In the school hostel, teachers' data. So after the Table is following 1NF

Teachers_ID Teahcers_Subjects Teachers_Age
1001 English 38
1002 Biolog 39
1003 Hindi 45
1003 Sanskrit 45
1004 Maths 25
1005 Civics 35
1005 History 35

Because each attribute has atomic values, the Table is in 1 NF. However, it is not in 2NF since the non-primary attribute Teachers_Age is reliant on Teachers_ID, which is a valid subset of the candidate key. Together Teachers_ID and Teachers_Subjects are forming a candidate key for this table. Which in turn can be the primary key for the table.

As the rule states, no non-prime attribute is dependent on the proper subset of any candidate key of the Table. This is a violation of the 2NF rule. So the Table that will follow 2NF will look like this.

To comply with 2NF, we can divide the table into two tables as follows:

Teachers_ID Teachers_Age
1001 38
1002 39
1003 45
1004 25
1005 35

Subject Table:

Teachers_ID Teahcers_Subjects
1001 English
1002 Biolog
1003 Hindi
1003 Sanskrit
1004 Maths
1005 Civics
1005 History

Now we move to the next type, 3NF

3NF

This rule specifies that tables must be in the 2NF format and that each table should only include columns that are not transitively dependent on their own table's primary key. The attribute unrelated to the candidate key is a non-prime attribute and vice-versa. 

A table is in 3NF if it is in 2NF, and for each functional dependency, X-> Y, at least one of the following conditions hold:

  • X is the super key of the Table
  • Y is a prime attribute of the Table

Example of 3NF:

The hostel details of students contain a ZIP code along with the address. The Table will have to follow the 2NF rule as the rule for 3NF states.

Student_ID Student_Name Student_Address Student_Vill Student_ZIP
001 Aditya Hajipur  Heerapur 820002
002 Shahid Kushinagar Tamkuhiraj 274000
003 Abhay Hathwa Hathwa 841435
004 Ankit Hajipur Babupur 823202
005 Uttkarsh Darbhanga Silvarat 847222

Super keys: {Student_ID}, {Student_ID, Student_Name}, {Student_ID, Student_Name, Student_ZIP}…so on

Candidate Keys: {Student_ID}

Non-prime attributes: all attributes except Student_ID are non-prime as they are not part of any candidate keys.

Here, Student_Address and Student_Vill are dependent on Student_ZIP. And, Student_ZIP is dependent on Student_ID that makes non-prime attributes (Student_Address and Student_Vil) transitively dependent on super key (Student_ID). This violates the rule of 3NF.

So, now to make this Table in 3NF, we remove the transitive dependency.

Student table:

Student_ID Student_Name Student_ZIP
001 Aditya 820002
002 Shahid 274000
003 Abhay 841435
004 Ankit 823202
005 Uttkarsh 847222

Stud_ZIP table:

Student_ZIP Student_Address Student_Vill
820002 Hajipur  Heerapur
274000 Kushinagar Tamkuhiraj
841435 Hathwa Hathwa
823202 Hajipur Babupur
847222 Darbhanga Silvarat

Now we will see the last one, BCNF.

BCNF

Since it is a more advanced variant of 3NF, it is sometimes known as 3.5NF. BCNF is more no-nonsense than 3NF. A table conforms with BCNF if it is in 3NF, and X is the Table's super key for every functional dependence X->Y.

Example of BCNF:

In the hostel, students play more than one sport, so now in the Table:

Student_ID Student_Game Sports_Type Student_Address
001 Football O1 Hajipur
002 Football O1 Kushinagar
002 Basketball O2 Kushinagar
003 Basketball O2 Hathwa
003 Badminton I1 Hathwa
004 Badminton I1 Hajipur
005 Football O1 Darbhanga
005 Table Tennis I2 Darbhanga

Functional dependencies in the table above: Student_ID -> Student_Address

Student_Game -> Sports_Type

Candidate key: {Student_Name, Student_Game}

The Table is not in BCNF as neither Student_ID nor Student_Game alone are keys.

To make the Table comply with BCNF, we can break the Table into three tables like this:

Stud_Add table:

Student_ID Student_Address
001 Hajipur 
002 Kushinagar
003 Hathwa
004 Hajipur
005 Darbhanga

Stud_Sport Table:

Student_ID Student_Game
001 Football
002 Football
002 Basketball
003 Basketball
003 Badminton
004 Badminton
005 Football
005 Table Tennis

Sports table:

Student_Game Sports_Type
Football O1
Basketball O2
Badminton I1
Table Tennis I2

Functional dependencies:

Student_ID -> Student_Address

Student_Game -> Sports_Type

Candidate keys:

For the first Table: Student_ID

For the second table: {Student_ID, Student_Sports}

For the third Table: Sports_Type

This is now in BCNF as in both the functional dependencies left side part is a key.

So, with this, we move to FAQs.

FAQs

  1. What is Normalisation?
    Normalisation is the process of structuring data in a database in order to eliminate data redundancy.
     
  2. What is Primary Key?
    Each entry in a table is uniquely identified by the Primary key. This value cannot be duplicated within a table and cannot include null values.
     
  3. How many types of Anomalies are there, and what is their name?
    There are three types of Anomalies. The three anomalies are as follows:
    Update Anomalies
    Insertion Anomalies
    Deletion Anomalies
     
  4. What is BCNF also called?
    BCNF is also called 3.5NF. Since it is a more advanced variant of 3NF, it is sometimes known as 3.5NF. A table conforms with BCNF if it is in 3NF, and X is the Table's super key for every functional dependence X->Y.
     
  5. What is Data redundancy?
    Data redundancy is a phenomenon that occurs inside a database or data storage system when the identical piece of data is stored in two different locations. This can refer to two distinct fields inside a single database or two distinct locations across numerous software environments or platforms.

Key Takeaways

In this article, we have extensively discussed Normalisation conversions. We started with the need for Normalisation, and then we learned about the types of anomalies. Well, after knowing about anomalies, we learnt about techniques of Normalisation. There are three types of Normalisation 1NF, 2NF, 3NF and BCNF, respectively. We also saw examples of them and how the tables are separated. In the last section, we learnt about the FAQs. 

Also, check out - Anomalies In DBMS.

We hope that this blog has helped you enhance your knowledge regarding Normalisation conversions and if you would like to learn more, check out our articles on Top 100 SQL Problems and if you would like to learn more, check out our articles on Coding Ninjas Studio. Do upvote our blog to help other ninjas grow. Happy Coding!

Live masterclass