Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
2.
Validating a Functional Dependency
3.
Types of Functional Dependencies
3.1.
Trivial Functional Dependency
3.2.
Non-trivial Functional Dependency
3.3.
Multivalued Functional Dependency
3.4.
Transitive Functional Dependency
4.
Advantages of Functional Dependencies
5.
Disadvantages of Functional Dependencies
6.
Functional Dependency Set
7.
Properties of Functional Dependencies
8.
Attribute Closure
8.1.
Finding Closure of an attribute set
8.2.
Finding Candidate Keys and Super Keys using the Attribute Closure
9.
Advantages of Attribute Closure
10.
Disadvantages of Attribute Closure
11.
Prime and Non-Prime Attributes
12.
Frequently Asked Questions
12.1.
What is closure in functional dependency?
12.2.
How to find the closure of f?
12.3.
How to take closure in DBMS?
12.4.
What is the difference between DBMS and RDBMS?
12.5.
What is a relation in RDBMS?
12.6.
What is the relationship between candidate and superkey?
13.
Conclusion
Last Updated: Jun 3, 2024
Easy

Functional Dependencies and Attribute Closure

Author Apoorv Dixit
1 upvote
Master Python: Predicting weather forecasts
Speaker
Ashwin Goyal
Product Manager @

Introduction

Functional dependency can be defined as the method that describes the relationship between the attributes in a given relation. It means we represent how the different attributes in a dataset table are related to each other with the help of functional dependencies. Suppose there is functional dependency A→B; it means 'A determines B' or 'B is determined by A .'Here 'A' is a determinant attribute, and 'B' is a determined attribute. We can also say that 'B' is dependent on 'A.' 

Functional Dependencies and Attribute Closure

Now let’s understand with the help of an example, suppose there is a functional dependency given STUDENT_ID→ STUDENT_NAME, as mentioned in the table below:

STUDENT_ID

STUDENT_NAME

1

David

2

David

Here in the above table, we see two students with the same name. How will we differentiate whether they are the same students or different ones? To uniquely identify this, we will refer to STUDENT_ID. Now we can clearly see that both have different IDs, which means both are different students, and each tuple dataset (all the other attributes) will be different. Here STUDENT_NAME is a determined attribute, and STUDENT_ID is a determinant attribute that solves confusions related to determinant attribute by uniquely identifying the tuple.

Validating a Functional Dependency

Now, let's see how functional dependencies can help to validate a data set with a few examples. It will help us to understand the concept more clearly.

Table 1

STUDENT_ID

STUDENT_NAME

1

Bob

2

David

In the above table, the functional dependency from STUDENT_ID → STUDENT_NAME is valid since both students have different names and ids.

Table 2

STUDENT_ID

STUDENT_NAME

1

Bob

1

Bob

In the above table, functional dependency STUDENT_ID → STUDENT_NAME is valid since both students are the same, and there is redundant data in the table.

Table 3

STUDENT_ID

STUDENT_NAME

1

Bob

2

Bob

In the above table, functional dependency STUDENT_ID → STUDENT_NAME is valid since both students are different, and STUDENT_ID helps identify it uniquely.

Table 4

STUDENT_ID

STUDENT_NAME

1

Bob

1

David

In the above table, functional dependency STUDENT_ID,→ STUDENT_NAME is not valid since both students are different and cannot have the same STUDENT_ID. 

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

Types of Functional Dependencies

They are four types of functional dependencies -

  1. Trivial Functional Dependency
  2. Non-trivial Functional Dependency
  3. Multivalued Functional Dependency
  4. Transitive Functional Dependency

 Now, let’s understand what these are: 

Trivial Functional Dependency

Let's assume there is functional dependency A→B. If it is a trivial functional dependency, it means that 'B is a subset of A.' It confirms that trivial dependencies are always valid because the attribute to be determined is the subset of the left-hand side attribute. Here reflexive relation holds. 

For example, STUDENT_ID→STUDENT_ID is a trivial functional dependency; it will always be valid, validity and invalidity of a functional dependency have already been discussed above.

There is another method to determine trivial functional dependency: if we take the intersection of left and right attributes, it will never be empty('Ⲫ'). 

For example (STUDENT_ID, STUDENT_NAME)—>STUDENT_NAME, LHS and RHS intersection is not null.

Non-trivial Functional Dependency

A non-trivial functional dependency A→B means that 'B is not a subset of A' and  A intersection B will be 'null' or 'Ⲫ.'

For example, a functional dependency STUDENT_ID→STUDENT_NAME is a non-trivial dependency since STUDENT_NAME is not a subset of STUDENT_ID and the intersection of both of these will be 'Ⲫ';

In non-trivial dependency cases of validity and invalidity arise, trivial ones are always valid.

Multivalued Functional Dependency

When two attributes in a table are independent of each other but both rely on a third attribute, this is referred to as multivalued dependency.

A multivalued dependency consists of at least two attributes that are dependent on a third attribute, which is why at least three attributes are always required.

Let’s see an example of the ‘Student’ table: 

STU_IDCOURSEPASSING_YEAR
1Science2020
2Science2021

Here, STU_ID can determine both COURSE and PASSING_YEAR. STU_ID→{ COURSE, PASSING_YEAR }, but there is no functional dependency between COURSE and PASSING_YEAR. Hence we can say that COURSE and PASSING_YEAR both are independent of each other which makes them a multivalued dependent on STU_ID. 

Transitive Functional Dependency

A functional dependency that is indirectly formed by two functional dependencies is called transitive functional dependency.

For example, if A→BB→C holds true, then according to the axiom of transitivity, A→C will also hold true.

Let’s see an example of a ‘Student’ table: 

STU_IDCLASSLECTURE_HALL
17L202
26B101

Here, with the STU_ID we can determine CLASS, and with CLASS, we can determine the LECTURE_HALL number for that particular class. It means with the help of STU_ID, we can determine LECTURE HALL. Therefore STU_ID→LECTURE_HALL holds true.

Advantages of Functional Dependencies

Functional dependencies in a relational database schema offer several benefits:

  • Data Integrity: Functional dependencies help enforce data integrity by defining constraints that ensure each attribute's values are determined by another attribute or combination of attributes. This prevents inconsistent or invalid data from being stored in the database.
  • Normalization: Functional dependencies aid in the normalization process, which involves organizing data into tables to reduce redundancy and dependency. By identifying and eliminating redundant data through normalization, databases become more efficient, easier to maintain, and less prone to anomalies.
  • Query Optimization: Understanding functional dependencies allows database systems to optimize query execution by recognizing redundant attributes that can be eliminated from queries. This optimization reduces the amount of data retrieval and processing required, leading to faster query execution times.
  • Data Consistency: By ensuring that attributes depend on specific combinations of other attributes, functional dependencies promote data consistency. This consistency helps maintain the accuracy and reliability of the database, ensuring that changes to one attribute do not inadvertently affect others.

Disadvantages of Functional Dependencies

  • Complexity: Managing and enforcing functional dependencies in large databases can be complex and time-consuming. As the number of attributes and dependencies increases, it becomes challenging to ensure that all dependencies are properly maintained and enforced.
  • Performance Overhead: Enforcing functional dependencies may introduce performance overhead, particularly during data modification operations such as insertion, update, and deletion. Maintaining referential integrity constraints and enforcing dependency rules can impact the performance of database transactions.
  • Dependency Maintenance: Functional dependencies need to be updated and maintained as the database schema evolves over time. This requires careful analysis and potentially modifying existing dependencies to accommodate changes in business requirements or data structures.
  • Potential for Over-Normalization: Over-reliance on functional dependencies for normalization can lead to over-normalization, where the database schema becomes excessively decomposed. This can result in increased join operations and decreased query performance, especially for complex queries involving multiple tables.

Functional Dependency Set

The functional Dependency set(or FD set) of a relation is the set of all functional dependencies present in the given relation. 

For example, in the table given below FD set will be

STU_IDSTU_NAMESTU_AGESTU_STATESTU_CONTRYSTU_NO

A valid FD set for the above relation can be:

{

  1. STU_ID→STU_NAME, 
  2. STU_ID→STU_AGE, 
  3. STU_ID→STU_NO, 
  4. STU_STATE→STU_COUNTRY

}

Here, STU_ID is a ‘determinant attribute’ that can determine and validate STU_NAME, STU_AGE, STU_NO. Similarly, STU_STATE→STU_COUNTRY is a functional dependency where STU_STATE can determine and validate STU_COUNTRY. So set of all these dependencies will form an FD set.

Properties of Functional Dependencies

Some essential properties of functional dependencies are:

1. Reflexive: if B is a subset of A, then A→B.

If X ⊇ Y then X → Y    // Reflexive property

For example {STU_ID, NAME} →NAME is valid reflexive relation.

2. Augmentation: if A→B then AC→BC for any C.  

For example {STU_ID, NAME} →{ DEPT_BUILDING} is valid then {STU_ID, NAME,DEPT_NAME} →{ DEPT_BUILDING,DEPT_NAME} is also valid. 

3. Transitive: if A→B and B→C, then A→C.

For example, if STU_ID→CLASS, CLASS→LECTURE_HALL holds true then according to the axiom of transitivity, STU_ID→LECTURE_HALL will also hold true. 

4. Union:  if A→B and A→C, then A→BC. 

For example, STU_ID → STU_NAME, STU_ID→COURSE then STU_ID→ {STU_NAME, COURSE}  holds true.  

5. Decomposition: if A→BC, then A→B, and A→C . 

For example STU_ID→ {STU_NAME, COURSE} then STU_ID → STU_NAME, STU_ID→COURSE  holds true. 

After understanding the functional dependencies and their types, let's know what an attribute closure is: 

Attribute Closure

Attribute Closure of an attribute set is defined as a set of all attributes that can be functionally determined from it. 

The closure of an attribute is represented as +

Finding Closure of an attribute set

You can follow the steps to find the Closure of an attribute set:

  1. Determine A+, the Closure of A under functional dependency set F. 
  2. A+: = will contain A itself; For example, if we need to find the closure of an attribute X, the closure will incorporate the X itself and the other attributes that the X attribute can determine. 
  3. Repeat the process as
  4. old A+: = A Closure;
  5. for each FB X → Y in the FD set, do
  6. if X Closure is a subset of X, then A Closure:= A Closure U Y;
  7. Repeat until ( A+= old A+);

For example,

Given a relation R(A,B,C,D) and FD { A→B, B→C, C→D}, then determine the A+

  1. A+=A, since A can determine A itself.
  2. A+=AB, A can also determine B, it is because in the FD set A→B dependency is given. 
  3. A+=ABC, A can also determine C with the help of B, since 
    A → B, B → C, thus, A → C // Transitive property 
  4.  A+=ABCD, A can also determine D with the help of the C attribute, since C is already determined now in the FD set functional dependency C→D holds, D can be determined with C.
    Therefore,
    A+(A closure) =ABCD. A can determine all attributes (ABCD).

Now let's see some questions for a clear understanding of the concept.

Question: In a schema with attributes A, B, C, D, E following set of functional dependencies are given

{ A→B, A→C, CD→E, B→D, E→A},

Which of the following dependencies is not implied by the above set? (GATE CS 2005).

  1. CD→AC
  2. BD→CD
  3. BC→CD
  4. AC→BC

Solution:

Checking option A, CD→AC,

  1. CD can determine C and D(can determine themselves) and E(as CD→E, given in FD set) . Therefore, (CD)+=CDE
  2. Now E can determine A (as E→A), (CD)+=CDEA
  3. With the help of A, we can determine B, (CD)+=CDEAB

So,(CD)+=ABCDE

We can clearly see that with CD, AC can be determined. Therefore, CD→AC holds true.

 

Checking for Option B, BD→CD,

On taking BD closure (BD)+=BD as they can determine themselves only, no other dependencies in the FD set.

We can see that BD→CD does not hold; hence B is the correct option.

Other options can be checked similarly by taking closure.

Another example: 

Question: The following functional dependencies are given:

 {AB→CD, AF→D, DE→F, C→G, F→E, G→A}

Which one of the following options is false? (GATE 2006)

  1. CF+ = {ACDEFG}                             
  2. BG+ = {ABCDG}
  3. AF+ = {ACDEFG}                      
  4. AB+ = {ABCDFG}

Solution:

If we take the attribute closure of option A, we will get, (CF)+= {ACDEFG}   

If we take the attribute closure of option B, we will get,(BD)+={ABCDG}

This can be done with the steps discussed above in the article.

But option C and D have attribute closure: (AF)+=(AFDE) and (AB)+=(ABCDG).

Therefore, options C and D are false.

Finding Candidate Keys and Super Keys using the Attribute Closure

A candidate key is the minimal set of attributes that can uniquely identify a tuple. For example, STU_ID in the 'Student' relation can be a candidate key. The candidate key should be unique and not null. 

A superkey is also a set of attributes that can uniquely identify a tuple in a given relation. The concept of the candidate key and the super key is closely related. Super key reduced to the minimum number of attributes makes the candidate key, which means the super key is the superset of a candidate key. In a given relation, ‘Student’ STU_ID and STU_NAME can be super key.

The attribute closure method can also be used to find all the candidate keys and super keys in the given relation. 

If the attribute closure of an attribute set contains all the attributes of the given relation, then the attribute set will be the superkey of the relation.

Similarly, if no subset of this attribute set can functionally determine all relation attributes, the present set will be a candidate key.

For example given a relation R(A,B,C,D) and FD { A→B, B→C, C→D,D→A}, then 

A+=BCDA,

B+=ACDB

C+=ACDB

D+=ACDB

Since all A, B, C, D can functionally determine all the attributes, so candidate key set will be {A, B, C, D}. With the help of this, we can also determine prime and non-prime attributes.

Prime attributes are the set of all attributes present in the candidate key, and non-prime attributes are the remaining attributes of the relation.

So here, Prime attributes are  {A, B, C, D} and there is no non-prime attribute(non-prime attribute set will be empty {Ⲫ}).

Superkeys can be formed by combining other attributes with the candidate key. Suppose AB can uniquely identify all the other attributes in the given, and A alone can do this, so AB be super key and A will be candidate key

GATE Question: Consider a relation scheme R = (A, B, C, D, E, H) on which the following functional dependencies hold: {A–>B, BC–>D, E–>C, D–>A}. What are the candidate keys of R?

  1.  AE, BE
  2. AE, BE, DE
  3. AEH, BEH, BCH
  4. AEH, BEH, DEH

Solution: A set of attributes S is a candidate key of relation R if the closure of S is all attributes of R and there is no subset of S whose closure is all attributes of R.

If we look closely, attributes E and H are not determined by any of the dependencies given in the FD set. Therefore they need to be present on the left-hand side. Only option D satisfies the given condition. Therefore, we can check for attribute closure.

(AEH)+={ABCDEH}

(BEH)+={ABCDEH}

(DEH)+={ABCDEH}

Option D is correct.

Advantages of Attribute Closure

  • Determining Functional Dependencies: Attribute closure is used to determine the functional dependencies present in a relation. By computing the closure of a set of attributes, we can identify all attributes that are functionally dependent on the given set, aiding in database normalization and integrity.
  • Database Design: Attribute closure helps in designing well-structured and normalized database schemas. By understanding the dependencies between attributes, database designers can optimize the schema design to minimize redundancy and ensure data integrity.
  • Query Optimization: Attribute closure provides valuable information for query optimization. By knowing which attributes are functionally dependent on others, query optimizers can generate efficient execution plans, leading to faster query processing times.
  • Data Integrity: Attribute closure helps maintain data integrity by ensuring that data dependencies are properly enforced. By identifying and enforcing functional dependencies, we can prevent data anomalies such as insertion, update, and deletion anomalies.

Disadvantages of Attribute Closure

  • Complexity: Computing attribute closure for large sets of attributes or complex dependencies can be computationally expensive and time-consuming. As the number of attributes and dependencies increases, the complexity of computing closures grows, potentially impacting performance.
  • Dependency Maintenance: Attribute closure needs to be updated and maintained as the database schema evolves over time. This requires careful analysis and potentially modifying existing closures to accommodate changes in business requirements or data structures.
  • Dependency Inference: In some cases, inferring all functional dependencies from a given set of attributes may result in an excessive number of dependencies, including trivial or redundant ones. Managing and enforcing such dependencies can lead to unnecessary complexity and overhead.
  • Over-Normalization: Over-reliance on attribute closure for normalization can lead to over-normalization, where the database schema becomes excessively decomposed. This can result in increased join operations and decreased query performance, especially for complex queries involving multiple tables.

Prime and Non-Prime Attributes

  • Prime Attributes: Prime attributes are attributes that are part of any candidate key for a relation. They play a crucial role in determining the minimal set of attributes required to uniquely identify each tuple in a relation. Prime attributes are essential for database normalization and maintaining data integrity.
  • Non-Prime Attributes: Non-prime attributes are attributes that are not part of any candidate key for a relation. They are functionally dependent on prime attributes and can be derived from them through attribute closure. Non-prime attributes contribute to the overall information content of the relation but are not involved in identifying tuples uniquely.

Frequently Asked Questions

What is closure in functional dependency?

Closure in functional dependency refers to the set of all attributes that can be functionally determined by a given set of attributes.

How to find the closure of f?

To find the closure of set of attributes 𝑓, compute the transitive closure by repeatedly applying functional dependencies until no new attributes are added.

How to take closure in DBMS?

In DBMS, to take closure of a set of attributes, apply the given functional dependencies to determine all attributes functionally dependent on the given set.

What is the difference between DBMS and RDBMS?

DBMS(Database Management System) is software that is used to maintain a database and RDBMS(Relational Database Management System) is an advanced version of DBMS. The major difference is that DBMS stores data as files, and RDBMS stores data in tabular form.

What is a relation in RDBMS?

A relation is a set of related attributes with identifying or key attributes. It refers to a tuple when referring to a single entity or row and a table when referring to the number of rows of the same relation. So basically, a relation is named, two-dimensional table of data.

What is the relationship between candidate and superkey?

A super key is a superset of a candidate key. It can be defined as a set of an attribute that can uniquely identify a tuple. 

Conclusion

In this article, we have learned about functional dependencies and their types and also seen different examples of validity and invalidity of functional dependencies. We have also understood what attribute closure is the method to find attribute closure and how this can be used to determine candidate key, superkeys, prime and non-prime attributes.

Recommended Readings:

Ninja, don't stop here; check out the Top 100 SQL Problems to get hands-on experience with frequently asked interview questions and land your dream job. You can visit Coding Ninjas Studio to practice programming problems for your complete interview preparation. You can also refer to the guided path to learn more about DBMS and SQL queries.

Next article
Transitive Dependency in DBMS
Live masterclass