Introduction
Data in a warehouse is usually multidimensional, with measure attributes and dimension attributes. Dimensional data modeling is a set of rules for designing database table structures so that data retrieval is easier and faster. It guides how data is stored in database tables and keeps the database understandable and straightforward. The model is designed to support end-user queries in the data warehouse.
A fact table contains the business process's measurements, metrics, or facts. It is located at the center of a star schema or a snowflake schema, whereas the dimension table stores all attributes or dimensions that describe the objects in a fact table.
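As a minimal sketch of this split, the following uses Python's built-in `sqlite3` module with an in-memory database. The table names (`fact_sales`, `dim_product`), their columns, and the sample rows are hypothetical and chosen only for illustration: the fact table holds the measures (`quantity`, `amount`), while the dimension table holds the descriptive attributes (`name`, `category`).

```python
import sqlite3


def build_sales_tables():
    """Create one dimension table and one fact table in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension table: descriptive attributes of a product (hypothetical columns).
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            category   TEXT NOT NULL
        );
        -- Fact table: one row per sale, holding measures plus a key into the dimension.
        -- REFERENCES documents the link; SQLite enforces it only when
        -- PRAGMA foreign_keys = ON is set, which this sketch does not do.
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            quantity   INTEGER NOT NULL,
            amount     REAL NOT NULL
        );
        INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Home');
        INSERT INTO fact_sales VALUES (1, 1, 10, 15.0), (2, 1, 4, 6.0), (3, 2, 1, 30.0);
    """)
    return conn


def amount_by_category(conn):
    """Sum the 'amount' measure, grouped by the 'category' dimension attribute."""
    return conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
```

Calling `amount_by_category(build_sales_tables())` aggregates a measure from the fact table along an attribute that lives only in the dimension table.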
Schemas in Dimensional Data Modeling
A database is made up of one or more tables, and the database schema is the set of relationships between all of the tables in the database. Dimensional data modeling is done on both a logical and physical level. Dimensional data modeling offers another layer to your data models, and it's compatible with a wide range of database management software.
We can get the following benefits from dimensional data modeling:
We can design queries that answer business-related inquiries. A query, in most cases, calculates some measure of performance in several business dimensions.
We can write these queries in SQL, which most relational database management systems (RDBMS) support.
The measurements that quantify the business are physically separated from the descriptive components that categorize the business in a dimensional schema.
A dimensional schema can be either physical or logical.
A physical dimensional schema is commonly shown as a star or snowflake schema, with the objects in the star or snowflake schema representing database tables.
The dimensional schema can even be represented as a single table or view, with all facts and dimensions included inside separate columns of that table or view.
The fact, measurements, and dimensions are represented as entities and attributes in a logical dimensional schema independent of the database vendor and may thus be transformed into a physical dimensional schema for any database vendor.
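Two of the points above, a query that calculates a measure across several business dimensions and a dimensional schema exposed as a single view, can be sketched together. This is an illustrative example using Python's `sqlite3` module; the names `fact_orders`, `dim_region`, `dim_month`, and `orders_flat`, along with the sample data, are assumptions made for the sketch.

```python
import sqlite3


def build_flat_view():
    """Create two dimensions, one fact table, and a view that flattens them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region TEXT NOT NULL);
        CREATE TABLE dim_month  (month_id  INTEGER PRIMARY KEY, month  TEXT NOT NULL);
        CREATE TABLE fact_orders (
            region_id INTEGER NOT NULL REFERENCES dim_region(region_id),
            month_id  INTEGER NOT NULL REFERENCES dim_month(month_id),
            revenue   INTEGER NOT NULL
        );
        INSERT INTO dim_region VALUES (1, 'North'), (2, 'South');
        INSERT INTO dim_month  VALUES (1, '2024-01'), (2, '2024-02');
        INSERT INTO fact_orders VALUES (1, 1, 100), (1, 2, 150), (2, 1, 80), (1, 1, 20);

        -- The view exposes facts and dimension attributes as columns of one relation.
        CREATE VIEW orders_flat AS
        SELECT r.region, m.month, f.revenue
        FROM fact_orders AS f
        JOIN dim_region AS r ON r.region_id = f.region_id
        JOIN dim_month  AS m ON m.month_id  = f.month_id;
    """)
    return conn


def revenue_by_region_and_month(conn):
    """Measure (revenue) aggregated over two dimensions (region, month)."""
    return conn.execute("""
        SELECT region, month, SUM(revenue)
        FROM orders_flat
        GROUP BY region, month
        ORDER BY region, month
    """).fetchall()
```

Because the view hides the joins, the business question ("revenue per region per month") can be written against `orders_flat` without knowing how the underlying tables are linked.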
There are three types of commonly used schemas in Dimensional Data Modeling.
1. Star Schema
A star schema is a relational database schema with a single, central fact table surrounded by dimension tables.
2. Snowflake Schema
The snowflake schema comprises a single fact table linked to several dimension tables, each of which can be linked to another dimension table via a many-to-one connection.
3. Fact Constellation Schema
A fact constellation combines many fact tables that share dimension tables and can be viewed as a constellation of stars.
Star Schema (Star Join Schema)
The star schema is a multidimensional data representation format for relational databases. One or more fact tables link to any number of dimension tables in the star schema. The star schema is a subtype of the snowflake schema that is more efficient when dealing with simple queries.
The center of the star consists of a fact table, and the points of the star are the dimension tables. It is called a star schema because the diagram resembles a star, with points radiating from a center.
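The shape described above can be sketched with Python's `sqlite3` module: one central fact table with a foreign key to each of three dimension tables, and a query that joins each dimension directly to the fact table. All table names, column names, and rows here are hypothetical.

```python
import sqlite3

STAR_DDL = """
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
-- Central fact table: one foreign key per dimension plus the measure.
CREATE TABLE fact_sales (
    date_id    INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    units_sold INTEGER
);
"""


def build_star():
    """Create the star schema and load a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(STAR_DDL)
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, 2023, "Q4"), (2, 2024, "Q1")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Pen", "Stationery"), (2, "Notebook", "Stationery"),
                      (3, "Lamp", "Home")])
    conn.executemany("INSERT INTO dim_store VALUES (?, ?)",
                     [(1, "Delhi"), (2, "Pune")])
    # Columns: date_id, product_id, store_id, units_sold
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(2, 1, 1, 5), (2, 2, 1, 3), (2, 3, 2, 1), (1, 1, 2, 7)])
    return conn


def units_by_city_and_category(conn, year):
    """Star join: every dimension is exactly one join away from the fact table."""
    return conn.execute("""
        SELECT s.city, p.category, SUM(f.units_sold)
        FROM fact_sales AS f
        JOIN dim_date    AS d ON d.date_id    = f.date_id
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_store   AS s ON s.store_id   = f.store_id
        WHERE d.year = ?
        GROUP BY s.city, p.category
        ORDER BY s.city, p.category
    """, (year,)).fetchall()
```

Each dimension contributes exactly one join, which is why queries against a star schema tend to stay simple even as more dimensions are added.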
Advantages of star schema
It is the simplest schema to understand and design.
It optimizes the navigation through the database.
It is suitable for query processing on large data sets.
Snowflake Schema
The snowflake schema extends the star schema. Each point of the star branches into additional points: a dimension table linked to the fact table is itself linked to one or more further dimension tables. Star and snowflake schemas are most widespread in dimensional data warehouses and data marts, where retrieval speed is more important than data manipulation efficiency. As a result, the tables in these schemas are often not fully normalized and are usually designed at a normalization level lower than the third normal form, although snowflake dimension tables are more normalized than their star counterparts.
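A minimal sketch of this branching, again with Python's `sqlite3` module and hypothetical table names: the product dimension no longer stores the category name itself but points to a separate `dim_category` table, so reaching the category from the fact table takes two joins instead of one.

```python
import sqlite3


def build_snowflake():
    """Create a fact table whose product dimension is split into two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Outer dimension: each category name is stored exactly once.
        CREATE TABLE dim_category (
            category_id INTEGER PRIMARY KEY,
            category    TEXT NOT NULL
        );
        -- Inner dimension: many products point to one category (many-to-one).
        CREATE TABLE dim_product (
            product_id  INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            category_id INTEGER NOT NULL REFERENCES dim_category(category_id)
        );
        CREATE TABLE fact_sales (
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            units_sold INTEGER NOT NULL
        );
        INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Home');
        INSERT INTO dim_product  VALUES (1, 'Pen', 1), (2, 'Notebook', 1), (3, 'Lamp', 2);
        INSERT INTO fact_sales   VALUES (1, 5), (2, 3), (3, 1), (1, 2);
    """)
    return conn


def units_by_category(conn):
    """Two joins are needed: fact -> product -> category."""
    return conn.execute("""
        SELECT c.category, SUM(f.units_sold)
        FROM fact_sales AS f
        JOIN dim_product  AS p ON p.product_id  = f.product_id
        JOIN dim_category AS c ON c.category_id = p.category_id
        GROUP BY c.category
        ORDER BY c.category
    """).fetchall()
```

The extra join is the cost of storing each category name only once.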
Advantages of Snowflakes
Fewer redundancies due to the normalized dimension table.
Dimension table in the snowflake schema saves storage space and is easier to update and maintain.
Disadvantages of Snowflakes
It is a complex schema.
Fact Constellation Schema
It is also known as the Galaxy Schema. In this schema, multiple fact tables share the dimension tables.
It is a widely used schema that is more complex than the star and snowflake schema.
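As an illustrative sketch (hypothetical names, Python's `sqlite3` module), two fact tables, `fact_sales` and `fact_shipments`, share the single dimension `dim_product`. The query aggregates each fact table separately before combining the results through the shared dimension, because joining the two fact tables row by row would multiply rows and inflate the sums.

```python
import sqlite3


def build_constellation():
    """Create two fact tables that share one dimension table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Shared dimension used by both fact tables.
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE fact_sales (
            product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
            units_sold INTEGER NOT NULL
        );
        CREATE TABLE fact_shipments (
            product_id    INTEGER NOT NULL REFERENCES dim_product(product_id),
            units_shipped INTEGER NOT NULL
        );
        INSERT INTO dim_product    VALUES (1, 'Pen'), (2, 'Lamp');
        INSERT INTO fact_sales     VALUES (1, 5), (1, 2), (2, 1);
        INSERT INTO fact_shipments VALUES (1, 10), (2, 4), (2, 1);
    """)
    return conn


def sold_vs_shipped(conn):
    """Aggregate each fact table on its own, then join via the shared dimension."""
    return conn.execute("""
        WITH sold AS (
            SELECT product_id, SUM(units_sold) AS total_sold
            FROM fact_sales GROUP BY product_id
        ),
        shipped AS (
            SELECT product_id, SUM(units_shipped) AS total_shipped
            FROM fact_shipments GROUP BY product_id
        )
        SELECT p.name, sold.total_sold, shipped.total_shipped
        FROM dim_product AS p
        JOIN sold    ON sold.product_id    = p.product_id
        JOIN shipped ON shipped.product_id = p.product_id
        ORDER BY p.name
    """).fetchall()
```

The need to keep the two fact tables apart until after aggregation is one concrete reason this schema is harder to query than a single star.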
Advantages of Fact Constellation Schema
It's a user-friendly schema that encourages people to use it.
Tables are divided into fact tables and dimension tables, which makes the relationships between them easier to understand.
Disadvantages of Fact Constellation Schema
It is a complex schema due to multiple fact tables.
It isn't easy to manage.
Dimension tables are enormous.
Difference between Star and snowflake schema
| Star schema | Snowflake schema |
| --- | --- |
| Star schema is highly denormalized. | Dimension tables in snowflakes are highly normalized. |
| It has a category-wise single dimension table. | In the snowflake schema, dimension tables are further split into additional tables. |
| Star schema has more data dependency and redundancy. | Snowflake schema has less data dependency and redundancy. |
| There is no need for complicated joins in star schemas. | In the snowflakes schema, complicated joins are required. |
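The redundancy difference can be made concrete with a small plain-Python sketch. The helper `normalize_dimension` and the sample rows are hypothetical: it splits a denormalized (star-style) product dimension into a product table and a category table (snowflake-style), and then counts how many rows would have to change to rename one category in each design.

```python
def normalize_dimension(star_rows):
    """Split a denormalized product dimension into product and category tables."""
    category_ids = {}  # category name -> surrogate key
    categories = []    # snowflaked category table, one row per distinct category
    products = []      # product table that references categories by key
    for row in star_rows:
        name = row["category"]
        if name not in category_ids:
            category_ids[name] = len(category_ids) + 1
            categories.append({"category_id": category_ids[name], "category": name})
        products.append({
            "product_id": row["product_id"],
            "name": row["name"],
            "category_id": category_ids[name],
        })
    return products, categories


# Star-style dimension: the category name is repeated on every product row.
star_dim_product = [
    {"product_id": 1, "name": "Pen", "category": "Stationery"},
    {"product_id": 2, "name": "Notebook", "category": "Stationery"},
    {"product_id": 3, "name": "Eraser", "category": "Stationery"},
    {"product_id": 4, "name": "Lamp", "category": "Home"},
]

products, categories = normalize_dimension(star_dim_product)

# Rows that must change to rename the 'Stationery' category in each design.
star_updates = sum(1 for r in star_dim_product if r["category"] == "Stationery")
snowflake_updates = sum(1 for c in categories if c["category"] == "Stationery")
```

With this sample data, the star-style dimension needs one update per matching product row, while the snowflake-style design needs a single update in the category table. The trade-off is the additional join needed to read the category name back.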
Star Schema Vs. Snowflake Schema Vs. Fact constellation Schema
| Star schema | Snowflake Schema | Fact Constellation Schema |
| --- | --- | --- |
| It has a single large central fact table and one table for each dimension. | It has a single, large, central fact table and one or more tables for each dimension. | It has multiple fact tables that share dimension tables. |
| Every fact points to one tuple in each dimension and has additional attributes. | It is a variant of the star schema model. | It can be viewed as a constellation of stars, which is why it is also called a galaxy schema. |
| Star schema does not capture hierarchies directly. | Dimension tables are normalized. Data is split into different dimension tables. | It is more complex than the star and snowflake schemas. |
Frequently Asked Questions
Why do we need database schemas?
Database schemas are very important because they help us visualize how a database should be structured in the most optimized way.
What is a fact table?
A fact table contains the business process's measurements, metrics, or facts. It is located at the center of a star schema or a snowflake schema.
What is normalization?
Normalization is the process of organizing data in a database. This involves creating tables and linking them according to principles that attempt to protect the data while also making the database more flexible by reducing redundancy and conflicting dependencies.
What is dimensional data modeling?
Dimensional data modeling is a database design approach to support end-user queries in a data warehouse.
What is a dimension?
Dimensions are logically related attributes that function as an axis for data modeling. A dimension table is a table related to each dimension and helps further the description of that dimension.
Key Takeaways
In this blog, we learned about dimensional data modeling and the different kinds of schemas.
The Dimensional model is a database design technique to support end-user queries in a data warehouse. Then we learned about schemas. A database schema is like a skeleton structure that represents the logical view of the entire database.
There are three types of schemas: Star, snowflake, and fact constellation.