Welcome to Coding Ninjas; Today, we are going to discuss a clause in SQL called “PARTITION BY”. In this article, we will discuss what PARTITION BY SQL is, its syntax, analytical functions and windowing, and its use cases. We will also discuss its limitations and alternatives.

Before moving on to the PARTITION BY, I will recommend you understand a clause in this article on ‘SQL Clauses and Its Types’.

What is the PARTITION BY Clause in SQL?

PARTITION BY is a clause in SQL that is used to divide the result into partitions or subsets to perform the computations. The computation is done on the basis of number of columns. PARTITION BY is used in conjunction with analytical functions.

PARTITION BY is a clause that can be used in operations where the analytical functions like. SUM(), AVG(), MIN(), MAX() and RANK() are involved. It is useful to use PARTITION BY, where the partition on the basis of some kind of ID may need.

This can be used in various SQL database systems such as SQL Server, Oracle, PostgreSQL, and SQLite.

Here is a basic syntax for using PARTITION BY SQL:

SELECT col1, col2, col3, ..., analytic_function() OVER (PARTITION BY partition_col1, partition_col2, partition_col3, ...)
FROM table_name

In the above syntax query, SELECT is used to get the given columns of the table (col1, col2, and col3) on which the analytical functions will be applied. Here are multiple columns (partition_col1, partition_col2, partition_col3) which show the data is to be partitioned by PARTITION BY clause.

Get the tech career you deserve, faster!

Connect with our expert counsellors to understand how to hack your way to success

User rating 4.7/5

1:1 doubt support

95% placement record

Akash Pal

Senior Software Engineer

326% Hike After Job Bootcamp

Himanshu Gusain

Programmer Analyst

32 LPA After Job Bootcamp

After Job Bootcamp

Analytical Functions in SQL

There are several analytical functions in SQL that you can perform on a row that is returned by a query.

Here are some of the analytical functions that are commonly used

RANK(): It is an analytical function that is used to assign the rank to the rows in a specific column, and rank may or may not be unique means if two rows are the same, the same rank will be assigned to them.

ROW_NUMBER(): This function is used to assign a unique row number to each row in the table on the basis on which the rows are retrieved from the table.

SUM(): As the name suggests, SUM() is an aggregate function that can be used to calculate the total of a specific column passed to this function.

MIN() and MAX(): Both of these functions are used to find the minimum and maximum values from a specific column that is passed from the table, respectively.

FIRST_VALUE() and LAST_VALUE(): These functions are used to get the first and last values from the partition, respectively.

NTH_VALUE(): This function is used to get a specific value of a row in a partition of the table.

More Examples

Calculating Average

Suppose we have a table ‘products_table’ with five products in it, and every product has its product_id’, ‘product_name’, and ‘product_price’, as you can see below table:

product_id

product_name

product_price

1

A

10

1

B

15

2

C

20

3

D

25

3

E

30

Now, we want to find the average price of the product with the same unique ‘product_id’. For this, we will use the analytical function AVG, which will help us to find the average price, and ‘product_id’ can be used to partition the table using PARTITION BY.

Here is the SQL command to perform this operation:

SELECT product_id, product_name, product_price,
AVG(product_price) OVER (PARTITION BY product_id) AS avg_price_by_id
FROM products_table;

The result table will be printed on the SQL Client or Database Management Tool that you’re using; the result table will be similar to this:

product_id

product_name

product_price

avg_price_by_id

1

A

10

12.5

1

B

15

12.5

2

C

20

20

3

D

25

27.5

3

E

30

27.5

Calculating Sum

In this example, we will be calculating the sum of the partition using the analytical function ‘SUM()’. Here is the table on which we will be calculating the sum on the basis of product_id.

product_id

product_name

product_price

1

A

50

1

A

75

2

B

100

3

C

25

2

B

50

1

A

100

Here, we partition by product_id using PARTITION BY and calculate the sum using SUM() with the column product_price. Here is the query to perform this operation:

SELECT product_id, product_name, SUM(product_price) OVER (PARTITION BY product_id) as total_sum
FROM products;

Here is the output table after this query:

product_id

product_name

product_price

total_sum

1

A

50

225

1

A

75

225

2

B

100

150

3

C

25

25

2

B

50

150

1

A

100

225

Finding Maximum Price

In this example, we will be finding the maximum value of the partition using the analytical function ‘MAX()’. We will find the maximum price of the products in a partition, and the partition will be done by product_id. Here is the table for a better understanding:

product_id

product_name

product_price

1

A

10.99

1

B

5.99

2

C

1.99

2

D

3.49

3

E

15.99

3

F

7.99

Here, we partition by product_id using PARTITION BY and find the maximum using MAX() with the column product_price. Here is the query to perform this operation:

SELECT product_id, product_name, product_price, MAX(product_price) OVER (PARTITION BY product_id) AS max_price
FROM products.

Here is the output table after this query:

product_id

product_name

product_price

max_price

1

A

10

10

1

B

5

10

2

C

1

3

2

D

3

3

3

E

15

15

3

F

7

15

Ranking the Products

In this example, we will use an analytical function called ‘RANK()’ by which we can assign the rank to the rows in a specific column. Here is the table on which we are going to perform this operation:

product_id

product_name

product_price

1

A

10

1

B

20

2

C

30

2

D

40

3

E

50

3

F

60

3

G

70

Here we are performing partition by product_id using PARTITION BY and finding the rank of each row using their product_price for comparison. Here is the query to perform this operation:

SELECT product_id, product_name, product_price,
RANK() OVER (PARTITION BY product_id ORDER BY product_price DESC) AS rank
FROM products;

PARTITION BY is a clause in SQL that is used to divide the result into partitions or subsets to perform the computations. The computation is done on the basis of number of columns. PARTITION BY is used in conjunction with analytical functions.

What is the difference between partition by and group by?

In order to partition result sets and enable calculations within each partition, SQL window functions require the "PARTITION BY" parameter. By merging rows with identical values into summary rows, "GROUP BY" aggregates data. Typically, aggregate functions like COUNT or SUM are used.

How to create partition in SQL?

Use the "PARTITION BY" clause in a window function in SQL to create a partition. Give specifics on the partitioning criteria that will separate result sets into logical parts for computation or analysis, improving the capabilities for data manipulation and reporting.

How to count partition by in SQL?

Use the "COUNT" function with the "PARTITION BY" phrase in SQL to count within partitions. The result set is divided into groups according to the given criteria, and the occurrences within each group are then counted independently.

Conclusion

PARTITION BY SQL is used to divide a result into partitions for the computations. It has analytical functions such as RANK(), DENSE_RANK(), and ROW_RANK(). In this article, we discussed what this clause is, how it works, and its syntax and discussed this clause by taking different examples. We also discussed several analytical functions in SQL.

As we have discussed, PARTITION BY is one of the clauses; If you want to read about clauses, I will recommend you to read this article on ‘SQL Clauses and Its Types’.