When we are introduced to the world of programming, sorting problems are our frequent encounters. We learn many sorting approaches and algorithms for these types of problems. We will discuss an efficient sorting algorithm that you will use throughout your programming career.

This article will discuss Merge Sort in Python. Merge sort is a sorting algorithm that uses a divide-and-conquer strategy to sort an array or list. It works by dividing the array into two equal parts, sorting each part separately, and then merging the sorted parts back together. We will be discussing merge sort in Python from scratch.

What is Merge Sort in Python

Merge Sort is a sorting algorithm that divides the array into halves until it's comprised of single-element subarrays. It then merges these subarrays in a sorted manner, ultimately producing a fully sorted array. The process employs a divide-and-conquer strategy, recursively breaking down and merging arrays. Merge Sort's time complexity is O(n log n), making it efficient for large datasets. It's often preferred for its stability and predictable performance.

Divide and Conquer Approach

Divide and conquer is a strategy or technique used to solve problems by breaking them down into smaller, more manageable subproblems. The idea is to divide the problem into smaller parts that are easier to solve, then solve each part separately, and finally combine the solutions of the smaller parts to find the solution to the original problem. This strategy is used in many different areas, including algorithm design, computer science, mathematics, and more.

The divide and conquer approach is often used to solve problems with a recursive structure, which means that the problem can be broken down into smaller subproblems similar to the original problem. This approach can be applied to a wide range of problems, such as sorting algorithms (e.g. merge sort, quick sort), recursive algorithms (e.g. the Fibonacci sequence, the Tower of Hanoi), and more complex problems, such as the closest pair of points problem, the travelling salesman problem and many more.

Divide and conquer algorithms generally have a time complexity of O(n log n). It makes them quite efficient, but it also depends on the problem.

How does Divide and Conquer Work?

Consider an array of n unsorted integers. Our output should be a sorted array.

Divide: To divide the array, use the formula mid = len(array)//2 to get the array's midpoint

Conquer: In this step, we divide the array into sub-arrays using the computed midpoint. This sub-array division and midpoint computation is performed recursively. A single element in an array, as we all know, is always sorted. As a result, you must divide the sub-arrays indefinitely until all of the elements in the array are single array elements

After this stage, the array items are combined back together to produce a final sorted array.

MergeSort Algorithm

Let us consider the array {8,3,4,6,1,5,7,2}

Check if the left index is less than the right index.

Calculate the midpoint of the array if the low index is less than the high index.

Call the mergesort function on the left and right halves of the array.

Merge the two sorted halves using the merge function.

Merge function creates two different arrays and copies the left and right halves into these arrays.

Iteration and comparison of both arrays are done.

After merging, the arrays are sorted in ascending order.

Example of Merge Sort

The merge sort function first checks the base case where the array has a length of 1. If yes, then it returns the array. Otherwise, it divides the array into two parts and calls merge sort recursively on both parts. Finally, it merges both parts using the merge function, which takes two sorted arrays and returns a sorted array.

Pseudo code for Merge sort

procedure mergeSort(A: array of items, low, high)
if (low < high) then
mid = (low + high) / 2
mergeSort(A, low, mid)
mergeSort(A, mid + 1, high)
merge(A, low, mid, high)
procedure merge(A: array of items, low, mid, high)
create two arrays L[mid - low + 1] and R[high - mid]
for i = 0 to mid - low do
L[i] = A[low + i]
for j = 0 to high - mid - 1 do
R[j] = A[mid + j + 1]
i = 0, j = 0, k = low
while i < length(L) and j < length(R) do
if L[i] <= R[j] then
A[k] = L[i]
i = i + 1
else
A[k] = R[j]
j = j + 1
k = k + 1
while i < length(L) do
A[k] = L[i]
i = i + 1
k = k + 1
while j < length(R) do
A[k] = R[j]
j = j + 1
k = k + 1

Explanation

The mergeSort function takes array A of items and the indices ‘low’ and ‘high’ of the portion of the array that should be sorted.

The function first checks if ‘low’ is less than ‘high’. If it is, the function calculates the middle index ‘mid’ of the current portion of the array and then recursively calls itself with the subarrays ‘A[low...mid]’ and ‘A[mid+1...high]’. This process continues until the entire array is divided into single-item subarrays.

When the recursive calls begin to return, the merge function is called to merge the sorted subarrays back together. The merge function creates two arrays, L and R, which will temporarily store the subarrays. The function then compares the first element of L and R, places the smaller of the two at the first position of the A array, and moves the pointer of that array.

Then it continues with the next elements of L and R until one of the subarrays is finished. Then it copies the remaining elements from the other array to the A array. The merge function is called repeatedly until the entire array is sorted.

In the above merge sort in python program, the merge_sort function takes in an array and recursively splits it into smaller arrays. It will be done until each array has only one element. Then the merge function is used to merge the sorted arrays back together. The merge function takes in two arrays and compares the first element of each array, adding the smaller element to the result and moving on to the next element until one of the arrays is exhausted. Then it adds the remaining elements of the other array to the result.

Time Complexity

The time complexity of the above merge sort in python program is O(n log n). Where n is the number of elements in the list.

Space Complexity

The space complexity of the above merge sort in python program is O(n), where n is the number of elements in the list.

Additionally, it takes O(log n) recursive calls, which take up O(log n) stack space.

It requires additional memory to store the two sublists used in the merge operation. It takes up O(n) space.

However, O(n + log n) is equivalent to O(n) space complexity.

E-commerce: Merge sort is a popular algorithm used in e-commerce applications to organise products based on diverse parameters like average customer rating, price, relevance, popularity, discounts etc.

Google PageRank: Merge Sort program is used in Google PageRank results to sort programs in the order of relevance to the query. According to the relevance, it sorts programs in ascending or descending order.

Inversion Count Problem: Merge Sort program uses the divide-and-conquer approach to sort a list in increasing or decreasing order.

External Sorting: Merge Sort is used in External sorting by sorting large amounts of data into memory more effectively.

Advantages of Merge Sort:

Merge sort allows a large set of data to be sorted quickly and efficiently.

Merge sort can access data in sequence, which makes the need for random access low.

It is a stable sorting program.

It has the same time complexity regardless of other test case scenarios.

Disadvantages of Merge Sort:

Merge Sort requires an array of constant size, just like that of the original one.

While working on large sets of data, merge sort is efficient. But when it comes to smaller data sets, it is comparatively slower.

Even after the list is sorted, merge sort goes through the whole process.

It uses more memory space for storing the sub-elements.

Frequently Asked Questions

What is the algorithm of the sort() method in Python?

The sort() method in Python uses TimSort algo. It is a hybrid sorting algorithm that combines the best aspects of both merge sort and insertion sort. It was developed by Tim Peters and was first introduced in Python 2.3.

How do you sort code in Python?

In Python, there are several ways to sort a list of elements. The most common and easiest way to sort a list is by using the built-in sort() method. It sorts the elements in ascending order. The sort() method modifies the original list in place and returns None.

What is the difference between sort () and sorted () in Python?

The main difference between the two is that sort() modifies the original list and returns None. Whereas sorted() creates a new list and returns it. Also, the sort() method can sort the list in place and permanently alter the original order of the list. Whereas sorted() creates a new list, leaving the original list unchanged.

How do you write a merge sort code in Python?

To implement merge sort in Python, recursively divide the list in half until you have single-element lists, then merge them in sorted order. This efficient technique provides an accurate way for sorting huge datasets.

Conclusion

In this article, we have discussed merge sort in Python. Merge Sort stands out as an efficient and stable sorting algorithm for arrays or lists in Python. Its divide-and-conquer approach divides the problem into smaller subproblems, making it suitable for large datasets. If you like to learn more, you can check out our articles:

If you have just started your learning process and are looking for questions from tech giants like Amazon, Microsoft, Uber, etc. For placement preparations, you must look at the problems, interview experiences, and interview bundles.

Nevertheless, consider our paid courses to give your career an edge over others!

Happy Learning!

Live masterclass

Master PowerBI using Netflix Data

by Ashwin Goyal, Product @ HRS Groups, Ex - Udaan, OYO