Table of contents
1.
Introduction
2.
What is A* Search Algorithm?
3.
How A* Search Algorithm Works?
4.
Common heuristics include
5.
Step-by-Step Algorithm
6.
Implementation 
7.
Time & Space Complexity
7.1.
Time Complexity
7.2.
Space Complexity
8.
Example 
9.
Frequently Asked Questions
9.1.
What is the main advantage of using the A* algorithm over other pathfinding algorithms?
9.2.
Can A* guarantee finding the shortest path?
9.3.
How does the choice of a heuristic function affect the performance of A*?
10.
Conclusion
Last Updated: Aug 18, 2024
Medium

A* Search Algorithm Python

Author Rahul Singh
0 upvote
Career growth poll
Do you think IIT Guwahati certified course can help you in your career?

Introduction

A* search algorithm is a popular pathfinding method that is used in computer science & artificial intelligence. It helps find the shortest path between two points, considering both the actual distance traveled & the estimated distance to the goal. A* is widely used in games, robotics, & navigation systems to efficiently plan routes & solve mazes. 

astar(A*) Search Algorithm Python

In this article, we will learn how the A* algorithm works, its step-by-step implementation in Python, & analyze its time & space complexity. 

What is A* Search Algorithm?


A* is an informed search algorithm that finds the shortest path between a starting node & a target node in a graph. It uses a heuristic function to estimate the cost from the current node to the goal, combining it with the actual cost from the start node to the current node. This allows A* to make informed decisions about which paths to explore first, prioritizing those that seem more promising based on the heuristic.

A* maintains an open set of nodes to be explored & a closed set of already visited nodes. It starts with the initial node in the open set & repeats the following steps until the goal is reached or the open set is empty:

 

1. Select the node with the lowest total cost (actual cost + heuristic cost) from the open set.
 

2. If the selected node is the goal, the algorithm terminates & returns the path.
 

3. Otherwise, move the selected node to the closed set & explore its neighbors.
 

4. For each neighbor, calculate the actual cost to reach it & the estimated cost to the goal using the heuristic function.
 

5. If a neighbor is already in the closed set & the new path is not better, ignore it.
 

6. If a neighbor is already in the open set & the new path is better, update its costs & parent.
 

7. If a neighbor is not in the open or closed set, add it to the open set with its costs & parent.

 

The heuristic function is problem-specific & should provide an estimate of the remaining cost to reach the goal. A common heuristic for grid-based problems is the Manhattan distance, which calculates the sum of the absolute differences in the x & y coordinates between two points.

How A* Search Algorithm Works?

The algorithm maintains two lists: an open list & a closed list. The open list contains the nodes to be explored, while the closed list keeps track of the already visited nodes.
 

Let’s discuss the A* algorithm step by step:
 

1. Initialize the open list with the starting node. Set its cost to zero & calculate its heuristic cost to the goal.
 

2. While the open list is not empty, do the following:

   a. Select the node with the lowest total cost (actual cost + heuristic cost) from the open list. Let's call this node the current node.
 

   b. If the current node is the goal node, terminate the algorithm & return the path from the start to the goal.
 

   c. Move the current node from the open list to the closed list.
 

   d. For each neighbor of the current node:
 

      - Calculate the actual cost to reach the neighbor through the current node.
 

      - Calculate the estimated cost from the neighbor to the goal using the heuristic function.
 

      - If the neighbor is already in the closed list & the new path is not better, ignore it.
 

      - If the neighbor is already in the open list & the new path is better, update its costs & set its parent to the current node.
 

      - If the neighbor is not in the open or closed list, add it to the open list, set its costs, & set its parent to the current node.
 

3. If the open list becomes empty & the goal node is not reached, there is no valid path from the start to the goal.
 

The key aspect of A* is the use of the heuristic function to guide the search towards the most promising paths. The heuristic function should be admissible, meaning it never overestimates the actual cost to reach the goal. This ensures that A* finds the optimal path if one exists.

Common heuristics include

- Manhattan distance: The sum of the absolute differences in the x & y coordinates between two points.
 

- Euclidean distance: The straight-line distance between two points.
 

- Diagonal distance: A combination of Manhattan & Euclidean distances, considering diagonal movements.

Step-by-Step Algorithm

Now that we understand how A* works, let's break it down into a step-by-step algorithm:

1. Initialize

   - Create an open list & add the starting node to it.

   - Create an empty closed list.

   - Set the cost of the starting node to zero & calculate its heuristic cost to the goal.
 

2. While the open list is not empty, repeat steps 3-7.
 

3. Select the node with the lowest total cost (actual cost + heuristic cost) from the open list. Call this node the current node.
 

4. If the current node is the goal node:

   - Terminate the algorithm.

   - Return the path from the start to the goal by tracing the parent nodes from the goal to the start.
 

5. Move the current node from the open list to the closed list.
 

6. For each neighbor of the current node:

   - Calculate the actual cost to reach the neighbor through the current node.

   - Calculate the estimated cost from the neighbor to the goal using the heuristic function.

   - If the neighbor is already in the closed list & the new path is not better, ignore it.

   - If the neighbor is already in the open list:

     - If the new path is better, update the neighbor's costs & set its parent to the current node.

   - If the neighbor is not in the open or closed list:

     - Add the neighbor to the open list.

     - Set the neighbor's costs (actual cost & heuristic cost).

     - Set the neighbor's parent to the current node.
 

7. If the open list becomes empty & the goal node is not reached, there is no valid path from the start to the goal. Terminate the algorithm.

Implementation 

Now, let’s look at the implementation of the A* algorithm in Python


from collections import deque

class Node:
    def __init__(self, position, cost, parent):
        self.position = position
        self.cost = cost
        self.parent = parent


def heuristic(a, b):
    # Manhattan distance heuristic
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def astar(start, goal, grid):
    rows, cols = len(grid), len(grid[0])
    open_list = deque([Node(start, 0, None)])
    closed_list = set()


    while open_list:
        current = min(open_list, key=lambda node: node.cost + heuristic(node.position, goal))
        open_list.remove(current)


        if current.position == goal:
            path = []
            while current:
                path.append(current.position)
                current = current.parent
            return path[::-1]

        closed_list.add(current.position)


        for dx, dy in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
            x, y = current.position
            nx, ny = x + dx, y + dy
            if 0 <= nx < rows & 0 <= ny < cols & grid[nx][ny] == 0 & (nx, ny) not in closed_list:
                neighbor = Node((nx, ny), current.cost + 1, current)
                if neighbor not in open_list:
                    open_list.append(neighbor)
                else:
                    index = open_list.index(neighbor)
                    if open_list[index].cost > neighbor.cost:
                        open_list[index].cost = neighbor.cost
                       open_list[index].parent = current

    return None

Time & Space Complexity

Let's analyze the time & space complexity of the A* algorithm.

Time Complexity

The time complexity of A* depends on the heuristic function & the branching factor of the graph. In the worst case, A* may explore all the nodes in the graph before finding the goal, resulting in exponential time complexity. However, with a good heuristic function, A* can efficiently guide the search towards the goal, reducing the number of explored nodes.

 

The time complexity can be expressed as O(b^d), where:

- b is the branching factor (average number of neighbors per node)
 

- d is the depth of the optimal solution
 

In practice, the time complexity of A* is often better than the worst case, as the heuristic function helps prioritize promising paths & prune less optimal ones.

Space Complexity

The space complexity of A* is determined by the maximum number of nodes stored in the open & closed lists at any given time. In the worst case, A* may need to store all the nodes in the graph, resulting in a space complexity of O(n), where n is the total number of nodes.
 

However, the actual space complexity can be lower if the heuristic function is effective in guiding the search & pruning unnecessary nodes. The space complexity is bounded by the number of nodes expanded, which is usually a fraction of the total nodes in the graph.
 

It's important to note that the space complexity can be a limiting factor for large graphs or high-dimensional state spaces. In such cases, memory optimization techniques like iterative deepening A* (IDA*) or memory-bounded A* can be used to reduce the space requirements.

Example 

Now, to practice our learning of time and space complexity, let’s use the previous code example of implementation again : 

def astar(start, goal, grid):
    rows, cols = len(grid), len(grid[0])
    open_list = deque([Node(start, 0, None)])  # O(1) space
    closed_list = set()  # O(n) space in the worst case

    while open_list:  # O(b^d) time in the worst case
        current = min(open_list, key=lambda node: node.cost + heuristic(node.position, goal))  # O(n) time
        open_list.remove(current)  # O(n) time

        if current.position == goal:  # O(d) space for the path
            path = []
            while current:
                path.append(current.position)
                current = current.parent
            return path[::-1]

        closed_list.add(current.position)  # O(1) time

        for dx, dy in [(0, 1), (0, -1), (1, 0), (-1, 0)]:  # O(b) time
            x, y = current.position
            nx, ny = x + dx, y + dy
            if 0 <= nx < rows & 0 <= ny < cols & grid[nx][ny] == 0 & (nx, ny) not in closed_list:
                neighbor = Node((nx, ny), current.cost + 1, current)
                if neighbor not in open_list:  # O(n) time
                    open_list.append(neighbor)
                else:
                    index = open_list.index(neighbor)  # O(n) time
                    if open_list[index].cost > neighbor.cost:
                        open_list[index].cost = neighbor.cost
                        open_list[index].parent = current

    return None


In this code :

- The open list is initialized with the start node, taking O(1) space.
 

- The closed list can potentially store all the nodes in the worst case, taking O(n) space.
 

- The main loop runs in O(b^d) time in the worst case, where b is the branching factor & d is the depth of the optimal solution.
 

- Inside the loop, finding the minimum cost node takes O(n) time, & removing it from the open list takes O(n) time.
 

- Checking if a neighbor is in the open list or closed list takes O(n) time.
 

- The space complexity of the path is O(d) in the worst case, as it stores the nodes from the start to the goal.
 

Overall, the time complexity is O(b^d), & the space complexity is O(n) in the worst case, but it can be lower with an effective heuristic function.

Frequently Asked Questions

What is the main advantage of using the A* algorithm over other pathfinding algorithms?

The main advantage of A* is its ability to efficiently find the optimal path by using a heuristic function to guide the search towards the most promising paths, reducing the number of explored nodes compared to uninformed search algorithms like Breadth-First Search (BFS) or Depth-First Search (DFS).

Can A* guarantee finding the shortest path?

Yes, A* guarantees finding the shortest path if the heuristic function is admissible, meaning it never overestimates the actual cost to reach the goal. An admissible heuristic ensures that A* explores the paths in the optimal order & finds the shortest path.

How does the choice of a heuristic function affect the performance of A*?

The choice of heuristic function greatly impacts the performance of A*. A well-designed heuristic that accurately estimates the cost of the goal can significantly reduce the number of explored nodes & improve the algorithm's efficiency. However, if the heuristic overestimates the cost, A* may not find the optimal path, while an underestimating heuristic may lead to exploring unnecessary nodes.

Conclusion

In this article, we have talked about the A* search algorithm, a popular pathfinding technique used in computer science & artificial intelligence. We learned how A* works by maintaining an open list of nodes to explore & a closed list of visited nodes, with the help of a heuristic function to guide the search towards the most promising paths. We also discussed the step-by-step implementation of A* & analyzed its time & space complexity with the help of a code example for practice.


You can also check out our other blogs on Code360.

Live masterclass