How A* Search Algorithm Works?
The algorithm maintains two lists: an open list & a closed list. The open list contains the nodes to be explored, while the closed list keeps track of the already visited nodes.
Let’s discuss the A* algorithm step by step:
1. Initialize the open list with the starting node. Set its cost to zero & calculate its heuristic cost to the goal.
2. While the open list is not empty, do the following:
a. Select the node with the lowest total cost (actual cost + heuristic cost) from the open list. Let's call this node the current node.
b. If the current node is the goal node, terminate the algorithm & return the path from the start to the goal.
c. Move the current node from the open list to the closed list.
d. For each neighbor of the current node:
- Calculate the actual cost to reach the neighbor through the current node.
- Calculate the estimated cost from the neighbor to the goal using the heuristic function.
- If the neighbor is already in the closed list & the new path is not better, ignore it.
- If the neighbor is already in the open list & the new path is better, update its costs & set its parent to the current node.
- If the neighbor is not in the open or closed list, add it to the open list, set its costs, & set its parent to the current node.
3. If the open list becomes empty & the goal node is not reached, there is no valid path from the start to the goal.
The key aspect of A* is the use of the heuristic function to guide the search towards the most promising paths. The heuristic function should be admissible, meaning it never overestimates the actual cost to reach the goal. This ensures that A* finds the optimal path if one exists.
Common heuristics include
- Manhattan distance: The sum of the absolute differences in the x & y coordinates between two points.
- Euclidean distance: The straight-line distance between two points.
- Diagonal distance: A combination of Manhattan & Euclidean distances, considering diagonal movements.
Step-by-Step Algorithm
Now that we understand how A* works, let's break it down into a step-by-step algorithm:
1. Initialize
- Create an open list & add the starting node to it.
- Create an empty closed list.
- Set the cost of the starting node to zero & calculate its heuristic cost to the goal.
2. While the open list is not empty, repeat steps 3-7.
3. Select the node with the lowest total cost (actual cost + heuristic cost) from the open list. Call this node the current node.
4. If the current node is the goal node:
- Terminate the algorithm.
- Return the path from the start to the goal by tracing the parent nodes from the goal to the start.
5. Move the current node from the open list to the closed list.
6. For each neighbor of the current node:
- Calculate the actual cost to reach the neighbor through the current node.
- Calculate the estimated cost from the neighbor to the goal using the heuristic function.
- If the neighbor is already in the closed list & the new path is not better, ignore it.
- If the neighbor is already in the open list:
- If the new path is better, update the neighbor's costs & set its parent to the current node.
- If the neighbor is not in the open or closed list:
- Add the neighbor to the open list.
- Set the neighbor's costs (actual cost & heuristic cost).
- Set the neighbor's parent to the current node.
7. If the open list becomes empty & the goal node is not reached, there is no valid path from the start to the goal. Terminate the algorithm.
Implementation
Now, let’s look at the implementation of the A* algorithm in Python
from collections import deque
class Node:
def __init__(self, position, cost, parent):
self.position = position
self.cost = cost
self.parent = parent
def heuristic(a, b):
# Manhattan distance heuristic
return abs(a[0] - b[0]) + abs(a[1] - b[1])
def astar(start, goal, grid):
rows, cols = len(grid), len(grid[0])
open_list = deque([Node(start, 0, None)])
closed_list = set()
while open_list:
current = min(open_list, key=lambda node: node.cost + heuristic(node.position, goal))
open_list.remove(current)
if current.position == goal:
path = []
while current:
path.append(current.position)
current = current.parent
return path[::-1]
closed_list.add(current.position)
for dx, dy in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
x, y = current.position
nx, ny = x + dx, y + dy
if 0 <= nx < rows & 0 <= ny < cols & grid[nx][ny] == 0 & (nx, ny) not in closed_list:
neighbor = Node((nx, ny), current.cost + 1, current)
if neighbor not in open_list:
open_list.append(neighbor)
else:
index = open_list.index(neighbor)
if open_list[index].cost > neighbor.cost:
open_list[index].cost = neighbor.cost
open_list[index].parent = current
return None
Time & Space Complexity
Let's analyze the time & space complexity of the A* algorithm.
Time Complexity
The time complexity of A* depends on the heuristic function & the branching factor of the graph. In the worst case, A* may explore all the nodes in the graph before finding the goal, resulting in exponential time complexity. However, with a good heuristic function, A* can efficiently guide the search towards the goal, reducing the number of explored nodes.
The time complexity can be expressed as O(b^d), where:
- b is the branching factor (average number of neighbors per node)
- d is the depth of the optimal solution
In practice, the time complexity of A* is often better than the worst case, as the heuristic function helps prioritize promising paths & prune less optimal ones.
Space Complexity
The space complexity of A* is determined by the maximum number of nodes stored in the open & closed lists at any given time. In the worst case, A* may need to store all the nodes in the graph, resulting in a space complexity of O(n), where n is the total number of nodes.
However, the actual space complexity can be lower if the heuristic function is effective in guiding the search & pruning unnecessary nodes. The space complexity is bounded by the number of nodes expanded, which is usually a fraction of the total nodes in the graph.
It's important to note that the space complexity can be a limiting factor for large graphs or high-dimensional state spaces. In such cases, memory optimization techniques like iterative deepening A* (IDA*) or memory-bounded A* can be used to reduce the space requirements.
Example
Now, to practice our learning of time and space complexity, let’s use the previous code example of implementation again :
def astar(start, goal, grid):
rows, cols = len(grid), len(grid[0])
open_list = deque([Node(start, 0, None)]) # O(1) space
closed_list = set() # O(n) space in the worst case
while open_list: # O(b^d) time in the worst case
current = min(open_list, key=lambda node: node.cost + heuristic(node.position, goal)) # O(n) time
open_list.remove(current) # O(n) time
if current.position == goal: # O(d) space for the path
path = []
while current:
path.append(current.position)
current = current.parent
return path[::-1]
closed_list.add(current.position) # O(1) time
for dx, dy in [(0, 1), (0, -1), (1, 0), (-1, 0)]: # O(b) time
x, y = current.position
nx, ny = x + dx, y + dy
if 0 <= nx < rows & 0 <= ny < cols & grid[nx][ny] == 0 & (nx, ny) not in closed_list:
neighbor = Node((nx, ny), current.cost + 1, current)
if neighbor not in open_list: # O(n) time
open_list.append(neighbor)
else:
index = open_list.index(neighbor) # O(n) time
if open_list[index].cost > neighbor.cost:
open_list[index].cost = neighbor.cost
open_list[index].parent = current
return None
In this code :
- The open list is initialized with the start node, taking O(1) space.
- The closed list can potentially store all the nodes in the worst case, taking O(n) space.
- The main loop runs in O(b^d) time in the worst case, where b is the branching factor & d is the depth of the optimal solution.
- Inside the loop, finding the minimum cost node takes O(n) time, & removing it from the open list takes O(n) time.
- Checking if a neighbor is in the open list or closed list takes O(n) time.
- The space complexity of the path is O(d) in the worst case, as it stores the nodes from the start to the goal.
Overall, the time complexity is O(b^d), & the space complexity is O(n) in the worst case, but it can be lower with an effective heuristic function.
Frequently Asked Questions
What is the main advantage of using the A* algorithm over other pathfinding algorithms?
The main advantage of A* is its ability to efficiently find the optimal path by using a heuristic function to guide the search towards the most promising paths, reducing the number of explored nodes compared to uninformed search algorithms like Breadth-First Search (BFS) or Depth-First Search (DFS).
Can A* guarantee finding the shortest path?
Yes, A* guarantees finding the shortest path if the heuristic function is admissible, meaning it never overestimates the actual cost to reach the goal. An admissible heuristic ensures that A* explores the paths in the optimal order & finds the shortest path.
How does the choice of a heuristic function affect the performance of A*?
The choice of heuristic function greatly impacts the performance of A*. A well-designed heuristic that accurately estimates the cost of the goal can significantly reduce the number of explored nodes & improve the algorithm's efficiency. However, if the heuristic overestimates the cost, A* may not find the optimal path, while an underestimating heuristic may lead to exploring unnecessary nodes.
Conclusion
In this article, we have talked about the A* search algorithm, a popular pathfinding technique used in computer science & artificial intelligence. We learned how A* works by maintaining an open list of nodes to explore & a closed list of visited nodes, with the help of a heuristic function to guide the search towards the most promising paths. We also discussed the step-by-step implementation of A* & analyzed its time & space complexity with the help of a code example for practice.
You can also check out our other blogs on Code360.