Table of contents
1.
Introduction
2.
Best-first Search Algorithm (Greedy Search)
3.
Example
3.1.
Python
3.2.
Explanation of the Code
4.
Advantages of Best-first Search Algorithm
5.
Disadvantages of Best-first Search Algorithm
6.
Time and Space Complexity
6.1.
Time Complexity
6.1.1.
Best Case
6.1.2.
Worst Case
6.2.
Space Complexity
7.
A* Search Algorithm
8.
History of the A* Search Algorithm in Artificial Intelligence
9.
How the A* Search Algorithm Works
10.
Example 
10.1.
Python
10.2.
Explanation of the Code
11.
Applications of the A* Search Algorithm in Artificial Intelligence
12.
Advantages of A* Search Algorithm
13.
Disadvantages of A* Search Algorithm
14.
Time and Space Complexity
14.1.
Time Complexity
14.1.1.
Best Case: 
14.1.2.
Worst Case
14.2.
Space Complexity
15.
Considerations 
16.
Frequently Asked Questions
16.1.
What is the primary difference between Best-first Search and A Search?*
16.2.
Can A Search algorithm work with any type of heuristic?*
16.3.
Why might Best-first Search fail to find the shortest path?
17.
Conclusion
Last Updated: Oct 3, 2024
Easy

Informed Search Algorithms in Artificial Intelligence

Career growth poll
Do you think IIT Guwahati certified course can help you in your career?

Introduction

Informed search algorithms are a type of search algorithm used in artificial intelligence & computer science. They use extra information to guide the search process & find the best solution faster. These algorithms are smart because they know more than just the problem itself. 

Informed Search Algorithms in Artificial Intelligence

In this article, we'll learn about different types of informed search algorithms, how they work, their pros & cons, & see some examples.

Best-first Search Algorithm (Greedy Search)

The Best-first search algorithm, commonly known as Greedy Search, is a strategic approach used in computer science to navigate complex networks by prioritizing paths that seem most promising to lead directly to the goal. This method leverages a heuristic function—essentially a rule or a set of rules—to make an educated guess about the path’s potential to reach the destination efficiently. The heuristic helps the algorithm to evaluate each option based on the proximity to the goal, optimizing the search process by following the most favorable route first.

How the Best-first Search Algorithm Works

  1. Initialization: Begin at the starting node (or state). Mark it as the current node.
     
  2. Node Evaluation: Utilize a heuristic to evaluate the desirability of each neighboring node based on their estimated distance to the goal.
     
  3. Path Selection: From the current node, choose the neighboring node with the highest desirability as determined by the heuristic. This node becomes the new current node.
     
  4. Record Path: Add the selected node to the path list, and mark it as visited to prevent the algorithm from revisiting it.
     
  5. Goal Check: Check if the current node is the goal node. If it is, the algorithm terminates and returns the path taken as the solution.
     
  6. Loop or Terminate: If the goal is not reached and there are still nodes to be explored, repeat steps 2 to 5. If no nodes are left or if a stop condition is met (like exceeding a time limit or reaching a maximum number of iterations), the algorithm stops without reaching the goal.

Example

In this example, we'll use a priority queue to always select the path that appears to be closest to the goal according to the heuristic. We'll use a straightforward graph where nodes represent cities, and the heuristic is the straight-line distance to the goal.

  • Python

Python

import heapq

# This represents a simple graph using an adjacency list where each node points to its neighbors and the cost to reach them

graph = {

   'A': {'B': 2, 'C': 3},

   'B': {'D': 1, 'E': 4},

   'C': {'F': 5},

   'D': {'G': 2},

   'E': {'G': 1},

   'F': {},

   'G': {}

}

# Heuristics is the straight-line distance from each node to the goal ('G')

heuristics = {

   'A': 6,

   'B': 4,

   'C': 5,

   'D': 2,

   'E': 1,

   'F': 7,

   'G': 0

}

# Function to perform best-first search

def best_first_search(graph, start, goal):

   # Open list is a priority queue that will store all nodes to be explored

   open_list = []

   # Start by adding the start node with its heuristic as the priority

   heapq.heappush(open_list, (heuristics[start], start))

   visited = set()  # Set to keep track of visited nodes



   while open_list:

       current_node = heapq.heappop(open_list)[1]  # Get the node with the lowest heuristic value


       # If the goal is reached, return success

       if current_node == goal:

           print(f"Goal reached: {current_node}")

           return


       # If not visited, consider its neighbors

       if current_node not in visited:

           visited.add(current_node)

           print(f"Visiting: {current_node}")

           for neighbor, cost in graph[current_node].items():

               if neighbor not in visited:

                   # Push the neighbor into the priority queue with updated cost based on heuristic

                   heapq.heappush(open_list, (heuristics[neighbor], neighbor))



   print("Goal not reachable")

   return


# Start the search from 'A' to 'G'

best_first_search(graph, 'A', 'G')
You can also try this code with Online Python Compiler
Run Code

Output

Visiting: A
Visiting: B
Visiting: E
Goal reached: G

Explanation of the Code

  1. Graph Representation: We define a graph using an adjacency list where each node has neighbors linked by a cost to reach them.
     
  2. Heuristic Function: Heuristics provide estimated distances to the goal. These are pre-calculated and used to prioritize nodes in the priority queue.
     
  3. Search Mechanism: The search uses a priority queue to always explore the node that seems closest to the goal next, based on the heuristic value.
     
  4. Termination: The search continues until the goal is reached or no nodes are left to explore. If the goal is reached, it prints the goal node; otherwise, it indicates that the goal is not reachable.

Advantages of Best-first Search Algorithm

  1. Focused Search: By always opting for the most promising path, the algorithm efficiently narrows down the search space, reducing the time spent exploring less likely paths.
     
  2. Quick Convergence: In scenarios where the heuristic is well-aligned with the actual distances or costs, Best-first search can converge to a solution much faster than uninformed search strategies.
     
  3. Flexibility with Heuristics: The algorithm can be adapted to different problems by changing the heuristic function, making it versatile across various applications.
     
  4. Reduced Memory Load: Although it can be memory intensive, compared to exhaustive searches, it generally requires less memory as it does not need to store all paths but only the promising ones.
     
  5. Easy to Implement: With the primary requirement being a well-defined heuristic, the Best-first search algorithm is straightforward to program, making it accessible for many developers.

Disadvantages of Best-first Search Algorithm

  1. Heuristic Dependence: The effectiveness of the algorithm heavily depends on the accuracy of the heuristic. If the heuristic is poor, the algorithm’s performance drastically declines.
     
  2. Non-Optimal Solutions: There is no guarantee that the solution found is the best possible solution; the algorithm might settle for a suboptimal path if it seems locally optimal.
     
  3. Potential for Missing Solutions: In densely connected search spaces, the algorithm might overlook better paths because it gets 'greedy' for the immediate benefits.
     
  4. Risk of Loops: Without proper checks, the algorithm can end up in infinite loops if it revisits already explored nodes.
     
  5. Space Complexity in Large Networks: In very large or complex networks, maintaining a record of viable paths can consume substantial memory, leading to high space complexity.

Time and Space Complexity

Time Complexity

Best Case

O(bm) where b is the branching factor of the tree (the average number of successors per state) and m is the maximum depth of the search tree. This is the best case, typically occurring when the path to the goal is direct and the heuristic is very informative.

Worst Case

Could escalate to O(bm) in scenarios where the goal is at the maximum depth or not reachable, requiring a full exploration of the search space.

Space Complexity

Generally, Best-first search is O(bm) as it may need to store all nodes in memory to manage and prioritize the frontier using a priority queue. This becomes significant in dense graphs with high branching factors.

A* Search Algorithm

The A* Search Algorithm is a powerful and widely-used approach in computer science for finding the shortest path from a start node to a goal node. It improves on the Best-first search by not only considering the cost from the start node to the current node but also including an estimate of the cost from the current node to the goal. This makes A* Search more effective at finding the most efficient path.

History of the A* Search Algorithm in Artificial Intelligence

The A* (pronounced "A-star") search algorithm is a widely used pathfinding algorithm in artificial intelligence and computer science. It was developed in 1968 by Peter Hart, Nils Nilsson, and Bertram Raphael, who were researchers at the Stanford Research Institute (now known as SRI International).

The A* algorithm is an extension of Edsger Dijkstra's shortest path algorithm, which was developed in 1959. Dijkstra's algorithm finds the shortest path between two nodes in a graph, but it doesn't take into account any additional information about the problem domain. The A* algorithm improves upon Dijkstra's algorithm by incorporating heuristics to guide the search towards the goal more efficiently.

The key idea behind the A* algorithm is to combine the actual cost of reaching a node (g-cost) with an estimated cost of reaching the goal from that node (h-cost). The sum of these two costs (f-cost) is used to determine the order in which nodes are explored. The algorithm maintains an open set of nodes to be evaluated and a closed set of nodes that have already been evaluated. At each step, the node with the lowest f-cost is selected from the open set, and its neighbors are added to the open set. This process continues until the goal node is reached or the open set is empty.

Since its introduction, the A* algorithm has been widely applied in various domains, like:

1. Pathfinding in video games and robotics
2. Route planning in navigation systems
3. Solving puzzles and games (e.g., Rubik's Cube, 8-puzzle)
4. Resource allocation and scheduling problems

How the A* Search Algorithm Works

The A* algorithm combines elements of uniform-cost search and pure heuristic search to efficiently compute paths. Here's a step-by-step breakdown of its operation:

  1. Initialization: Start with the initial node and calculate its total estimated cost, which is the sum of the actual cost from the start node to this node (known as 'g') and the estimated cost from this node to the goal (known as 'h', the heuristic).
     
  2. Node Selection: Place the initial node on an open list of nodes to be explored. This list is prioritized by the node's total estimated cost, the lower, the better.
     
  3. Exploration: Remove the node with the lowest cost from the open list and examine all its adjacent nodes (neighbors). For each neighbor, calculate its total cost ('f') using the formula: f=g+h.
     
  4. Cost Comparison: If a node with the same position as a neighbor is already in the open list with a lower 'f', skip updating that neighbor. If not, add (or update) the neighbor in the open list.
     
  5. Goal Check: Repeat the process until the goal node is removed from the open list, which means a path has been found, or the open list is empty, which means there is no path.
     
  6. Path Construction: Once the goal is reached, backtrack from the goal node to the start node to reconstruct the path using the nodes' parent pointers.

Example 

In this example, the algorithm will navigate a simple grid-like map to find the shortest path from a start point to a goal, using the Manhattan distance as the heuristic (which is suitable for grid-based maps).

  • Python

Python

import heapq

class Node:
"""A node class for A* Pathfinding"""

def __init__(self, parent=None, position=None):
self.parent = parent
self.position = position
self.g = 0 # Cost from start to node
self.h = 0 # Heuristic from node to goal
self.f = 0 # Total cost

def __eq__(self, other):
return self.position == other.position

def __lt__(self, other):
return self.f < other.f

def astar(maze, start, end):
"""Returns a list of tuples as a path from the given start to the given end in the maze"""

# Create start and end node
start_node = Node(None, start)
start_node.g = start_node.h = start_node.f = 0
end_node = Node(None, end)
end_node.g = end_node.h = end_node.f = 0

# Initialize both open and closed list
open_heap = []
closed_list = []

# Add the start node
heapq.heappush(open_heap, start_node)

# Loop until you find the end
while open_heap:
# Get the current node
current_node = heapq.heappop(open_heap)
closed_list.append(current_node)

# Found the goal
if current_node == end_node:
path = []
current = current_node
while current is not None:
path.append(current.position)
current = current.parent
return path[::-1] # Return reversed path

# Generate children
(x, y) = current_node.position
neighbors = [(x-1, y), (x+1, y), (x, y-1), (x, y+1)] # Adjacent squares

for next in neighbors:
# Check if the next position is within the bounds of the maze
if not (0 <= next[0] < len(maze) and 0 <= next[1] < len(maze[0])):
continue

# Get node position
map_value = maze[next[0]][next[1]]
if map_value == 0:
continue

# Create new node
neighbor = Node(current_node, next)

if neighbor in closed_list:
continue

# Create the f, g, and h values
neighbor.g = current_node.g + 1
neighbor.h = abs(neighbor.position[0] - end_node.position[0]) + abs(neighbor.position[1] - end_node.position[1])
neighbor.f = neighbor.g + neighbor.h

# Child is already in the open list
if add_to_open(open_heap, neighbor):
heapq.heappush(open_heap, neighbor)

return None

def add_to_open(open_list, neighbor):
for node in open_list:
if neighbor == node and neighbor.g >= node.g:
return False
return True

# A simple maze definition where 1 is passable, and 0 is an obstacle
maze = [
[1, 1, 1, 1, 0],
[0, 0, 1, 1, 1],
[1, 1, 1, 0, 1],
[1, 0, 1, 1, 1],
[1, 1, 1, 0, 1]
]

start = (0, 0)
end = (4, 4)

path = astar(maze, start, end)
print("Path:", path)
You can also try this code with Online Python Compiler
Run Code

Output

Path: [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (3, 2), (3, 3), (3, 4), (4, 4)]

Explanation of the Code

  1. Node Class: Represents a node in the graph. Each node keeps track of its parent (for path reconstruction), its position, and the costs associated (g, h, and f).
     
  2. A Algorithm Function (astar)**: Handles the main logic of A. It uses a priority queue (open_heap) to manage the nodes being explored and a list (closed_list) to track nodes that have been fully processed.
     
  3. Heuristic Calculation: Uses Manhattan distance, suitable for grid navigation, to estimate the cost from any node to the goal.
     
  4. Path Reconstruction: Once the goal is reached, the path is reconstructed from the end node to the start node by following parent links.

Applications of the A* Search Algorithm in Artificial Intelligence

 

1. Pathfinding in Video Games and Robotics: A* is extensively used in video games and robotics for pathfinding. In video games, it help characters navigate through complex environments, avoiding obstacles and finding the shortest path to their destination. In robotics, A* is used for robot motion planning, enabling robots to find optimal paths in real-world environments while considering factors such as obstacle avoidance and terrain traversability.

2. Route Planning in Navigation Systems: A* is employed in navigation systems to find the optimal route between two locations. It takes into account factors such as distance, traffic congestion, road conditions, and user preferences to determine the best path. Navigation systems using A* can provide users with efficient and personalized route recommendations, considering real-time updates and historical data.

3. Solving Puzzles and Games: A* is applied to solve various puzzles and games, such as the Rubik's Cube, 8-puzzle, and sliding block puzzles. In these scenarios, the algorithm searches for the optimal sequence of moves to reach the desired goal state. A* can find solutions efficiently by using heuristics that estimate the distance to the goal state, pruning unnecessary branches of the search tree.

4. Resource Allocation and Scheduling: A* is used in resource allocation and scheduling problems to find optimal solutions. For example, in job shop scheduling, A* can be employed to find the most efficient order of tasks to minimize total completion time or maximize resource utilization. It can also be applied to vehicle routing problems, where the goal is to find the optimal route for a fleet of vehicles to serve a set of customers while minimizing costs.

5. Natural Language Processing: A* has applications in natural language processing tasks, such as text summarization and machine translation. In text summarization, A* can be used to find the most informative sentences or phrases that best represent the main content of a document. In machine translation, A* can help find the most likely sequence of words in the target language that corresponds to the source language input.

6. Protein Structure Prediction: In bioinformatics, A* is applied to predict the three-dimensional structure of proteins. Protein structure prediction involves finding the lowest energy conformation of a protein molecule. A* can be used to search through the vast conformational space efficiently, using heuristics based on the physicochemical properties of amino acids and the overall energy of the protein structure.

7. Autonomous Vehicle Navigation: A* is utilized in autonomous vehicle navigation to plan optimal paths and make real-time decisions. Autonomous vehicles need to navigate through complex environments, considering factors such as traffic rules, obstacle avoidance, and safety constraints. A* can be used to find the best route for the vehicle, taking into account sensor data, map information, and dynamic obstacles in the environment.

Advantages of A* Search Algorithm

  1. Optimal and Complete: If the heuristic is admissible (never overestimates the true cost), A* is guaranteed to find the shortest path.
     
  2. Efficient: A* is more efficient than many other search algorithms because it directs its search towards the goal right from the start.
     
  3. Adaptable: The performance and behavior of A* can be adjusted by changing the heuristic function.
     
  4. Widely Used: A* is popular in many fields, including AI for games and robotics, due to its effectiveness and reliability.
     
  5. Informative: Provides detailed information about the pathfinding process, which can be useful for debugging and enhancing the understanding of the search.

Disadvantages of A* Search Algorithm

  1. Memory Usage: A* can consume a lot of memory because it must keep track of all generated nodes until the solution is found.
     
  2. Dependent on Heuristic: The choice of heuristic greatly affects the algorithm's performance. A poor heuristic can lead to slow performance.
     
  3. Overhead: Calculating the exact cost and maintaining a priority queue can add computational overhead.
     
  4. Complexity: Implementing A* can be complex, especially when determining the most effective heuristic for a particular problem.
     
  5. Sensitivity to Heuristic: A* is very sensitive to the heuristic used; a bad heuristic might cause it to behave like a less efficient algorithm, such as Dijkstra's algorithm.

Time and Space Complexity

Time Complexity

Best Case: 

Theoretically, the best case is O(d), where d is the depth of the shallowest goal, assuming the heuristic h(n) guides the algorithm linearly to the goal. However, this rarely happens unless the heuristic perfectly estimates the true cost.

Worst Case

O(bd) where d is the depth of the shallowest solution, and b is the branching factor. The worst case occurs when the heuristich(n) provides no useful information (e.g., h(n)=0 everywhere except at the goal), reducing A* to uniform-cost search.

Space Complexity

A* also has a space complexity of O(bd) because it keeps all generated nodes in memory. The space requirement is substantial as it maintains two lists: the open list and the closed list, tracking all the nodes that are yet to be explored as well as all the nodes that have been expanded.

Considerations 

  1. The heuristic used significantly affects both the time and space complexities of these algorithms. An ideal heuristic, which perfectly estimates the cost to reach the goal from any node, can drastically reduce the time and space complexities by effectively guiding the search toward the goal and reducing the need to explore irrelevant paths.
     
  2. In practical applications, A* tends to be more efficient than Best-first search due to its use of both g(n) (the cost from the start node to n) and h(n) (the estimated cost from n to the goal). This combination helps A* in avoiding exploration of paths that are cheaper initially but more expensive in the long run.

Frequently Asked Questions

What is the primary difference between Best-first Search and A Search?*

Best-first Search prioritizes nodes based solely on a heuristic that estimates the cost from the node to the goal. In contrast, A* Search combines this heuristic with the known cost from the start node to the current node, providing a more balanced evaluation of paths.

Can A Search algorithm work with any type of heuristic?*

Yes, A* can work with any heuristic, but for it to be optimal and efficient, the heuristic must be admissible, meaning it should never overestimate the true cost to reach the goal. Common examples include the Manhattan distance for grid maps and the Euclidean distance for direct paths.

Why might Best-first Search fail to find the shortest path?

Best-first Search might not find the shortest path because it focuses on what seems to be the best immediate option without considering the overall path cost. This greedy approach can lead to suboptimal paths if the initial choices don't lead to the best final outcomes.

Conclusion

In this article, we have learned the informed search algorithms, focusing on Best-first Search and A* Search. These algorithms are very important to solve complex search problems in various applications, from game development to route navigation. While Best-first Search is simpler and can be faster in certain scenarios, it does not guarantee an optimal path and is prone to inefficiencies. On the other hand, A* Search, though more complex, typically offers a more reliable and efficient solution by considering both past path costs and future estimates, making it a preferred choice in many practical applications. 

You can refer to our guided paths on the Coding Ninjas. You can check our course to learn more about DSADBMSCompetitive ProgrammingPythonJavaJavaScript, etc. Also, check out some of the Guided Paths on topics such as Data Structure andAlgorithmsCompetitive ProgrammingOperating SystemsComputer Networks, DBMSSystem Design, etc., as well as some Contests, Test Series, and Interview Experiences curated by top Industry Experts.

Live masterclass