1. Introduction

In this tutorial, we’ll present and compare two search algorithms. Those are Uniform-Cost Search (UCS) and Best-First Search.

In a search problem, we want to find the shortest sequence of actions that transform the start state to a goal state. A state can be anything. For example, a point on a map or an arrangement of the pieces on a chessboard.

We determine the sequence by finding the shortest path between the start and the goal in an abstract graph whose nodes represent the states, and the directed edges are the actions leading from one state to another.  Searching through the graph, we create a search tree. Its nodes represent the paths to various states. The distinction between states and nodes in the tree is vital since multiple nodes may refer to the same state:

nodes vs states 2

We usually denote a node by the end state of the path it represents.

The edges connecting the nodes in the search graph may have weights that we also call costs. In those cases, we aren’t interested in the path containing the fewest edges between the start and the goal but in the one with the lowest cost.

If there are multiple-goal states, we want the best of all the paths that lead to any goal.

We use a Uniform-Cost Search (UCS) to find the lowest-cost path between the nodes representing the start and the goal states.

UCS is very similar to Breadth-First Search. When all the edges have equal costs, Breadth-First Search finds the optimal solution. However, if the costs differ, it may return a sub-optimal path:

bad bfs 3

UCS improves on Breadth-First Search by introducing three modifications.

First, it uses a priority queue as its frontier. It sorts the frontier nodes in the order of g, the cost of the path from the start node to the frontier nodes. When choosing a node for expansion, UCS selects from the frontier the one with the minimal value of \boldymbol{g}, i.e., the one with the lowest cost.

However, once we add a node to the frontier, it doesn’t mean that we know the cost of the optimal path to the node’s state. If we expand a node v and find that the path through v to its neighbor u costs less than g(w), where w is a frontier node that represents the same state as u, then we should remove w from the frontier and replace it with u. This queue update step is the second modification.

The third is applying the goal test after expanding the node, not when we put it in the frontier. If we test the nodes before storing them in the queue, we could return a sub-optimal path to the goal. Why? Because UCS can find a better path later in its execution.

Let’s take a look at the example of a sub-optimal path returned by Breadth-First Search. If we stop the search immediately after expanding Start and realizing that the goal node Goal is its neighbor, we’ll miss the optimal path which passes through A.

3.2. Pseudocode

So, here’s the pseudocode of Uniform-Cost Search:

Rendered by QuickLaTeX.com

The version presented here is the graph-search version of the algorithm because it checks if a node’s state is already in the frontier before adding it. A tree-like search doesn’t test for repeated states and is best suitable for the trees. However, we can still use it on graphs in general if we’re ok with the risk of longer execution or running infinitely due to cycles.

We assume that each node that the call to successors(v) produces is a structure containing a pointer to the v as its parent. So, once UCS returns a node, we can trace the path of actions easily.

3.3. Example

Let’s now show an example of UCS in action. We’ll use the same graph as before. In the beginning, only the start node is in the frontier, and its g-value is 0:

ucs 1-1

After expanding the start node, we get two nodes in the frontier, A and Goal, with the associated g-values of 1 and 11, respectively:

ucs 2-1

Since A has a lower g-value, we expand it and see that its successor Goal has a lower cost than the node, also representing Goal in the frontier. So, we replace the frontier Goal with the A‘s successor:

ucs 3-1

After expanding Goal, we see that it passes the goal test, so we conclude that the optimal path is Start{-}A{-}Goal.

4.  Analysis of UCS

We usually ask four questions about a search algorithm:

  1. Is it complete? An algorithm is complete if it terminates for any input.
  2. Is it optimal? We say that an algorithm is optimal if it returns the optimal path to the goal, provided that at least one goal state is reachable from the start state. If more than one state passes the goal test, we want the lowest-cost path among all the paths leading to any goal.
  3. What’s the algorithm’s time complexity?
  4. What’s its space complexity?

We answer those questions about UCS in this section.

4.1. Completeness of UCS

If the search graph is finite, then the graph-search version of UCS is complete. If the graph is infinite, but no node has an infinite number of neighbors, and all the edges have a positive cost, then the graph-search UCS will also be complete. The condition that edge costs be strictly positive guarantees that there won’t be an infinite path with zero-cost edges whose nodes UCS would keep expanding forever.

However, the tree-like UCS may get stuck in a loop even if the graph is finite. For example, that can happen if there are cycles.

4.2. Optimality of UCS

To prove the optimality of UCS, let’s first prove the following property:

Once UCS expands a node \boldsymbol{v}, it has found the optimal path from the start node to \boldsymbol{v.state}:

In other words, g(v)=c^*(v.state) at the point of expanding v, where c^*(v.state) is the cost of the optimal path to the state represented by v. (Technically, the nodes in the search tree represent the full paths, but it’s common to speak of the nodes as if they represent those paths’ end nodes.)

We’ll prove this property by induction on the cardinality of the set of expanded nodes, \boldsymbol{R}. We’ll make use of the fact that g(u) \geq c^*(u.state) for any node in the frontier. To see why it holds, let’s remember that we update the priority of any frontier state if we find a better path to it. So, g(u) is the upper bound of c^*(u.state). In the induction, we’ll prove that equality holds at the point when UCS pops the node from the frontier to expand it.

4.3. The Base Step and the Inductive Hypothesis

In the base step, we have |R|=1. It’s the case when UCS pops the start node s from the queue. By definition, g(s)=0, and that’s the cost of the optimal path from the start state to itself since we need not traverse any edge.

Now, we formulate the inductive hypothesis that this property holds for any \boldsymbol{R} such that \boldsymbol{|R| \leq n}. In the inductive step, we want to prove that it holds for any \boldsymbol{R \cup \{v\}}

By the inductive hypothesis, the property holds for R, so we only need to prove that it’s true for v as well. We’ll do it by contradiction.

4.4. The Inductive Step

So, this is the state of UCS at this point. We chose v to pop it from the frontier and expand it, and we’ve got a path to v.state whose cost is g(v):

ucs path 1

Now, the frontier separates the expanded nodes from the unexpanded ones. So, any path from the start node to an unexpanded node passes through a vertex in the frontier. With that in mind, let’s imagine an alternative path to v:

ucs path 2

and suppose that it’s the optimal path to v.state, so the blue path isn’t. Then, it holds:

(1)   \begin{equation*} g(v) > c^*(v.state) \end{equation*}

The last expanded node on the alternative path is u', and the first frontier node it passes through is u''. The part of the path from u'' to v may contain any number of edges. So, provided that the edges have non-negative costs, it must hold that the alternative path’s cost to v.state is at least c^*(u'.state)+c(u', u''):

(2)   \begin{equation*} c^*(v.state) \geq c^*(u'.state) + c(u', u'') \end{equation*}

4.5. Contradiction

Since the edge u'-u'' is on an optimal path, u' is expanded, and u'' is its neighbor, the g-value of u'' when we pop it will be equal to c^*(u'.state) + c(u', u''):

(3)   \begin{equation*} c^*(u'.state) + c(u', u'') = g(u'') \end{equation*}

Also, since UCS chose v for expansion, that means that v has the minimum g-value among all the nodes in the frontier, including u''. So:

(4)   \begin{equation*}  g(u'') \geq g(v) \end{equation*}

Chaining the formulae, we get:

(5)   \begin{equation*} \boldsymbol{g(v) >} c^*(v.state) \geq c^*(u'.state) + c(u', u'') = g(u'')  \geq \boldsymbol{g(v)} \end{equation*}

which is a contradiction. So, once we expand v, no other path to its state can have a cost lower than g(v). Therefore, g(v) = c^*(v.state). But, the condition is that all the edges have non-negative costs.

4.6. The Case with Multiple Goal Nodes

This means that once UCS expands a goal node, it’s found the optimal path to its state. However, there may be more than one goal state. Because the edge costs are non-negative, and UCS expands the frontier nodes in the order of their g-value, then any goal node expanded after the first goal node will be on the path that’s at least as costly as that of the first goal.

So, indeed, UCS is optimal and expands nodes in order of their states’ optimal path cost.

4.7. Time and Space Complexities of UCS

Let’s denote the minimal cost of an edge in the search graph as \varepsilon. Also, let C^* be the cost of the optimal path to any goal node. For UCS to be complete, we assume that \varepsilon >0.

Then, the depth in the search tree at which UCS finds the closest goal node is at most 1 + \left\lfloor \frac{C^*}{\varepsilon} \right\rfloor. If b is the upper bound of the branching factor, then the time complexity of the tree-like UCS is O\left(b^{1+\left\lfloor \frac{C^*}{\varepsilon} \right\rfloor}\right).

The same goes for the space complexity. Since the frontier may hold all the nodes at the depth of the closest goal, the space complexity is also O\left(b^{1+\left\lfloor \frac{C^*}{\varepsilon} \right\rfloor}\right).

When it comes to the graph-search UCS, its time complexity is bounded by the size of the graph. If V denotes the nodes, and E is the set of edges, then the worst-case time complexity of graph-search UCS is O(|V| + |E|). The space complexity is also O(|V| + |E|).

Best-First Search (BeFS) is a generic search algorithm. It has all the input arguments of UCS and one additional. The extra argument is an evaluation function f. BeFS uses it to order its frontier that’s implemented as a priority queue:

Rendered by QuickLaTeX.com

BeFS is a generic algorithm because we can change its search strategy by changing the evaluation function.

5.1. BeFS and Redundant Paths

The formulation presented in Algorithm 2 is the graph-search version of BeFS because it checks for repeated states. That way, BeFS avoids exploring redundant paths. For example, a path to node v is redundant if node u denotes the same state as v, but the path to u is shorter. However, to avoid redundant paths, graph-search BeFS requires a lot of memory.

If we can’t store all the nodes in the memory, we can use the tree-like BeFS. We obtain it by removing all the references to reached from Algorithm 2. However, we risk getting stuck in infinite cycles, which are special cases of redundant paths.

A compromise could be to check only for cycles. We can do that when we expand v in Algorithm 2. For each child w, we should check if w.state appears on the path to v. That can be done by following the pointers until we reach the start node. So, we won’t get stuck in a loop, but we may explore more paths than necessary.

5.2. Deriving Other Algorithms from BeFS

If we define f(u) as the cost of the path from the start node to u, then we get UCS. If we set f(u) to be the depth of u in the search tree, BeFS becomes Breadth-First Search. Could we get Depth-First Search (DFS)? Yes, we can, if we set f(u) to be the negative depth of u. However, DFS is usually implemented as a recursive algorithm.

We can get BeFS to simulated informed search strategies as well. For instance, if we use as f the heuristic estimate h of the distance to the goal, we get the Greedy Best-First Search (GBFS), which isn’t an optimal algorithm. But, if we combine the g of UCS with h of GBFS, we get the famous A* algorithm:

    \[f(u) = \underbrace{\text{the cost of the path from the start node to }u}_{g(u)} + \underbrace{\text{the estimate of the cost of the path from }u \text{ to a goal state}}_{h(u)}\]

5.3. Analysis of BeFS

The completeness, optimality, and complexity of BeFS depend on the choice of the evaluation function.

For example, we won’t get an optimal algorithm if we use only a heuristic to order the frontier. If we use the negative depth, we get DFS, which isn’t a complete algorithm. However, if we use the depth of a node as the evaluation function, we get Breadth-First Search, which is completely under certain conditions.

6. Discussion

Unlike UCS that is a fully specified uninformed algorithm, BeFS is a generic form of search method. Using different evaluation functions, BeFS can simulate numerous search techniques, both informed and informed. Without fully specifying the ordering strategy, we can’t run BeFS.

What makes such a profound difference is precisely the function \boldsymbol{f} used for ordering the frontier. In BeFS, f is an input parameter that we can vary. In contrast, we can’t change the way UCS sorts its queue.

Which is better? When should we use BeFS and when UCS? First of all, since UCS is an instance of BeFS, we’re technically executing BeFS when we run UCS. But, the BeFS formulation of UCS may be slower in practice since the frontier ordering function is an actual function that would add its frames to the recursion stack. The alternative is to implement UCS from scratch and without any reference to BeFS.

We can implement any BeFS strategy without calling the BeFS routine. Moreover, those implementations can be faster because ordering the frontier doesn’t involve calling a function of its own.

However, it’s easier to implement an algorithm by specifying only the function for the frontier instead of coding the entire algorithm. That’s certainly the case if we want to switch from one search strategy to another.

7. Conclusion

In this article, we talked about Uniform-Cost Search (UCS) and Best-First Search (BeFS). The former is a fully specified uninformed search strategy, whereas the latter is a generic algorithm that can simulate various search methods, both informed and uninformed. While we can simulate UCS with BeFS, native UCS implementations will usually execute faster.

Comments are open for 30 days after publishing a post. For any issues past this date, use the Contact form on the site.