1. Introduction

Directed graphs are usually used in real-life applications to represent a set of dependencies. For example, a course pre-requisite in a class schedule can be represented using directed graphs.

And cycles in this kind of graph will mean deadlock — in other words, it means that to do the first task, we wait for the second task, and to do the second task, we wait for the first. So, detecting cycles is extremely important to avoid this situation in any application.

2. Problem Explanation

In a directed graph, we’d like to find cycles. These cycles can be as simple as one vertex connected to itself or two vertices connected as shown:

 

Or, we can have a bigger cycle like the one shown in the next example, where the cycle is B-C-D-E:

 

Also, we can get graphs with multiple cycles intersecting with each other as shown (we have three cycles: A-B-C-E-D, B-C-E-D, and E-D-F):

We should also notice that in all previous examples, we can find all cycles if we traverse the graphs starting from any node. However, this isn’t true in all graphs.

In some graphs, we need to start visiting the graph from different points to find all cycles as in the graph, as shown in the following example (Cycles are C-D-E and G-H):

 

 

3. Algorithm Idea

Finding cycles in a simple graph as in the first two examples in our article is simple. We can traverse all the vertices and check if any of them is connected to itself or connected to another vertex that is connected to it. But, of course, this isn’t going to work in more complex scenarios.

So, one famous method to find cycles is using Depth-First-Search (DFS). By traversing a graph using DFS, we get something called DFS Trees. The DFS Tree is mainly a reordering of graph vertices and edges. And, after building the DFS trees, we have the edges classified as tree edges, forward edges, back edges, and cross edges.

Let’s see an example based on our second graph:

We can notice that the edge E-B is marked as a back edge in the previous example. A back edge is an edge that is connecting one of the vertices to one of its parents in the DFS Tree.

In the previous example, we also have the edge A-C marked as a forward edge. A forward edge is an edge connecting a parent to one of the non-direct children in the DFS tree.

Notice that we also have the normal edges in solid lines in our graph. These solid line edges are the tree edges. The Tree edges are defined as the edges that are the main edges visited to make the DFS tree. 

Let’s see another example where we have multiple back edges:

Notice that with each back edge, we can identify cycles in our graph. So, if we remove the back edges in our graph, we can have a DAG (Directed Acyclic Graph).

Now, let’s see one final example to illustrate another issue we might face:

In this last example, we can see an edge marked as a cross edge.

To understand this part, we need to think of the DFS over the given graph. If we start the DFS on this graph starting from vertex A, we’ll visit the vertices A, C, D, E, and F.

Still, we’ll not see the vertices B, G, and H. So, in this case, we need to restart the DFS after the first run from a different point that was not visited, such as the vertex B. As a result of these multiple DFS runs, we’ll have multiple DFS Trees.

To conclude the idea in this example, we can have multiple DFS trees for a single graph. And the cross edge is in our example is the edge connecting vertex from one of the DFS trees to another DFS tree in our graph.

A cross edge can also be within the same DFS tree if it doesn’t connect a parent and child (an edge between different branches of the tree).

However, in this tutorial, we’re mainly interested in the back edges of the DFS tree as they’re the indication for cycles in the directed graphs.

4. Flow-Chart

We can illustrate the main idea simply as a DFS problem with some modifications. The first thing is to make sure that we repeat the DFS from each unvisited vertex in the graph, as shown in the following flow-chart:

 

The second part is the DFS processing itself. In this part, we need to make sure we have access to what is in the stack of the DFS to be able to check for the back edges.

And whenever we find a vertex that is connected to some vertex in the stack, we know we’ve found a cycle (or multiple cycles):

 

If our goal is to print the first cycle, we can use the illustrated flow-chart to print the cycle using the DFS stack and a temporary stack:

However, if our goal is to convert the graph to an acyclic graph, then we should not print the cycles (as printing all cycles is an NP-Hard problem).

Instead, we should mark all the back edges found in our graph and remove them.

5. Pseudocode

Our next part of this tutorial is a simple pseudocode for detecting cycles in a directed graph.

In this algorithm, the input is a directed graph. For simplicity, we can assume that it’s using an adjacency list.

The first function is an iterative function that reads the graph and creates a list of flags for the graph vertices (called visited in this pseudocode) that are initially marked as NOT_VISITED.

Then, the function iterates over all the vertices and calls the function processDFSTree when it finds a new NOT_VISITED vertex:

Rendered by QuickLaTeX.com

The processDFSTree function is a recursive function that takes three inputs: the graph, the visited list, and a stack. After that, the function is mainly doing DFS, but it also marks the found vertices as IN_STACK when they’re first found.

And after processing the vertex, we mark it as DONE. In case we find a vertex that is IN_STACK, we’ve found a back edge, which indicates that we’ve found a cycle (or cycles).

So, we can print the cycle or mark the back edge:

Rendered by QuickLaTeX.com

Finally, let’s write the function that prints the cycle:

Rendered by QuickLaTeX.com

The printing function mainly needs the stack and the vertex that was found with the back edge. Then, the function pops the elements from the stack and prints them, then pushes them back, using a temporary stack.

6. Complexity

The time complexity of the given algorithm is mainly the DFS, which is O(|V| + |E|).

As for the space-complexitywe need to allocate space for marking the vertices and for the stack, so it’ll be in the order of O(|V|).

7. Conclusion

In this tutorial, we covered one of the algorithms to detect cycles in directed graphs. At first, we discussed one of the important applications for this algorithm. Then, we explained the idea and showed the general algorithm idea using examples, flow-charts, and pseudocode.

Finally, we showed an average space and time complexity for the algorithm.

Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments