1. Overview

In this tutorial, we’ll discuss the problem of finding a cycle in a singly linked list and the starting point of this cycle.

First, we’ll explain the general idea of the problem and then discuss two approaches to solving it. Secondly, we’ll present two exceptional cases and see how can one of the approaches be updated to handle them.

In the end, we’ll present a comparison between all the provided approaches.

2. Defining the Problem

Suppose we have a linked list A, which has n nodes. We want to check if the list is cyclic or not. Also, we want to find the beginning of this cycle, if found.

Before we start with the solutions, we need to know what the cycle in a list means.

As we know, the list is a collection of nodes in which each node contains a reference (in other words, a link) to the next node. We say that the list is cyclic when the last node points to one of the nodes in the list

Let’s check the example for better understanding:

Cyclic Linked List 1

 

Note that we can have only one cycle inside a list. Besides, the cause of the cycle will always be that the last node is pointing to a node inside the list because each node points to exactly one node.

3. Visited Approach

3.1. General Idea

In the simple list with no cycles, we can iterate over its elements using while loop as follows:

algorithm iterateList(A):
    // INPUT
    //   A = The linked list

    iter <- A.head
    while iter != null:
        // ...
        iter <- iter.next

The reason is that, since we don’t have a cycle, the last element points to null. Take a look at the example that shows an acyclic list:

Acyclic Linked List

However, when we have a cycle, we can’t iterate using the same way. The reason is that the last node points to another one. Therefore, the algorithm will enter an infinite loop.

To solve this issue, we can add a variable to each node that tells us whether we’ve visited it before or not. Let’s call it \boldsymbol{visited}.

3.2. Implementation

Take a look at the implementation of the visited approach:

algorithm findCycleVisited(A):
    // INPUT
    //   A = The linked list
    // OUTPUT
    //   Returns the starting node of the cycle if it exists, or null otherwise

    answer <- null
    node <- A.head

    while node != null:
        if node.visited = true:
            answer <- node
            break
        node.visited <- true
        node <- node.next

    node <- A.head
    while node != null:
        if not node.visited:
            break
        node.visited <- false
        node <- node.next

    return answer

The implementation consists of two main steps.

In the first step, we check whether there is a cycle in A. At first, we initialize answer by null. This is so that if we can’t find any loop, the answer remains null.

Next, we iterate over the nodes in A using the iterator node. In each step, we check whether we visited the current node before. If so, it means we reached a cycle, and the current node is the starting point. Therefore, we store node inside answer and stop the loop.

In case we haven’t visited this node before, we mark it as visited and continue to the next element.

Once we’re done with the first step, we mark all nodes as unvisited to restore the list to its original state. This is to allow us to use this variable in other algorithms.

Finally, we return the resulting answer.

The complexity of the visited approach is \boldsymbol{O(n)}, where n is the length of the list.

4. Special Cases for the Visited Approach

In some problems or program language, it’s tough to edit the structure of the nodes. As a result, we might not be able to add the visited variable. However, we might have a unique variable for each element. Let’s call it id.

4.1. Small Integer IDs

In case \boldsymbol{id} is an integer in the range \boldsymbol{[0, MaxVal]}, where \boldsymbol{MaxVal} is relatively small, we can add a new boolean array with \boldsymbol{MaxVal+1} elements. In each step, we check the value of \boldsymbol{visited[node.id]}, rather than checking the value inside the element.

Let’s take a look at the implementation:

algorithm findCycleSmallIntegers(A):
    // INPUT
    //   A = The linked list
    // OUTPUT
    //   Returns the starting node of the cycle if it exists, or null otherwise

    answer <- null
    node <- A.head
    visited <- array of false values

    while node != null:
        if visited[node.id] = true:
            answer <- node
            break
        visited[node.id] <- true
        node <- node.next

    return answer

Firstly, we initialize answer with null, which will hold the resulting answer. Also, we’ll initialize the array visited with false values.

Secondly, we’ll iterate over A. For each node, we check whether we’ve visited it before or not. To do this, we use the visited array.

If we haven’t visited it before, we assign true to its location inside visited and move to the next. Otherwise, we update answer and break the loop.

Finally, we return the resulting answer.

Note that if we have negative values, such that the id’s range is [-a,+b], we can still use this approach after shifting all values by a. Hence, the id’s range will be [0,a+b] and the position of id x will be x+a.

The complexity of this approach is \boldsymbol{O(n+MaxVal)}, where n is the size of the list, and MaxVal is the maximum value of the ids.

4.2. TreeSet Approach

In case the id is not an integer or has a relatively large value, we can’t use an array. Thus, we solve this case by using either a TreeSet or a HashSet data structure. In this tutorial, we’ll use a TreeSet to store the visited elements.

Let’s take a look at the implementation:

algorithm findCycleTreeSet(A):
    // INPUT
    //   A = The linked list
    // OUTPUT
    //   Returns the starting node of the cycle if it exists, or null otherwise

    answer <- null
    node <- A.head
    set <- empty TreeSet

    while node != null:
        if set.contains(node.id):
            answer <- node
            break
        set.add(node.id)
        node <- node.next

    return answer

Similarly to the previous approach, we initialize answer with null, which will hold the resulting answer. Besides that, we define set as a new empty TreeSet that will hold the visited ids.

Same as before, we’ll use the iterator node to iterate over A. However, in this approach, we’ll use the \boldsymbol{set} data structure to check whether we reach a visited node.

If we don’t find the id inside the set, we move to the next element and add it to set. Otherwise, we update the answer and break the loop.

Finally, we return the answer.

The complexity of this approach is \boldsymbol{O(n \times log(n))}, where n is the number of elements.

5. Two-Pointers Approach

In this section, we’ll explain a different idea to finding a cycle in a list.

5.1. General Idea

First of all, let’s define two iterators node1 and node2 initializing them with A.head.

Next, we iterate over \boldsymbol{A}. Each time, we move \boldsymbol{node1} one step, while we move \boldsymbol{node2} two steps. We continue this process until we either reach \boldsymbol{null} or \boldsymbol{node1} becomes eqaul to \boldsymbol{node2}.

If we reach null, we declare that no cycle is found.  On the other hand, if node1 becomes equal to node2, then we reached a loop and node1 points to one of its nodes.

To find the start of the cycle, we must find the size of the loop first. We can do this by moving node1 only until it becomes equal to node2 again and counting the steps. Hence, we get the size of the cycle denoted as m.

Now, let’s start \boldsymbol{node1} from the beginning of \boldsymbol{A} , and \boldsymbol{node2} from the \boldsymbol{m^{th}} node. We keep moving \boldsymbol{node1} and \boldsymbol{node2} one step until they meet. At this point we can either return node1 or node2 as the beginning of the cycle.

5.2. Implementation

Let’s check the implementation:

algorithm findCycleTwoPointers(A):
    // INPUT
    //   A = The linked list
    // OUTPUT
    //   Returns the starting node of the cycle if it exists, or null otherwise

    answer <- null
    node1 <- A.head
    node2 <- A.head
    foundLoop <- false

    while node1 != null and node2 != null and node2.next != null:
        if node1 = node2:
            foundLoop <- true
            break
        node1 <- node1.next
        node2 <- node2.next.next

    if not foundLoop:
        return null

    m <- 1
    node1 <- node1.next
    while node1 != node2:
        node1 <- node1.next
        m <- m + 1

    node1 <- A.head
    node2 <- A.head
    for j <- 1 to m:
        node2 <- node2.next

    while node1 != node2:
        node1 <- node1.next
        node2 <- node2.next

    return node1

We can split this implementation into four parts:

  1. We check whether we have a cycle or not. This is done by moving node1 one step, and node2 two steps each time. The process continues until they meet. When they meet, we declare finding a cycle. After the loop ends, we check whether we found a loop. If so, we return null and terminate the algorithm.
  2. We calculate the size of the cycle of m. We move node1 until it meets with node2 again. Note that we move the node1 one time before the loop because, at first, they are equal.
  3. Assign A.head to node1 and move node2 until it points to the m^{th} node.
  4. Move node1 and node2 together until they meet. Once they meet, return any of them.

The complexity of this approach is \boldsymbol{O(n)}, where n is the size of the list. Let’s explain the reason behind the linear complexity.

5.3. Two-Iterators Proof of Concept

First of all, let’s explain the idea behind the two iterators. Since node2 is moving twice as fast as node1, then once node2 will be the first one to enter the cycle. After this, node1 will enter the cycle.

At this point, we can be sure that \boldsymbol{node2} will catch up to \boldsymbol{node1}, after at most \boldsymbol{m} steps, because it’s moving faster.  Before this happens, we have multiple cases:

  1. If they meet, we must have a cycle because node2 came back to meet node1.
  2. If \boldsymbol{node2} is one step before \boldsymbol{node1}, then in the next step they will meet. Hence, we have a cycle.
  3. In case node2 is two steps before node1, then in the next step, node2 will be one step before node1, which is similar to case 2.

In general, since \boldsymbol{node2} moves two steps forward, the distance between both iterators will reduce by one step each time until they finally meet.

As a result, we can conclude that if we have a cycle, then node2 will catch up to node1, and they’ll meet at the same location after no more than n steps.

5.4. Cycle Starting Node Proof of Concept

To get the beginning of the cycle, we started node1 from the beginning of A and node2 from the m^{th} node. Let’s prove that they must meet after no more than n steps.

First of all, let’s define the distance as the number of steps needed for node1 to reach node2. In the beginning, the distance equals m, which is the size of the cycle.

Since both iterators move at the same speed, this distance will remain the same. Also, \boldsymbol{node2} will be the first to enter the cycle, followed by \boldsymbol{node1} after \boldsymbol{m} steps. When this happens, suppose \boldsymbol{node1} still needs \boldsymbol{m} steps to reach \boldsymbol{node2}.

However, since both of them are inside the cycle, if \boldsymbol{node1} moves \boldsymbol{m} steps forward, it’ll reach its current location. The reason is that the cycle has m elements. So, moving m steps from a node will get us back to the same node.

As a result, the only way that node2 is still in front of node1 by m steps is if they’re at the same node. Hence, this node is the starting node. Also, node1 needs at most n steps to reach the beginning of the cycle.

6. Comparison

Take a look at the differences between previous approaches:

Rendered by QuickLaTeX.com

By memory complexity, we mean the additional needed memory regardless of the memory already occupied by the list.

As we can see, the visited approach is simpler to implement and easier to understand. But, to use it, we need to add the visited variable to the nodes of the linked list. Hence, the O(n) memory complexity.

On the other hand, when we have small integer ids inside the list, it’s better to use the small integers approach.

Regarding the TreeSet approach, its complexity is bigger than other approaches. Thus, it’s usually not the best one to go with.

A two-pointers approach is a general approach that can handle any case. Therefore, we can always use it.

Nevertheless, it takes a little more time than the visited approach because we iterate over each element about six times.

7. Conclusion

In this tutorial, we discussed finding a cycle in a singly linked list and the starting point of this cycle.

Firstly, we explained the general idea of the problem and discussed two approaches that solve it. Besides that, we presented two special cases related to the first one.

In the end, we showed a comparison between all the approaches in terms of time and memory complexity. Also, we discussed the best practices for using each of them.

Comments are closed on this article!