Linked lists are common data structures used in many algorithms and applications. They consist of nodes that contain data and a reference (or pointer) to the next node in the list. Unlike arrays, linked lists do not store data contiguously in memory, allowing for efficient insertion and removal of nodes.
A key question that arises with linked lists is whether they contain cycles, where a node’s next reference points back to an earlier node, creating a loop in the list. Detecting cycles is important for several reasons:
-
It can indicate a bug in code that constructs linked lists. Cycles can lead to infinite loops and memory leaks.
-
Certain algorithms require acyclic linked lists to function properly. Identifying cycles allows you to correctly handle them.
-
Knowing if a cycle exists allows you to determine the start of the cycle. This node’s reference mistakenly points backward instead of forward.
In this comprehensive Python programming guide, you will learn approaches to detect cycles in a linked list. The step-by-step instructions provided include example code and explanations of the key concepts. By the end, you will be able to implement efficient solutions to this common technical interview question using Python.
Table of Contents
Open Table of Contents
Prerequisites
To fully understand this guide, you should have:
- Knowledge of Python (variables, conditional logic, loops, functions, etc.)
- Understanding of object-oriented programming concepts like classes and objects
- Basic knowledge of linked list data structures and node connections
The examples will use a simple Node
class to represent each element in the linked list:
class Node:
def __init__(self, data):
self.data = data
self.next = None
The data
attribute stores the node’s data, while next
points to the following node.
Approach 1: Hash Table
The hash table approach keeps track of nodes seen while traversing the linked list. If a node is seen more than once, a cycle must exist. The steps are:
-
Initialize a hash table or set to store seen nodes.
-
Traverse the linked list from the head node.
-
For each node, check if it is in the hash table.
-
If not, add it and move to the next node.
-
If it already exists in the hash table, a cycle is present.
Here is Python code implementing this technique:
# Initialize hash table
seen = set()
current = head
while current:
# Check if node exists
if current in seen:
return True # Cycle detected
# Add to hash table
seen.add(current)
# Move to next node
current = current.next
return False # No cycle detected
The time complexity is O(N) since each node is visited at most once. The space complexity is O(N) for the hash table storing N nodes.
Hash Table Approach Pros and Cons
Pros:
-
Simple and intuitive solution
-
Constant time insertions and lookups in hash table
-
Linear time and space complexities
Cons:
- Requires extra storage for the hash table proportional to nodes in the list
Approach 2: Two Pointers
The two pointer technique checks for a cycle by using 2 pointers traversing the list at different speeds. If the linked list has a cycle, the pointers will eventually meet inside the cycle.
The steps for this approach are:
-
Initialize two pointers - slow and fast
-
Increment slow by 1 node, fast by 2 nodes
-
If fast encounters a null node, no cycle exists
-
If slow and fast meet, a cycle is present
-
To find the start node, increment a pointer from head and another from meeting node until they collide
Here is the Python implementation:
def has_cycle(head):
# Initialize slow and fast pointers
slow, fast = head, head
while fast and fast.next:
# Increment slow by 1, fast by 2
slow = slow.next
fast = fast.next.next
# Check if pointers meet
if slow == fast:
return True
# fast reached null, so no cycle
return False
# If cycle found, find start node by incrementing
# pointer p1 from head, and p2 from meeting point
def start_node(head):
# Check for cycle
if not has_cycle(head):
return None
# Initialize p1 from head, p2 from meeting point
p1 = head
p2 = head
# Increment p1 by 1, p2 by 1 until they collide
while p1 != p2:
p1 = p1.next
p2 = p2.next
# Return start node
return p1
The time complexity is O(N) to detect a cycle and O(N) to find the start node. Only two pointers are used, so the space complexity is O(1).
Two Pointer Approach Pros and Cons
Pros:
-
Faster than hash table approach - no extra data structure
-
Constant O(1) space usage
-
Can find start of cycle node
Cons:
-
More complex logic than hash table approach
-
Requires meeting point to find start node, adding extra steps
Approach 3: Set Node Values
This technique modifies node values as the list is traversed. The steps are:
-
Initialize variable value to 0
-
Traverse list and set node values to value
-
Increment value after setting each node
-
If a node with value already set is found, a loop exists
Here is Python code for this algorithm:
value = 0
current = head
while current:
# Cycle exists if node value
# already changed
if current.data == value:
return True
# Set node value
current.data = value
# Increment for next node
value += 1
current = current.next
return False # No cycle
By changing node values, we can detect if a node has been visited before. The time complexity is O(N) to visit each node once. The space used is O(1).
Set Values Approach Pros and Cons
Pros:
-
In-place algorithm, no extra memory needed
-
Simple concept
-
Linear time complexity
Cons:
-
Modifies the existing linked list
-
May not work if data stored cannot be modified
-
Slower than two pointer approach
Approach 4: Recursive Depth-First Search
This recursive algorithm traverses the graph depth-first, keeping track of visited nodes. If a node is visited more than once, there is a cycle.
Here are the steps:
-
Create a visited set to track visited nodes
-
Define recursive DFS function accepting current node
-
Mark node visited and recursively call DFS on adjacent unvisited nodes
-
If node was already visited, return True for cycle
-
Return False if all nodes are visited once
And here is a Python implementation:
visited = set()
def dfs(node):
# Visited before, cycle exists
if node in visited:
return True
# Add to visited
visited.add(node)
# Recursively visit adjacent nodes
for n in node.nexts:
if dfs(n):
return True
# No cycle detected
return False
# Start at head node
return dfs(head)
The time complexity is O(N) since each node is visited once. The space used for the visited set is O(N).
Recursive DFS Pros and Cons
Pros:
-
Intuitive recursion-based cycle detection
-
Able to detect cycle in any graph, not just linked lists
Cons:
-
Extra recursive function call overhead
-
Recursive stack space usage
-
Slower than iterative approaches
Comparing Approaches
Approach | Time Complexity | Space Complexity | Finds Cycle Start | In-place |
---|---|---|---|---|
Hash Table | O(N) | O(N) | No | No |
Two Pointers | O(N) | O(1) | Yes | Yes |
Set Values | O(N) | O(1) | No | Yes |
Recursive DFS | O(N) | O(N) | No | Yes |
-
Hash table is simple but requires extra storage.
-
Two pointers is fast and finds cycle start with minimal overhead.
-
Set values directly modifies the list without other storage.
-
Recursive DFS is flexible for any graph but has recursion costs.
So in summary, the two pointer approach provides the best combination of low overhead, good performance, and ability to find the cycle start node.
Applications
Detecting linked list cycles has many useful applications:
-
Debugging - Finding cycles can reveal bugs in linked list code and prevent infinite loops or crashes.
-
System Design - Certain data structures like thread pools, file systems, and kernels use linked lists and require cycle detection.
-
Web Crawling - Crawler programs that traverse website link structures can leverage cycle detection to avoid getting stuck in loops.
-
Garbage Collection - Cycles prevent memory from being reclaimed. Cycle detection is useful for garbage collectors.
-
Network Routing - Routing protocols use complex graph structures susceptible to cycles, so identifying them improves performance and reliability.
Conclusion
This guide covered 4 techniques to determine if a cycle exists in a linked list using Python:
-
Hash table - Simple approach tracking nodes in a hash table during traversal.
-
Two pointers - Fast algorithm with constant space by iterating 2 pointers at different speeds.
-
Set values - Modify node values during traversal to detect visited nodes.
-
Recursive DFS - Leverage depth-first search and visited set to identify cycles.
The two pointer approach provides an optimal combination of speed, low overhead, and ability to find the node where the cycle originates. Linked list cycle detection is an important technique for writing robust code and improving applications like debugging, garbage collection, web crawling, and more.
By mastering these algorithms in Python and understanding their trade-offs, you will be prepared to efficiently handle this common technical interview question and tackle linked list cycle scenarios in your own programs.