Skip to content

Merging Overlapping Intervals in Python: Solutions, Code Examples, Complexity Analysis

Updated: at 02:01 AM

Merging overlapping intervals is a common coding interview question that assesses a candidate’s skills in algorithms and data structures. This programming task involves taking a collection of intervals, identifying any intervals that overlap, and merging them into a single continuous interval.

In Python, intervals can be represented as tuples or lists containing a start and end value. The key steps are to sort the intervals by start time, iterate through the sorted intervals, compare overlaps, and combine overlapping intervals. While conceptually simple, implementing an optimal solution that runs efficiently on large inputs requires knowledge of algorithms like sorting, binary search, and computational geometry.

This comprehensive guide will walk through techniques to merge intervals in Python. We will cover the problem statement, illustrate with examples, present multiple solutions using built-in functions and algorithms, analyze time and space complexities, include runnable example code snippets with explanations, and discuss test cases. Following this tutorial will provide Python programmers the knowledge to handle interval merging questions confidently in coding interviews and applications dealing with schedules, calendars, reservations, etc.

Table of Contents

Open Table of Contents

Overview of the Problem

The interval merging problem can be formally stated as:

Given a collection of intervals, merge all overlapping intervals to produce a set of mutually exclusive intervals.

For example:

Input: [[1,4], [2,5], [7,9]]
Output: [[1,5], [7,9]]
# Since [1,4] and [2,5] overlap, they are merged into [1,5]

The key aspects are:

This problem has applications in calendar scheduling, data visualization, and CPU task scheduling. Before diving into solutions, let’s look at example test cases to clarify the requirements.

Example Test Cases

Input: [[1,4], [2,6], [8,10], [15,18]]
Output: [[1,6], [8,10], [15,18]]

Input: [[1,4], [2,3]]
Output: [[1,4]]

Input: [[1,10], [2,6], [3,5], [7,9]]
Output: [[1,10]]

Input: [[2,3], [4,5], [6,7], [8,9], [1,10]]
Output: [[1,10]]

Key Observations

These test cases illustrate the core requirements - handle unsorted intervals, merge all types of overlaps, and return mutually exclusive intervals. This forms the basis for developing correct and efficient solutions, which we will explore next.

Solution 1: Using Built-in Functions

Python’s built-in sorted(), min(), and max() functions can directly solve this problem concisely:

def merge_intervals(intervals):

  sorted_intervals = sorted(intervals, key=lambda x: x[0]) # Sort by start time

  merged = []

  for interval in sorted_intervals:
    if not merged or merged[-1][1] < interval[0]: # No overlap
      merged.append(interval)
    else: # Overlap, merge intervals
      merged[-1][1] = max(merged[-1][1], interval[1])

  return merged

How it Works:

  1. Sort the intervals by start time using sorted() and a lambda key function.

  2. Iterate through the sorted intervals.

  3. Compare each interval with the last interval in the merged list:

    • If no overlap, directly append the interval to merged.
    • If overlap, update the end time of last interval in merged to the max of the end times.
  4. Return the final merged intervals list.

This leverages Python’s simple yet powerful built-in functions to solve interval merging in a clean, readable manner without explicitly writing sorting or searching logic.

Time Complexity: O(n log n)
Sorting takes O(n log n) time. Overall complexity is dominated by the sorting step.

Space Complexity: O(n) The merged list contains at most n intervals, so space used is linear O(n).

Solution 2: Sort and Scan

We can also implement the sort-and-scan solution manually without built-in functions:

def merge_intervals(intervals):

  intervals.sort(key=lambda x: x[0]) # Custom sort on start time

  merged = []
  for interval in intervals:
    if not merged or merged[-1][1] < interval[0]:
      merged.append(interval)
    else:
      merged[-1][1] = max(merged[-1][1], interval[1])

  return merged

This follows the same overall approach - sort on start time, scan linearly to find/merge overlaps. The key differences are:

The time and space complexity remains unchanged at O(n log n) and O(n) respectively. This illustrates writing a sorting algorithm manually without built-in functions.

Solution 3: Sweep Line Algorithm

The sweep line algorithm is a common technique used in computational geometry problems like interval scheduling. It utilizes the following steps:

  1. Sort intervals by start time.
  2. Initialize sweep line position to the start of first interval.
  3. Iterate through intervals:
    • If current interval start > sweep line position, no overlap. Move sweep line position to start of current interval.
    • If overlap, merge intervals by updating end to max end of overlapping intervals.
  4. Return merged intervals.
def merge_intervals(intervals):

  intervals.sort(key=lambda x : x[0])

  merged = []
  sweep_line = intervals[0][0]

  for interval in intervals:
    start = interval[0]
    end = interval[1]

    if start > sweep_line: # No overlap
      merged.append(interval)
      sweep_line = start
    else: # Overlap, merge intervals
      merged[-1][1] = max(merged[-1][1], end)

  return merged

This demonstrates the sweep line algorithm which avoids explicitly tracking overlap logic. The runtime remains O(n log n) due to sort, but the scanning steps are simplified.

Solution 4: Using a Binary Search Tree

A self-balancing Binary Search Tree (BST) can also efficiently solve the interval merging problem in O(n log n) time.

The steps are:

  1. Insert each interval sorted by start time into BST.
  2. Traverse BST in order and if current node overlaps with previous, merge them.
  3. Return merged intervals from BST.
from bst import BSTNode, BST

# BST Node class
class IntervalNode(BSTNode):
  def __init__(self, interval, left=None, right=None):
    self.interval = interval
    super().__init__(interval[0], left, right)

# BST class
class IntervalBST(BST):

  def insert(self, interval):
    self.root = self._insert(self.root, IntervalNode(interval))

  def traverse(self):
    intervals = []
    self._inorder(self.root, intervals)
    return intervals

  def _inorder(self, root, intervals):
    if root:
      self._inorder(root.left, intervals)
      intervals.append(root.interval)
      self._inorder(root.right, intervals)

  def merge_overlaps(self):
    return self._merge(self.traverse())

  def _merge(self, intervals):
    merged = []
    for interval in intervals:
      # Merge overlap logic

    return merged

def merge_intervals(intervals):

  bt = IntervalBST()

  for interval in intervals:
    bt.insert(interval)

  return bt.merge_overlaps()

This leverages the BST structure to efficiently sort, search, and merge intervals in one traversal. The self-balancing property also guarantees O(log n) operations. For coding interviews, this shows expertise in complex data structures like BSTs.

Solution 5: Using a Segment Tree

A segment tree is a specialized tree data structure used for interval related queries like merging, searching, etc. It follows these steps:

  1. Build the segment tree from the input intervals.
  2. Use tree intervals to efficiently merge overlapping intervals.
  3. Return merged intervals from tree.
class SegmentNode:

  def __init__(self, start, end):
    self.start = start
    self.end = end
    self.left = None
    self.right = None

# Build tree
def build(intervals):

  # Recursively build tree
  def build_tree(start, end):
    if start > end:
      return None

    mid = start + (end - start) // 2
    left = build_tree(start, mid)
    right = build_tree(mid + 1, end)
    return SegmentNode(start, end, left, right)

  return build_tree(0, len(intervals))

# Merge overlaps
def merge(root):

  # Recursive merge
  def merge_recur(root):
    if not root:
      return

    merge_recur(root.left)
    merge_recur(root.right)

    # Merge logic if root overlaps with previous

  return root

def merge_intervals(intervals):

  tree = build(intervals)
  merged = merge(tree)

  return merged

Segment trees provide O(log n) query times. This is an optimal and efficient data structure for interval related problems. In interviews, this displays strong data structure knowledge.

Solution Analysis

SolutionTime ComplexitySpace Complexity
Built-in FunctionsO(n log n)O(n)
Sort and ScanO(n log n)O(n)
Sweep LineO(n log n)O(n)
Binary Search TreeO(n log n)O(n)
Segment TreeO(n log n)O(n)

So in summary, for most cases, the built-in or sort-and-scan solutions provide the best balance of simplicity and efficiency. BST or Segment tree solutions are only needed for advanced use cases dealing with many real-time interval queries.

Tips and Tricks

Here are some additional tips for solving interval merge problems efficiently:

Summary

This guide covered a variety of techniques to merge overlapping intervals in Python:

The key ideas are to sort upfront by start interval, scan linearly to find/merge overlaps, and return a final set of mutually exclusive merged intervals.

Interval manipulation problems frequently appear in coding interviews and calendar/scheduling applications. I hope this detailed walkthrough provides a deeper understanding of intervals in Python along with the confidence to tackle related algorithm questions. Please leave any questions in the comments!