A min-heap is a specialized tree-based data structure used to efficiently maintain the minimum value in a dataset while allowing for quick inserts and extracts. Min-heaps have widespread applications in algorithms and systems that rely on efficient access to minimum values, including Dijkstra’s algorithm, scheduling systems, and graph algorithms.
This how-to guide will provide Python developers with a comprehensive overview of min-heaps and detailed step-by-step instructions for implementing a min-heap class in Python. We will cover min-heap properties, use cases, time complexities, and walk through coding a MinHeap class with insert()
and extract_min()
methods. Relevant examples and best practices for Python coding style, documentation, and testing will be included.
By the end of this guide, readers should have a strong conceptual understanding of min-heaps and the skills to implement a fully functional min-heap in Python to integrate into their own projects and applications. The information presented aims to be accessible for Python developers at all skill levels.
Table of Contents
Open Table of Contents
Overview of Min-Heaps
A min-heap is a complete binary tree (a tree where all levels are filled except the last) that satisfies the heap property - each node is less than or equal to its child nodes. The root node of the min-heap contains the minimum value in the tree.
Properties
Min-heaps have several notable properties:
-
Complete binary tree structure: Min-heaps are complete binary trees, meaning they are filled row-by-row from left to right except the bottom row, which may be incomplete.
-
Heap property: The value of each node must be less than or equal to the value of its child nodes. The minimum value is always at the root.
-
Height: As a complete binary tree, the height (longest path from root to leaf) of a min-heap with N nodes is O(log N).
-
Inserts and extracts in O(log N) time: By maintaining the complete tree structure, inserts and extracts take O(log N) time.
Use Cases
Due to their structure, min-heaps are extremely useful for the following:
-
Priority queues: Heaps efficiently support priority queue operations, allowing minimum values to be extracted rapidly.
-
Graph algorithms: Min-heaps speed up algorithms like Dijkstra’s shortest path, which relies on extracting minimum values.
-
Scheduling systems: The minimum value can be extracted efficiently to determine the next task to schedule.
-
Sorting: Min-heaps can be used to heapsort a dataset by repeatedly extracting the min into a sorted array.
Min-Heap Operations
The two core operations used in a min-heap are:
Insert
To insert a new element into the min-heap:
-
Add the new element to the end of the heap array.
-
Compare the added element with its parent; if it is less than its parent, swap it.
-
Continue bubbling the element up, swapping with parents as necessary, until it is no longer less than its parent.
Extract Minimum
To extract the minimum value from the min-heap:
-
Save the minimum value from the root node.
-
Move the last element in the heap to the root.
-
Compare the new root with its children; if greater than either child, swap with the smaller child.
-
Continue sinking the new root downwards, swapping with smaller children, until the min-heap property is restored.
-
Return the saved minimum value.
Maintaining the min-heap structure through swaps ensures inserts and extracts take O(log N) time.
Implementing a MinHeap Class in Python
We will now walk through how to implement a MinHeap
class in Python containing insert
and extract_min
methods along with a _bubble_up
helper method.
MinHeap Class and Constructor
First we’ll define the overall MinHeap
class:
class MinHeap:
def __init__(self):
self.heap_list = [None]
self.count = 0
The constructor initializes the underlying heap list as a Python list containing a None
value at index 0. This allows for easier index calculations, since for any node at index k, its children are at 2*k+1
and 2*k+2
.
count
tracks the number of elements in the heap.
Insert Method
The insert
method adds a new element to the heap:
def insert(self, element):
self.count += 1
self.heap_list.append(element)
self._bubble_up(self.count)
It increments the count, appends the new element to the end of the heap list, and calls _bubble_up
to restore the heap structure.
Extract Minimum Method
The extract_min
method removes and returns the minimum element:
def extract_min(self):
if self.count == 0:
return None
min = self.heap_list[1]
self.heap_list[1] = self.heap_list[-1]
self.count -= 1
self.heap_list.pop()
self._bubble_down(1)
return min
It first handles the edge case of an empty heap. Otherwise, it saves the root minimum, replaces it with the last element in the heap, decrements the count, and removes the now-duplicated last element. Calling _bubble_down
on the new root restores the heap property.
_bubble_up Helper Method
_bubble_up
bubbles a new element up to restore the heap structure:
def _bubble_up(self, index):
parent = index // 2
if index <= 1:
return
if self.heap_list[index] < self.heap_list[parent]:
self.heap_list[index], self.heap_list[parent] = self.heap_list[parent], self.heap_list[index]
self._bubble_up(parent)
It swaps the element with its parent if it is less than its parent, and continues recursively bubbling up. Base cases are reaching index 1 or being greater than or equal to the parent.
_bubble_down Helper Method
_bubble_down
sinks a new root element downwards:
def _bubble_down(self, index):
left = 2 * index
right = 2 * index + 1
smallest = index
if left <= self.count and self.heap_list[left] < self.heap_list[index]:
smallest = left
if right <= self.count and self.heap_list[right] < self.heap_list[smallest]:
smallest = right
if smallest != index:
self.heap_list[index], self.heap_list[smallest] = self.heap_list[smallest], self.heap_list[index]
self._bubble_down(smallest)
It compares the root to its left and right children, swapping with the smaller child until the heap property is satisfied.
Full Implementation
class MinHeap:
def __init__(self):
self.heap_list = [None]
self.count = 0
def insert(self, element):
self.count += 1
self.heap_list.append(element)
self._bubble_up(self.count)
def extract_min(self):
if self.count == 0:
return None
min = self.heap_list[1]
self.heap_list[1] = self.heap_list[-1]
self.count -= 1
self.heap_list.pop()
self._bubble_down(1)
return min
def _bubble_up(self, index):
parent = index // 2
if index <= 1:
return
if self.heap_list[index] < self.heap_list[parent]:
self.heap_list[index], self.heap_list[parent] = self.heap_list[parent], self.heap_list[index]
self._bubble_up(parent)
def _bubble_down(self, index):
left = 2 * index
right = 2 * index + 1
smallest = index
if left <= self.count and self.heap_list[left] < self.heap_list[index]:
smallest = left
if right <= self.count and self.heap_list[right] < self.heap_list[smallest]:
smallest = right
if smallest != index:
self.heap_list[index], self.heap_list[smallest] = self.heap_list[smallest], self.heap_list[index]
self._bubble_down(smallest)
This provides a full implementation of insert
and extract_min
along with the helper methods to maintain the min-heap structure.
Example Usage
We can test our MinHeap
class:
min_heap = MinHeap()
min_heap.insert(5)
min_heap.insert(6)
min_heap.insert(2)
min_heap.insert(9)
min_heap.insert(3)
print(min_heap.extract_min()) # 2
print(min_heap.extract_min()) # 3
print(min_heap.extract_min()) # 5
This inserts 5 elements, extracts the minimums in order, and prints 2, 3, 5.
Time and Space Complexity
The time complexities of operations on a MinHeap with N elements are:
insert
- O(log N)extract_min
- O(log N)
At most log N swaps are required to bubble up or down to restore the heap structure.
The space complexity is O(N) since we store N heap elements.
Testing the Implementation
To properly test this implementation, we should write unit tests covering typical min-heap use cases and edge cases:
import unittest
from minheap import MinHeap
class TestMinHeap(unittest.TestCase):
def test_insert(self):
heap = MinHeap()
heap.insert(4)
self.assertEqual(heap.heap_list, [None, 4])
heap.insert(3)
heap.insert(10)
self.assertEqual(heap.heap_list, [None, 3, 10, 4])
heap.insert(1)
self.assertEqual(heap.heap_list, [None, 1, 3, 4, 10])
def test_extract_min(self):
heap = MinHeap()
heap.insert(5)
heap.insert(3)
heap.insert(7)
self.assertEqual(heap.extract_min(), 3)
self.assertEqual(len(heap.heap_list), 3)
self.assertEqual(heap.heap_list, [None, 5, 7])
def test_empty_heap(self):
heap = MinHeap()
self.assertIsNone(heap.extract_min())
if __name__ == '__main__':
unittest.main()
This includes test cases for inserts, extracts, and empty heaps. Running python test_minheap.py
would run these tests against our implementation.
Alternate Implementations
There are a few variations on min-heap implementations in Python:
-
Using a Python
list
as done here is simple but not cache-efficient. For larger heaps, andarray
improves cache performance. -
Can also implement with a class representing heap nodes, not just a list. Allows storing extra data per node.
-
The heap structure can also be represented using a binary tree class with references between nodes. May simplify visualization.
-
Python’s
heapq
module provides min-heap implementation in the standard library, but lacks a full class interface.
Applications of Min-Heaps
Some examples of how min-heaps are used in real systems:
-
Dijkstra’s Algorithm uses a min-heap to quickly extract the unvisited vertex with the smallest distance for each iteration. Significantly speeds up pathfinding.
-
Job schedulers like Apache Hadoop place jobs in a min-heap to always extract the shortest job to run next for optimal throughput.
-
Memory management can use a min-heap to find the smallest unused memory block to allocate on a request.
-
Graph algorithms use min-heaps as priority queues for primality testing, finding minimum spanning trees, and detecting shortest paths.
-
Sorting algorithms like heapsort use a min-heap to efficiently sort data.
Conclusion
This guide provided a comprehensive walkthrough of min-heaps and how to implement them in Python. We covered the key concepts, properties, operations, and applications of min-heaps. Detailed Python code samples, complexity analysis, testing techniques, and real-world examples were included to give developers a solid technical foundation as well as practical coding skills to build min-heaps. Readers should now have the understanding to integrate min-heaps into their own Python projects to efficiently access minimum values. There are many exciting applications of min-heaps across data structures, algorithms, and systems programming.