A graph is a powerful and versatile data structure used to model connections and relationships between entities in various domains like networking, data science, and more. In this comprehensive guide, we will examine key concepts of graph theory, implement a graph class in Python from scratch with vertex and edge objects, and traverse the graph depth-first using recursion.
Table of Contents
Open Table of Contents
Introduction
A graph organizes items in an interconnected network, linking node points called vertices with edges. The connections modeled by graphs can have direction (directed graph) or lack direction (undirected graph). Graphs have widespread applications in areas like networks, data organization, recommendation systems, and more.
In Python, graphs can be implemented in various ways using dictionaries, lists, classes, or libraries like NetworkX. We will focus on building a custom graph class to gain a deeper understanding of graph data structures and traversals fundamentals.
Here are the key topics we will cover:
- Graph terminology and representations
- Implementing a graph class in Python
- Vertex objects
- Edge objects
- Add vertex and add edge methods
- Traversing a graph depth-first
- Recursive DFS traversal
By the end of this guide, you will have hands-on experience constructing a graph data structure from scratch and implementing a foundational graph search algorithm in Python.
Graph Terminology and Representations
Let’s briefly review some key graph theory concepts before diving into the Python code:
- Vertex - Also called a node, this is a fundamental unit representing an entity in a graph.
- Edge - An edge denotes a connection or relationship between two vertices. It can be one-way or two-way.
- Degree - The degree of a vertex is the number of edges connected to it.
- Path - A path is a sequence of vertices connected by edges.
- Directed vs. Undirected - Edges can be directed (one-way relationship) or undirected (bi-directional relationship).
- Weighted Graphs - Edges can have an associated weight or cost.
- Cyclic vs. Acyclic - Cyclic graphs contain cycles (paths that end at the starting vertex). Acyclic graphs have no cycles.
There are different ways to represent a graph programmatically. Two common options are the adjacency matrix and adjacency list:
- Adjacency matrix - A V x V matrix (V = number of vertices) where the entry at row i and column j indicates if there is an edge from vertex i to vertex j.
- Adjacency list - An array or hash table of linked lists representing each vertex’s adjacent edges. More space efficient for sparse graphs.
We will implement the adjacency list representation using Python dictionaries and objects.
Implementing a Graph Class in Python
Let’s model a directed, weighted graph in Python focused on simplicity and extensibility. We will create two classes - Vertex
and Graph
.
The Vertex
class will represent each node in the graph, storing its unique ID.
The Graph
class will manage the vertex and edge objects, and include methods to add vertices/edges and traverse the graph.
Vertex Class
First, we define the Vertex
class which holds a unique ID to identify the vertex:
class Vertex:
def __init__(self, vertex_id):
self.id = vertex_id
This simple class serves as a basic wrapper representing each vertex in the graph.
Next we can implement the graph class using these vertex objects.
Graph Class
The Graph
class will contain a dictionary to map vertex IDs to Vertex
objects. It will also store the edges in a dictionary keyed by source vertex ID.
We initialize an empty vertices
dictionary and edges
dictionary in the constructor:
class Graph:
def __init__(self):
self.vertices = {}
self.edges = {}
Next we implement the add_vertex
method to add new vertex objects to the graph:
def add_vertex(self, vertex_id):
if vertex_id not in self.vertices:
self.vertices[vertex_id] = Vertex(vertex_id)
This first checks if that vertex ID already exists in the graph before creating a new Vertex
object and adding to the vertices dictionary.
Similarly, we implement add_edge
to add a weighted edge from source to destination vertex:
def add_edge(self, src_id, dest_id, weight=1):
if src_id not in self.vertices:
self.add_vertex(src_id)
if dest_id not in self.vertices:
self.add_vertex(dest_id)
self.edges.setdefault(src_id, []).append((dest_id, weight))
This first ensures the source and destination vertices exist in the graph. Then it adds the edge to the edge dictionary, initializing the source vertex’s key with an empty list if needed.
With these components, our full graph class looks like:
class Vertex:
def __init__(self, vertex_id):
self.id = vertex_id
class Graph:
def __init__(self):
self.vertices = {}
self.edges = {}
def add_vertex(self, vertex_id):
if vertex_id not in self.vertices:
self.vertices[vertex_id] = Vertex(vertex_id)
def add_edge(self, src_id, dest_id, weight=1):
if src_id not in self.vertices:
self.add_vertex(src_id)
if dest_id not in self.vertices:
self.add_vertex(dest_id)
self.edges.setdefault(src_id, []).append((dest_id, weight))
This gives us a basic graph class to store vertices and weighted directed edges between them.
Next, we will look at traversing the graph depth-first using recursion.
Traversing a Graph Depth-First with Recursion
Depth-first search (DFS) is a fundamental graph traversal algorithm that explores vertices branch by branch down to the leaves before backtracking. It uses a stack (LIFO) to track the path.
We will implement a recursive DFS method on our graph class using visited sets and vertex coloring.
DFS Algorithm
The pseudo-code for a recursive depth-first search is:
DFS(graph, current_vertex):
If current_vertex is visited:
Return
Mark current_vertex as visited
For each neighbor in current_vertex's neighbors:
DFS(graph, neighbor)
This recursively calls DFS on each neighboring vertex before backtracking. We will color vertices gray when first visited and black once fully explored.
Implement DFS in Python
Here is an implementation of DFS on our graph class:
WHITE = 0
GRAY = 1
BLACK = 2
def dfs(self, start_id):
visited = set()
self.dfs_helper(start_id, visited)
def dfs_helper(self, vertex_id, visited):
visited.add(vertex_id)
print(vertex_id)
vertex = self.vertices[vertex_id]
for neighbor_id in self.edges[vertex_id]:
if neighbor_id not in visited:
self.dfs_helper(neighbor_id, visited)
The dfs
method calls a recursive helper, passing in the visited
set. The helper marks the vertex as visited, prints it, then recursively visits all unvisited neighbors.
We can test this on a sample graph:
graph = Graph()
for id in range(5):
graph.add_vertex(id)
graph.add_edge(0, 1)
graph.add_edge(1, 3)
graph.add_edge(2, 4)
graph.add_edge(3, 4)
graph.dfs(0)
This builds a sample graph then executes DFS starting at vertex 0, outputting:
0
1
3
4
2
The vertices are explored depth-first branch by branch!
Analysis
This DFS implementation visits each vertex and edge once, giving it a runtime of O(V + E) for V vertices and E edges.
The space complexity is O(V) to keep visited vertices in the set.
DFS is less memory intensive than breadth-first search and serves as a building block for many key graph algorithms.
Summary
In this guide, we learned graph terminology and representations before implementing a graph class in Python. Our graph uses a vertices dictionary and edges dictionary to store the nodes and connections.
We wrote add vertex and add edge methods to populate the graph. Then we traversed the graph depth-first recursively using visited sets and vertex coloring.
Key concepts covered include:
- Graph basics like directed vs undirected, cyclic vs acyclic
- Adjacency list representation using Python dicts and objects
- Vertex and graph classes to encapsulate the data
- Adding vertices and weighted directed edges
- Recursive depth-first search traversal withvisited sets
This foundational introduction can provide a great starting point for further graph algorithm study. Next steps could include expanding to analyze cycles, finding shortest paths, topological sorting, and minimum spanning trees.
There are also robust graph libraries like NetworkX we can build on in Python once familiar with the core graph concepts.
Graph data structures open up modeling many real-world systems from social networks to web links. Hopefully this guide provided a solid basis for implementing and traversing graphs in your own Python projects.