Skip to content

Implementing a Graph Data Structure in Python

Updated: at 04:45 AM

A graph is a powerful and versatile data structure used to model connections and relationships between entities in various domains like networking, data science, and more. In this comprehensive guide, we will examine key concepts of graph theory, implement a graph class in Python from scratch with vertex and edge objects, and traverse the graph depth-first using recursion.

Table of Contents

Open Table of Contents

Introduction

A graph organizes items in an interconnected network, linking node points called vertices with edges. The connections modeled by graphs can have direction (directed graph) or lack direction (undirected graph). Graphs have widespread applications in areas like networks, data organization, recommendation systems, and more.

In Python, graphs can be implemented in various ways using dictionaries, lists, classes, or libraries like NetworkX. We will focus on building a custom graph class to gain a deeper understanding of graph data structures and traversals fundamentals.

Here are the key topics we will cover:

By the end of this guide, you will have hands-on experience constructing a graph data structure from scratch and implementing a foundational graph search algorithm in Python.

Graph Terminology and Representations

Let’s briefly review some key graph theory concepts before diving into the Python code:

There are different ways to represent a graph programmatically. Two common options are the adjacency matrix and adjacency list:

We will implement the adjacency list representation using Python dictionaries and objects.

Implementing a Graph Class in Python

Let’s model a directed, weighted graph in Python focused on simplicity and extensibility. We will create two classes - Vertex and Graph.

The Vertex class will represent each node in the graph, storing its unique ID.

The Graph class will manage the vertex and edge objects, and include methods to add vertices/edges and traverse the graph.

Vertex Class

First, we define the Vertex class which holds a unique ID to identify the vertex:

class Vertex:
  def __init__(self, vertex_id):
    self.id = vertex_id

This simple class serves as a basic wrapper representing each vertex in the graph.

Next we can implement the graph class using these vertex objects.

Graph Class

The Graph class will contain a dictionary to map vertex IDs to Vertex objects. It will also store the edges in a dictionary keyed by source vertex ID.

We initialize an empty vertices dictionary and edges dictionary in the constructor:

class Graph:
  def __init__(self):
    self.vertices = {}
    self.edges = {}

Next we implement the add_vertex method to add new vertex objects to the graph:

def add_vertex(self, vertex_id):
  if vertex_id not in self.vertices:
    self.vertices[vertex_id] = Vertex(vertex_id)

This first checks if that vertex ID already exists in the graph before creating a new Vertex object and adding to the vertices dictionary.

Similarly, we implement add_edge to add a weighted edge from source to destination vertex:

def add_edge(self, src_id, dest_id, weight=1):
  if src_id not in self.vertices:
    self.add_vertex(src_id)

  if dest_id not in self.vertices:
    self.add_vertex(dest_id)

  self.edges.setdefault(src_id, []).append((dest_id, weight))

This first ensures the source and destination vertices exist in the graph. Then it adds the edge to the edge dictionary, initializing the source vertex’s key with an empty list if needed.

With these components, our full graph class looks like:

class Vertex:
  def __init__(self, vertex_id):
    self.id = vertex_id

class Graph:
  def __init__(self):
    self.vertices = {}
    self.edges = {}

  def add_vertex(self, vertex_id):
    if vertex_id not in self.vertices:
      self.vertices[vertex_id] = Vertex(vertex_id)

  def add_edge(self, src_id, dest_id, weight=1):
    if src_id not in self.vertices:
      self.add_vertex(src_id)

    if dest_id not in self.vertices:
      self.add_vertex(dest_id)

    self.edges.setdefault(src_id, []).append((dest_id, weight))

This gives us a basic graph class to store vertices and weighted directed edges between them.

Next, we will look at traversing the graph depth-first using recursion.

Traversing a Graph Depth-First with Recursion

Depth-first search (DFS) is a fundamental graph traversal algorithm that explores vertices branch by branch down to the leaves before backtracking. It uses a stack (LIFO) to track the path.

We will implement a recursive DFS method on our graph class using visited sets and vertex coloring.

DFS Algorithm

The pseudo-code for a recursive depth-first search is:

DFS(graph, current_vertex):

  If current_vertex is visited:
    Return

  Mark current_vertex as visited

  For each neighbor in current_vertex's neighbors:
    DFS(graph, neighbor)

This recursively calls DFS on each neighboring vertex before backtracking. We will color vertices gray when first visited and black once fully explored.

Implement DFS in Python

Here is an implementation of DFS on our graph class:

WHITE = 0
GRAY = 1
BLACK = 2

def dfs(self, start_id):

  visited = set()
  self.dfs_helper(start_id, visited)

def dfs_helper(self, vertex_id, visited):
  visited.add(vertex_id)
  print(vertex_id)

  vertex = self.vertices[vertex_id]
  for neighbor_id in self.edges[vertex_id]:
    if neighbor_id not in visited:
      self.dfs_helper(neighbor_id, visited)

The dfs method calls a recursive helper, passing in the visited set. The helper marks the vertex as visited, prints it, then recursively visits all unvisited neighbors.

We can test this on a sample graph:

graph = Graph()

for id in range(5):
  graph.add_vertex(id)

graph.add_edge(0, 1)
graph.add_edge(1, 3)
graph.add_edge(2, 4)
graph.add_edge(3, 4)

graph.dfs(0)

This builds a sample graph then executes DFS starting at vertex 0, outputting:

0
1
3
4
2

The vertices are explored depth-first branch by branch!

Analysis

This DFS implementation visits each vertex and edge once, giving it a runtime of O(V + E) for V vertices and E edges.

The space complexity is O(V) to keep visited vertices in the set.

DFS is less memory intensive than breadth-first search and serves as a building block for many key graph algorithms.

Summary

In this guide, we learned graph terminology and representations before implementing a graph class in Python. Our graph uses a vertices dictionary and edges dictionary to store the nodes and connections.

We wrote add vertex and add edge methods to populate the graph. Then we traversed the graph depth-first recursively using visited sets and vertex coloring.

Key concepts covered include:

This foundational introduction can provide a great starting point for further graph algorithm study. Next steps could include expanding to analyze cycles, finding shortest paths, topological sorting, and minimum spanning trees.

There are also robust graph libraries like NetworkX we can build on in Python once familiar with the core graph concepts.

Graph data structures open up modeling many real-world systems from social networks to web links. Hopefully this guide provided a solid basis for implementing and traversing graphs in your own Python projects.