Finding Lowest Common Ancestor in Binary Tree Python Guide

The lowest common ancestor (LCA) of two nodes in a binary tree is the lowest (deepest) node that has both nodes as descendants. Finding the LCA of two nodes has many practical applications in computer science, including optimizing network routing and analyzing hierarchical data structures. This comprehensive guide will explain the concept of the lowest common ancestor in binary trees and provide Python code examples to implement solutions for finding the LCA.

Open Table of Contents

Overview of Binary Trees
Lowest Common Ancestor Concept
Naive Approach in Python
Efficient Approach with Tree Traversal
Handling When LCA is One of the Nodes
LCA When Nodes are Not Present
Finding Distance from LCA
Summary

Overview of Binary Trees

A binary tree is a tree data structure where each node has at most two child nodes referred to as the left child and right child. The top-most node in the tree is called the root. In Python, a binary tree can be represented as a class with attributes for the data value, left child node, and right child node:

class Node:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None

We can construct a sample binary tree structure:

To traverse and search binary trees in Python, we can use recursive functions to traverse left and right child nodes. Common types of tree traversals include preorder, inorder, and postorder traversal.

Lowest Common Ancestor Concept

The lowest common ancestor (LCA) of two nodes n1 and n2 is defined as the lowest (deepest) node that has both n1 and n2 as descendants. Some key properties of the LCA include:

The LCA of n1 and n2 will be an ancestor of both n1 and n2 in the tree.
The LCA will have a path to both n1 and n2.
The LCA will be the deepest possible node that satisfies the above two properties.

To find the LCA of two nodes in a binary tree, we need to leverage tree traversal methods to search possible ancestor nodes while keeping track of the deepest or lowest matching ancestor.

Naive Approach in Python

A simple way to find the LCA is to traverse the binary tree starting from the root node. Check if either of the two given nodes n1 and n2 matches the current node. If there is a match, return the current node as the LCA.

Here is an example Python implementation:

# Function to find LCA of n1 and n2
def findLCA(root, n1, n2):
    # Base case
    if root is None:
        return None

    # Check if current node is n1 or n2
    if root.val == n1 or root.val == n2:
        return root

    # Recursively search left and right subtrees
    left = findLCA(root.left, n1, n2)
    right = findLCA(root.right, n1, n2)

    # If left and right both not null, current node is LCA
    if left and right:
        return root

    # Otherwise, return left or right subtree result
    return left if left else right

This naive approach works but is not optimal, with a worst-case time complexity of O(N) to search the full tree where N is the number of nodes. We can optimize this further.

Efficient Approach with Tree Traversal

A more efficient solution is to traverse the binary tree using tree traversal methods like inorder, preorder or postorder traversal. While traversing, if we come across the two given nodes, the current node must be the LCA.

We can stop traversing further down a subtree if both nodes are found as descendants. This prunes off portions of the tree from our search, improving efficiency.

Here is an example preorder traversal based Python implementation:

# Function to find LCA using preorder traversal
def findLCA(root, n1, n2):

  # Base case
  if root is None:
    return None

  # If either n1 or n2 matches current node, return the node
  if root.val == n1 or root.val == n2:
    return root

  # Recursively traverse left and right subtrees
  left_lca = findLCA(root.left, n1, n2)
  right_lca = findLCA(root.right, n1, n2)

  # If both recurse calls return a node, current node is LCA
  if left_lca and right_lca:
    return root

  # Otherwise, return the non-null left or right LCA node
  return left_lca if left_lca else right_lca

The time complexity is reduced to O(H) where H is the height of the binary tree. For a balanced tree, this is O(logN) which is much more efficient.

We traverse down until we find one of the nodes, then continue traversing to find the second node. The first match of the second node will be the lowest common ancestor.

Handling When LCA is One of the Nodes

In the above solutions, we assumed the LCA node itself cannot be equal to either n1 or n2. We need some small modifications to handle when the LCA is one of the nodes itself.

We can track if we have found n1 and n2 while traversing the tree. If both are found, and the current node is one of the two nodes, this node must be the LCA node.

# Track if n1 and n2 are found
n1_found = False
n2_found = False

# Modified preorder traversal based function
def findLCA(root, n1, n2):

  nonlocal n1_found, n2_found

  if root is None:
    return None

  # If current node is n1 or n2, mark as found
  if root.val == n1:
    n1_found = True
    return root if n2_found else None
  elif root.val == n2:
    n2_found = True
    return root if n1_found else None

  left_lca = findLCA(root.left, n1, n2)
  right_lca = findLCA(root.right, n1, n2)

  # If both nodes found, return current node
  if n1_found and n2_found or left_lca and right_lca:
    return root

  # Else return non-null left or right LCA
  return left_lca if left_lca else right_lca

This correctly handles when the LCA node is one of n1 or n2 itself.

LCA When Nodes are Not Present

To handle when one or both the nodes are not present in the tree, we can return None in those cases:

def findLCA(root, n1, n2):

  if root is None:
    return None

  # If n1 or n2 not found yet, recurse
  if not n1_found:
    left_lca = findLCA(root.left, n1, n2)
  if not n2_found:
    right_lca = findLCA(root.right, n1, n2)

  # If both nodes not present, return None
  if not n1_found and not n2_found:
    return None

  # Else return current node as LCA
  return root

This checks if n1 and n2 are found before recursing into subtrees. If any node is not found after full traversal, we can conclude that node is not present and return None.

Finding Distance from LCA

Once we find the LCA node, we can also find the distance of n1 and n2 from the LCA.

We traverse back up from n1 and n2 tracking the depth/distance. The sum of depths of n1 and n2 from LCA will be the distance between n1 and n2.

Here is an example implementation:

# Function to find distance of n1 and n2 from the LCA node
def findDistance(lca, n1, n2):

  d1 = getDepth(lca, n1)
  d2 = getDepth(lca, n2)

  return d1 + d2

# Get depth of node from LCA
def getDepth(lca, node):

  depth = 0
  while node != lca:
    depth += 1
    node = node.parent

  return depth

This allows us to calculate the distance between two nodes after finding the LCA node.

Summary

In this guide, we looked at different methods to find the lowest common ancestor of two given nodes in a binary tree using Python:

The naive approach traverses recursively checking each node as a potential LCA. This has a worst-case time complexity of O(N).
An efficient solution is to use tree traversal methods like preorder, inorder or postorder traversal. We traverse until both nodes are found, and the first match of the second node will be the LCA in O(H) time.
We handled cases where the LCA node itself is one of the two given nodes.
To handle nodes not being present, we can return None when either node is not found after full traversal.
Once the LCA node is found, we can also calculate the distance between n1 and n2 by summing their depths from the LCA node.

The lowest common ancestor is a fundamental concept for navigating and analyzing hierarchical tree structures with many useful applications in graph theory, networks, and bioinformatics. This guide provided various Python implementations to find the LCA in different scenarios.