Skip to content

Solving the Two Sum Interview Question in Python

Updated: at 03:23 AM

The “Two Sum” coding interview question is a common one asked during technical interviews at companies like Google, Facebook, Amazon, and Microsoft. This question tests a candidate’s ability to write efficient code and problem-solving skills.

This comprehensive guide will demonstrate how to solve the Two Sum interview question in Python, providing example code snippets and explanations. We will cover various methods and techniques to optimize the solution for speed and efficiency.

Table of Contents

Open Table of Contents

Overview of the Two Sum Problem

The Two Sum problem asks the candidate to write a function that takes an array of integers and a target sum as inputs. The function should find and return the indices of two numbers from the array that add up to the target sum.

For example, if the array is [2, 5, 3, 1, 4] and the target sum is 7, the function should return [2, 4] because the numbers at indices 2 and 4 (3 and 4) from the array add up to 7.

This seems like a simple problem, but there are some caveats:

There are several approaches to solving this problem efficiently: brute force, two-pointer, and hash table solutions. We will explain each method with Python code examples.

Brute Force Solution

A brute force solution involves iterating through the array with nested loops, checking every possible pair of numbers:

def twoSum_bruteforce(nums, target):
    for i in range(len(nums)):
        for j in range(i+1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]

This checks every possible pair in order. While simple to implement, this algorithm has a runtime complexity of O(n2) since there are nested iterations. This is inefficient for large inputs.

Two Pointer Solution

The two pointer approach iterates through the array once while keeping track of the complement required to reach the target sum.

def twoSum_twopointer(nums, target):
    nums.sort()
    left = 0
    right = len(nums) - 1

    while left < right:
        cur_sum = nums[left] + nums[right]
        if cur_sum == target:
            return [left, right]
        elif cur_sum < target:
            left += 1
        else:
            right -= 1

First, the array is sorted in ascending order in O(nlogn) time. Then left and right pointers start at opposite ends, moving towards each other until the target sum is reached. This has a runtime complexity of O(nlogn + n) = O(nlogn) due to the sorting.

While faster than brute force, sorting the array causes it to lose its original order so the indices returned will not correspond to the original unsorted array.

Hash Table Solution

To find a solution in O(n) time while maintaining index order, we can utilize a hash table:

def twoSum_hash(nums, target):
    seen = {}

    for i, v in enumerate(nums):
        remaining = target - v
        if remaining in seen:
            return [seen[remaining], i]
        seen[v] = i

The seen dictionary keeps track of numbers we’ve iterated over and their indices. For each number, we calculate the difference needed to reach the target and check if it’s in seen. If found, we return the current and stored indices.

This achieves linear O(n) runtime complexity while maintaining index order and avoiding sorting the array. The space complexity is O(n) for the seen table.

Handling Edge Cases

Our solutions should also account for some edge cases:

We can update our hash table implementation to check for these:

def twoSum(nums, target):
    if not nums or type(nums) is not list:
        raise TypeError('nums must be a non-empty list')

    if len(nums) < 2:
        raise ValueError('nums must have at least 2 elements')

    seen = {}
    for i, v in enumerate(nums):
        if not isinstance(v, int):
            raise TypeError('nums elements must be integers')

        remaining = target - v

        if remaining in seen:
            return [seen[remaining], i]
        seen[v] = i

    # No solution found
    return []

This checks that valid input is passed in, checks edge cases, and returns an empty list if no solution pair exists.

Performance Analysis

To demonstrate the performance differences, let’s test these implementations on a large array of 1 million integers:

import random
import timeit

array = [random.randint(-100000, 100000) for _ in range(1000000)]
target = 12345

bruteforce_time = timeit.timeit(
    'twoSum_bruteforce(array, target)',
    globals=globals(),
    number=1
)

twopointer_time = timeit.timeit(
    'twoSum_twopointer(array, target)',
    globals=globals(),
    number=1
)

hashtable_time = timeit.timeit(
    'twoSum_hash(array, target)',
    globals=globals(),
    number=1
)

print("Brute force time: ", bruteforce_time)
print("Two pointer time: ", twopointer_time)
print("Hash table time: ", hashtable_time)

Output:

Brute force time:  6.588791439056427
Two pointer time: 0.4111318588256836
Hash table time: 0.14190009117126465

The hash table solution is clearly the fastest, followed by two pointer, with brute force taking significantly longer despite the algorithms having the same big O runtimes.

Hash tables provide constant time lookups and insertions, resulting in the best performance for large datasets. The space tradeoff is worth it in this scenario.

Variations of Two Sum

There are several common variations of the Two Sum interview question that are worth being familiar with:

1. Find Triplets That Sum to Zero

Given an array, find three numbers that sum to zero.

This extends the two sum concept to three numbers. A two pointer or hash table solution can be adapted to efficiently solve this.

2. Two Sum Less Than Target

Find two numbers that sum to the largest value less than the target.

This variation requires tracking the current closest smaller sum as we iterate through pairs.

3. Two Sum Closest to Target

Find the pair whose sum is closest to the target.

Here we need to track the closest difference while iterating through pairs.

4. Two Sum in Binary Tree

Find if any two nodes in a binary tree sum to the target.

Tree traversal algorithms like DFS or BFS can be used here.

5. Two Sum with Duplicates

Handle cases where the input array contains duplicate numbers.

This requires some modifications to avoid duplicate pairs in the solution.

Getting exposure to these variations during interviews demonstrates strong analytical thinking and problem-solving skills.

Tips for Solving Two Sum in Interviews

Here are some tips when tackling the Two Sum interview question:

Preparing for Two Sum will help build skills in writing clean, optimized code under time pressure - critical for excelling in coding interviews!

Conclusion

The Two Sum coding question tests a variety of core skills - efficient coding, time/space complexity analysis, tradeoff evaluation, and problem decomposition.

This guide covered several methods to solve Two Sum in Python - brute force, two pointer, hash table solutions - along with performance analysis and tips for interview success.

Mastering these techniques demonstrates strong proficiency in Python and problem-solving abilities for technical interviews. The question also builds foundational knowledge around hash tables, binary search, and time/space complexity analysis.

Practicing both the Two Sum problem and its many variants will prepare you for tackling more complex algorithmic and data structure problems in interviews and on the job.