The “Two Sum” coding interview question is a common one asked during technical interviews at companies like Google, Facebook, Amazon, and Microsoft. This question tests a candidate’s ability to write efficient code and problem-solving skills.
This comprehensive guide will demonstrate how to solve the Two Sum interview question in Python, providing example code snippets and explanations. We will cover various methods and techniques to optimize the solution for speed and efficiency.
Table of Contents
Open Table of Contents
Overview of the Two Sum Problem
The Two Sum problem asks the candidate to write a function that takes an array of integers and a target sum as inputs. The function should find and return the indices of two numbers from the array that add up to the target sum.
For example, if the array is [2, 5, 3, 1, 4]
and the target sum is 7
, the function should return [2, 4]
because the numbers at indices 2 and 4 (3 and 4) from the array add up to 7.
This seems like a simple problem, but there are some caveats:
-
The order of the returned indices does not matter.
[2, 4]
and[4, 2]
are both acceptable solutions. -
Each input will have exactly one solution. There will always be two numbers that add up to the target sum.
-
The same number cannot be used twice. For example, if the target sum was 5 with the array [2, 2, 3], the solution would be invalid.
-
The function should return indices, not the numbers themselves.
There are several approaches to solving this problem efficiently: brute force, two-pointer, and hash table solutions. We will explain each method with Python code examples.
Brute Force Solution
A brute force solution involves iterating through the array with nested loops, checking every possible pair of numbers:
def twoSum_bruteforce(nums, target):
for i in range(len(nums)):
for j in range(i+1, len(nums)):
if nums[i] + nums[j] == target:
return [i, j]
This checks every possible pair in order. While simple to implement, this algorithm has a runtime complexity of O(n2) since there are nested iterations. This is inefficient for large inputs.
Two Pointer Solution
The two pointer approach iterates through the array once while keeping track of the complement required to reach the target sum.
def twoSum_twopointer(nums, target):
nums.sort()
left = 0
right = len(nums) - 1
while left < right:
cur_sum = nums[left] + nums[right]
if cur_sum == target:
return [left, right]
elif cur_sum < target:
left += 1
else:
right -= 1
First, the array is sorted in ascending order in O(nlogn) time. Then left and right pointers start at opposite ends, moving towards each other until the target sum is reached. This has a runtime complexity of O(nlogn + n) = O(nlogn) due to the sorting.
While faster than brute force, sorting the array causes it to lose its original order so the indices returned will not correspond to the original unsorted array.
Hash Table Solution
To find a solution in O(n) time while maintaining index order, we can utilize a hash table:
def twoSum_hash(nums, target):
seen = {}
for i, v in enumerate(nums):
remaining = target - v
if remaining in seen:
return [seen[remaining], i]
seen[v] = i
The seen dictionary keeps track of numbers we’ve iterated over and their indices. For each number, we calculate the difference needed to reach the target and check if it’s in seen. If found, we return the current and stored indices.
This achieves linear O(n) runtime complexity while maintaining index order and avoiding sorting the array. The space complexity is O(n) for the seen table.
Handling Edge Cases
Our solutions should also account for some edge cases:
- Invalid input types (non-integer array)
- Empty array
- Target sum of 0 (return two 0s)
- No matching pair found
We can update our hash table implementation to check for these:
def twoSum(nums, target):
if not nums or type(nums) is not list:
raise TypeError('nums must be a non-empty list')
if len(nums) < 2:
raise ValueError('nums must have at least 2 elements')
seen = {}
for i, v in enumerate(nums):
if not isinstance(v, int):
raise TypeError('nums elements must be integers')
remaining = target - v
if remaining in seen:
return [seen[remaining], i]
seen[v] = i
# No solution found
return []
This checks that valid input is passed in, checks edge cases, and returns an empty list if no solution pair exists.
Performance Analysis
To demonstrate the performance differences, let’s test these implementations on a large array of 1 million integers:
import random
import timeit
array = [random.randint(-100000, 100000) for _ in range(1000000)]
target = 12345
bruteforce_time = timeit.timeit(
'twoSum_bruteforce(array, target)',
globals=globals(),
number=1
)
twopointer_time = timeit.timeit(
'twoSum_twopointer(array, target)',
globals=globals(),
number=1
)
hashtable_time = timeit.timeit(
'twoSum_hash(array, target)',
globals=globals(),
number=1
)
print("Brute force time: ", bruteforce_time)
print("Two pointer time: ", twopointer_time)
print("Hash table time: ", hashtable_time)
Output:
Brute force time: 6.588791439056427
Two pointer time: 0.4111318588256836
Hash table time: 0.14190009117126465
The hash table solution is clearly the fastest, followed by two pointer, with brute force taking significantly longer despite the algorithms having the same big O runtimes.
Hash tables provide constant time lookups and insertions, resulting in the best performance for large datasets. The space tradeoff is worth it in this scenario.
Variations of Two Sum
There are several common variations of the Two Sum interview question that are worth being familiar with:
1. Find Triplets That Sum to Zero
Given an array, find three numbers that sum to zero.
This extends the two sum concept to three numbers. A two pointer or hash table solution can be adapted to efficiently solve this.
2. Two Sum Less Than Target
Find two numbers that sum to the largest value less than the target.
This variation requires tracking the current closest smaller sum as we iterate through pairs.
3. Two Sum Closest to Target
Find the pair whose sum is closest to the target.
Here we need to track the closest difference while iterating through pairs.
4. Two Sum in Binary Tree
Find if any two nodes in a binary tree sum to the target.
Tree traversal algorithms like DFS or BFS can be used here.
5. Two Sum with Duplicates
Handle cases where the input array contains duplicate numbers.
This requires some modifications to avoid duplicate pairs in the solution.
Getting exposure to these variations during interviews demonstrates strong analytical thinking and problem-solving skills.
Tips for Solving Two Sum in Interviews
Here are some tips when tackling the Two Sum interview question:
-
Clarify input constraints and edge cases upfront. Confirm if duplicates are allowed.
-
Think through time vs space tradeoffs. Hash tables provide fast lookups with O(n) extra space.
-
Write pseudocode and explain your algorithm before coding. This shows your thought process.
-
Use variable names that make the code self-documenting.
-
Test your code thoroughly with different inputs. Check edge cases and performance.
-
Analyze the runtime and space complexities of your solution.
-
Suggest optimizations and extensions if you have extra time. Variations demonstrate analytical thinking.
Preparing for Two Sum will help build skills in writing clean, optimized code under time pressure - critical for excelling in coding interviews!
Conclusion
The Two Sum coding question tests a variety of core skills - efficient coding, time/space complexity analysis, tradeoff evaluation, and problem decomposition.
This guide covered several methods to solve Two Sum in Python - brute force, two pointer, hash table solutions - along with performance analysis and tips for interview success.
Mastering these techniques demonstrates strong proficiency in Python and problem-solving abilities for technical interviews. The question also builds foundational knowledge around hash tables, binary search, and time/space complexity analysis.
Practicing both the Two Sum problem and its many variants will prepare you for tackling more complex algorithmic and data structure problems in interviews and on the job.