Skip to content

How to Find the First Non-Repeating Character in a String in Python

Updated: at 05:45 AM

Finding the first non-repeating character in a string is a common technical interview question for Python developers and data scientists. The goal is to write a function that takes a string as input and returns the first character that does not repeat anywhere in the string.

This question tests several skills including string manipulation, hash tables/dicts, set theory, and algorithmic thinking. While it may seem simple, there are multiple ways to approach this problem each with their own time and space complexities.

In this comprehensive guide, we will cover the following topics:

Table of Contents

Open Table of Contents

Problem Statement

Given an input string, find and return the first character that does not repeat anywhere in the string. The input string will only contain lowercase English alphabet letters. If all characters repeat, return None.

Example:

Input: "aabbccddee"
Output: 'c'

Example:

Input: "statistics"
Output: 't'

Brute Force Approach

A brute force approach to solve this problem would check each character in the string against the rest of the string to see if it repeats.

def firstUniqChar(string):
  for i in range(len(string)):
    char = string[i]
    if string.count(char) == 1:
      return char
  return None

We loop through each character in the input string and count occurrences using string.count(char). If count is 1, we return that character. Otherwise, we return None.

Time Complexity: O(N^2) since for each character we count occurrences by traversing whole string.

Space Complexity: O(1) since only additional storage is for loop variables.

This brute force method is quite inefficient as it requires nested looping through the string. We can optimize this using dictionaries.

Approach Using Dictionaries

To improve efficiency, we can store character counts in a dictionary as we traverse the string once. Then loop through again to find the first unique character.

def firstUniqChar(string):
  charCounts = {}

  for char in string:
    if char not in charCounts:
      charCounts[char] = 1
    else:
      charCounts[char] += 1

  for i in range(len(string)):
    char = string[i]
    if charCounts[char] == 1:
      return char

  return None

We store characters and their respective counts in a dictionary. In the second loop, we retrieve counts in constant time and return first character with count 1.

Time Complexity: O(N) since we traverse the string twice but don’t have nested loops.

Space Complexity: O(N) for storing N character counts in the dictionary.

This optimized approach reduces time complexity while requiring more space. Next, we’ll explore a set approach.

Approach Using Sets

We can further optimize this by storing only non-repeating characters in a set, since we only need one of them.

def firstUniqChar(string):

  charsSeen = set()
  uniqueChars = set()

  for char in string:
    if char not in charsSeen:
      uniqueChars.add(char)
      charsSeen.add(char)
    elif char in uniqueChars:
      uniqueChars.remove(char)

  return uniqueChars.pop() if uniqueChars else None

We use one set to track characters seen and another for unique characters. When a repeating character is found, we remove it from the unique set. Finally, we return the popped character from the unique set if it exists.

Time Complexity: O(N) single pass through the string.

Space Complexity: O(N) to store N unique characters in the set.

The set approach optimizes space usage while maintaining O(N) time complexity.

Approach Using collections Counter

Python’s built-in collections.Counter class can also help solve this efficiently:

from collections import Counter

def firstUniqChar(string):
  counts = Counter(string)

  for i, char in enumerate(string):
    if counts[char] == 1:
      return char
  return None

Counter provides character frequencies in constant time. We loop through the string and check counts to find the first unique char.

Time Complexity: O(N) loop through string twice.

Space Complexity: O(N) to store N character counts in Counter.

Leveraging Python’s data structures like Counter can lead to clean, efficient code.

Approach Using List Comprehension

We can also utilize Python’s list comprehension to write a compact one-liner solution:

def firstUniqChar(string):
  return next((char for char in string if string.count(char) == 1), None)

This iterates through the characters to find the first one with count 1 and returns None if none found.

Time Complexity: O(N^2) due to string.count() in each iteration.

Space Complexity: O(1)

While concise, this approach has worst case quadratic time complexity.

Approach Using itertools

Python’s itertools module provides useful generators that can also solve this problem:

from itertools import filterfalse

def firstUniqChar(string):
  charCounts = filterfalse(lambda c: string.count(c) > 1, string)
  return next(charCounts, None)

filterfalse() creates a generator that filters out repeating characters. We return the first item or None.

Time Complexity: O(N)

Space Complexity: O(1)

This approach efficiently leverages Python generators without extra storage.

Performance Analysis

Here is a summary of the time and space complexities of each approach:

ApproachTime ComplexitySpace Complexity
Brute ForceO(N^2)O(1)
DictionaryO(N)O(N)
SetO(N)O(N)
CounterO(N)O(N)
List ComprehensionO(N^2)O(1)
itertoolsO(N)O(1)

The dictionary, set, Counter, and itertools solutions have the optimal O(N) time complexity.

Set and itertools approaches are most space efficient using O(N) and O(1) extra space respectively.

Examples and Sample Code

Here are some examples illustrating how to call the different implementations discussed:

Dictionary Approach

print(firstUniqChar("aabbccddee")) # c
print(firstUniqChar("statistics")) # t

Set Approach

print(firstUniqChar("aabbccddee")) # c
print(firstUniqChar("statistics")) # t

Counter Approach

from collections import Counter

print(firstUniqChar("aabbccddee")) # c
print(firstUniqChar("statistics")) # t

List Comprehension Approach

print(firstUniqChar("aabbccddee")) # c
print(firstUniqChar("statistics")) # t

itertools Approach

from itertools import filterfalse

print(firstUniqChar("aabbccddee")) # c
print(firstUniqChar("statistics")) # t

In summary, the dictionary, set, Counter, and itertools solutions provide the optimal balance of time and space complexities for this problem.

Practical Applications

This pattern of finding the first unique element has several practical applications:

For example, to analyze word frequencies in a large text corpus, we would need to identify the unique words efficiently similar to the techniques discussed.

Suggestions for Python Developers

Here are some key takeaways for Python developers and data scientists:

This question provides great practice applying core Python, algorithms, and problem solving principles. Hope this comprehensive guide helps you master finding first non-repeating characters in Python!