Sets are an important data type in Python that can help optimize code and solve certain problems where regular lists are not as well suited. A set is an unordered collection of unique elements that provides high-speed lookup, removal, and membership testing. Sets can be created in two ways: using curly braces {} or the built-in set() constructor.
This comprehensive guide will explain what sets are, why they are useful, and provide step-by-step instructions on how to create, initialize, and manipulate sets using clear examples. We will cover creating empty sets, initializing sets with values, set operations like union, intersection, difference, and symmetric difference, along with common methods and functions like adding and removing elements from a set.
Table of Contents
- What is a Set in Python?
- Creating a Set Using Curly Braces
- Creating a Set Using the set() Constructor
- Initializing an Empty Set
- Adding Elements to a Set
- Removing Elements from a Set
- Set Union
- Set Intersection
- Set Difference
- Set Symmetric Difference
- Checking Subsets and Supersets
- Set Methods Summary
- Set Membership Testing
- Unhashable Elements Cannot be Stored in Sets
- Sets vs Lists and Tuples
- Conclusion
What is a Set in Python?
A set in Python is a data structure that contains an unordered collection of unique, hashable objects. The major characteristics of a set are:
- Sets are unordered - elements do not have an index.
- Set elements must be unique. Duplicate elements are automatically removed.
- Sets are mutable - elements can be added or removed after creation.
- Sets can be used to perform mathematical set operations like union, intersection, symmetric difference, etc.
Sets are useful when the presence of an object matters, but the order does not. They provide fast membership testing with the in and not in operators and are a convenient, efficient way to remove duplicate elements from a sequence.
Some common uses of sets in Python include:
- Removing duplicates from a list.
- Membership testing - checking if an element is part of a set.
- Mathematical operations like intersections, unions, differences etc.
- Fast lookups even with large data sets.
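As a small sketch of the first use case above, converting a list to a set drops the duplicates. The sample data and variable names are hypothetical. Because a set does not keep order, the second half shows dict.fromkeys() as one possible way to deduplicate while keeping first-seen order.

```python
# Hypothetical sample data for illustration
emails = ["a@example.com", "b@example.com", "a@example.com", "c@example.com"]

# Converting to a set removes duplicates, but the resulting order is not guaranteed
unique_emails = set(emails)
print(len(unique_emails))  # 3

# dict.fromkeys() keeps first-seen order (dicts preserve insertion order in Python 3.7+)
ordered_unique = list(dict.fromkeys(emails))
print(ordered_unique)  # ['a@example.com', 'b@example.com', 'c@example.com']
```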
Creating a Set Using Curly Braces
The simplest way to create a set in Python is to use curly braces {}. Elements within the curly braces will make up the set.
For example:
numbers = {1, 2, 3, 4}
print(numbers)
# Output: {1, 2, 3, 4}
This creates a set named numbers containing the elements 1, 2, 3, and 4.
Note that a set is unordered, so the elements may be printed in an arbitrary order, not necessarily the order in which they were written. Also, duplicate elements are automatically removed:
numbers = {1, 2, 1, 3, 4}
print(numbers)
# Output: {1, 2, 3, 4}
Empty curly braces, however, do not create an empty set:
empty_set = {} # Does NOT create a set; this is an empty dict
empty_set = set() # Correct method to initialize an empty set
Important: Using {} alone will create an empty dictionary in Python, not an empty set. To create an empty set, use set().
Curly braces can be used to create sets of any hashable Python data types, such as integers, floats, strings, and tuples. For example:
# Set of strings
languages = {"Python", "Java", "C++"}
# Set of integers
nums = {1, 2, 3, 4, 5}
# Set of tuples
tuples_set = {(1,2), (3,4), (5,6)}
# Set cannot contain mutable objects like lists
# This will cause a TypeError
# list_set = {[1,2], [3,4], [5,6]}
Creating a Set Using the set() Constructor
The set() constructor can also be used to create Python sets. The set() constructor takes an iterable object as input and creates a set out of it.
For example:
numbers = set([1,1,2,3,4])
print(numbers)
# Output: {1, 2, 3, 4}
The set() constructor removes any duplicate elements from the iterable object.
We can also pass a string to set() to create a set of unique characters:
chars = set("HelloWorld")
print(chars)
# Output (order may vary): {'W', 'o', 'r', 'H', 'd', 'l', 'e'}
The set() constructor can take any iterable object like lists, tuples, dictionaries, strings etc. as input. However, every element of the iterable must be hashable, so an iterable of lists or dictionaries raises a TypeError.
# Set from a list
set1 = set([1,2,3,4])
# Set from tuple
set2 = set((5,6,7,8))
# Set from dictionary keys
set3 = set({9: 'Nine', 10: 'Ten'})
# Cannot pass list or dictionary as
# elements since they are mutable
set4 = set([[1,2], [3,4]]) # TypeError
set5 = set({[5,6], [7,8]}) # TypeError
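To illustrate that set() accepts any iterable, here is a brief sketch using a range, a generator expression, and a dictionary's values. The prices dictionary is made-up data. Results are compared with == rather than printed directly, since the display order of a set is not guaranteed.

```python
# Any iterable works, including ranges and generator expressions
evens = set(range(0, 10, 2))
print(evens == {0, 2, 4, 6, 8})  # True

squares = set(n * n for n in [-2, -1, 0, 1, 2])
print(squares == {0, 1, 4})  # True: duplicates such as 4 and 1 collapse

# Passing a dict uses its keys; use .values() to build a set of values instead
prices = {"apple": 3, "pear": 3, "plum": 5}  # hypothetical data
print(set(prices) == {"apple", "pear", "plum"})  # True
print(set(prices.values()) == {3, 5})  # True
```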
Initializing an Empty Set
We’ve seen that {} alone won’t create an empty set in Python. The correct method is to call the set() constructor without any arguments to explicitly initialize an empty set.
empty_set = set()
print(type(empty_set))
# Output: <class 'set'>
This initializes an empty set object we can add elements to later.
A common mistake is to assign the set class itself without calling it:
empty_set = set
print(type(empty_set))
# Output: <class 'type'>
This binds the name empty_set to the set class, not to an empty set. To get an actual empty set, the class must be called with parentheses:
empty_set = set()
print(type(empty_set))
# Output: <class 'set'>
In summary, only set() with parentheses creates an empty set. The bare name set refers to the class, and {} creates an empty dictionary.
Adding Elements to a Set
We can add new elements to a set using the add() method. For example:
numbers = {1, 2, 3}
numbers.add(4)
print(numbers)
# Output: {1, 2, 3, 4}
The add() method inserts the new element into the set. An element is only added if it is not already present; adding a duplicate has no effect.
We can also add multiple elements using the update() method:
numbers = {1, 2, 3}
numbers.update([3, 4, 5, 6])
print(numbers)
# Output: {1, 2, 3, 4, 5, 6}
The update() method takes any iterable object and adds each element to the set if unique.
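One detail worth seeing side by side is how add() and update() treat a string. The sketch below (with illustrative variable names) shows that add() stores the string as a single element, while update() iterates over it and adds each character.

```python
tags = {"python"}
tags.add("sets")  # adds the whole string as one element
print(tags == {"python", "sets"})  # True

letters = set()
letters.update("abc")  # iterates the string and adds each character
print(letters == {"a", "b", "c"})  # True

# update() accepts several iterables at once
letters.update(["d"], ("e", "f"))
print(len(letters))  # 6

# Adding an element that is already present leaves the set unchanged
tags.add("python")
print(len(tags))  # 2
```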
Removing Elements from a Set
To remove an element from a set, use the remove() method:
numbers = {1, 2, 3, 4}
numbers.remove(3)
print(numbers)
# Output: {1, 2, 4}
If the element doesn’t exist, remove() will raise a KeyError.
We can also discard an element using the discard() method:
numbers = {1, 2, 3, 4}
numbers.discard(5) # Doesn't raise error
print(numbers)
# Output: {1, 2, 3, 4}
If the discarded element does not exist in the set, discard() will NOT raise any errors.
To remove and return an arbitrary element, use the pop() method:
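The difference between the two methods can be sketched as follows: remove() needs a try/except (or a membership check) when the element might be missing, while discard() can be called unconditionally.

```python
numbers = {1, 2, 3, 4}

try:
    numbers.remove(5)  # 5 is not in the set
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 5

numbers.discard(5)  # silently does nothing
print(numbers == {1, 2, 3, 4})  # True

# Checking membership first is another way to avoid the exception
if 3 in numbers:
    numbers.remove(3)
print(numbers == {1, 2, 4})  # True
```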
numbers = {1, 3, 5}
print(numbers.pop())
print(numbers)
# Possible output (the element removed is arbitrary):
# 1
# {3, 5}
pop() removes and returns an arbitrary element from the set. Sets are unordered, so we should not rely on which element gets removed. Calling pop() on an empty set raises a KeyError.
To clear all elements from a set at once, use the clear() method:
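A typical pattern for pop() is draining a set of pending items when processing order does not matter. This is a minimal sketch with hypothetical item names. It also shows that popping from an empty set raises KeyError.

```python
pending = {"a", "b", "c"}  # hypothetical work items
processed = []

while pending:  # an empty set is falsy, so the loop stops when it is drained
    item = pending.pop()  # which item comes out is not specified
    processed.append(item)

print(sorted(processed))  # ['a', 'b', 'c']

try:
    pending.pop()  # popping from an empty set
except KeyError:
    print("set is empty")
```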
numbers = {1, 2, 3}
numbers.clear()
print(numbers)
# Output: set()
Set Union
To find the union of two or more sets, use the | operator or union() method:
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
# Using | operator
print(A | B) # {1, 2, 3, 4, 5, 6}
# Using union() method
print(A.union(B)) # {1, 2, 3, 4, 5, 6}
Union returns a new set containing all the distinct elements from both sets, with duplicate elements removed.
The update() method can also be used to compute the union in place, modifying the original set instead of returning a new one:
A = {1, 2, 3}
B = {3, 4, 5}
A.update(B)
print(A)
# Output: {1, 2, 3, 4, 5}
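The sketch below extends the union examples to more than two sets and highlights one difference between the method and the operator: union() accepts any iterable, whereas | requires a set on both sides. It also shows |= as the in-place operator form of update().

```python
A = {1, 2}
B = {2, 3}
C = {3, 4}

print(A.union(B, C) == {1, 2, 3, 4})  # True
print((A | B | C) == {1, 2, 3, 4})  # True

# The method accepts any iterable; the operator requires sets on both sides
print(A.union([5, 6]) == {1, 2, 5, 6})  # True
try:
    A | [5, 6]
except TypeError:
    print("| needs a set on both sides")

# |= is the in-place form, equivalent to update()
A |= B
print(A == {1, 2, 3})  # True
```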
Set Intersection
To find the common elements present in two or more sets, use the & operator or intersection() method:
A = {1, 2, 3, 4}
B = {2, 4, 5, 6}
# Using & operator
print(A & B) # {2, 4}
# Using intersection() method
print(A.intersection(B)) # {2, 4}
Intersection returns a new set containing only the common elements from both sets.
We can also find the intersection without creating a new set using intersection_update():
A = {1, 2, 3}
B = {2, 4, 5}
A.intersection_update(B)
print(A)
# Output: {2}
intersection_update() modifies the original set to keep only the common elements.
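As a slightly more practical sketch, intersection can find what several collections have in common. The skill sets below are invented for illustration. The example also uses isdisjoint(), which reports whether two sets share no elements, and &=, the operator form of intersection_update().

```python
alice = {"python", "sql", "git"}  # hypothetical skill sets
bob = {"python", "git", "docker"}
carol = {"git", "python", "rust"}

shared = alice & bob & carol
print(shared == {"git", "python"})  # True
print(alice.intersection(bob, carol) == shared)  # True

# isdisjoint() is True when the intersection would be empty
print(alice.isdisjoint({"cobol"}))  # True

alice &= bob  # in-place form, same effect as intersection_update()
print(alice == {"python", "git"})  # True
```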
Set Difference
To find the difference between two sets, use the - operator or difference() method:
A = {1, 2, 3, 4}
B = {2, 4, 5, 6}
# Using - operator
print(A - B) # {1, 3}
# Using difference() method
print(A.difference(B)) # {1, 3}
Difference returns a new set containing elements that are in the first set but NOT in the second set.
We can also find the difference in-place using difference_update():
A = {1, 2, 3}
B = {2, 4, 5}
A.difference_update(B)
print(A)
# Output: {1, 3}
difference_update() modifies the original set by removing elements also found in the second set.
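Unlike union and intersection, difference is not symmetric: A - B and B - A generally give different results. The sketch below shows this, along with a hypothetical use case of finding which required fields are missing from some input.

```python
A = {1, 2, 3, 4}
B = {2, 4, 5, 6}

print(A - B == {1, 3})  # True
print(B - A == {5, 6})  # True: the order of operands matters

# difference() accepts several iterables and removes the elements of all of them
print(A.difference({1}, {2}) == {3, 4})  # True

# Hypothetical example: which required fields were not provided?
required = {"name", "email", "age"}
provided = {"name", "age", "nickname"}
missing = required - provided
print(missing)  # {'email'}
```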
Set Symmetric Difference
To find elements present in either set but NOT in both, use the ^ operator or symmetric_difference() method:
A = {1, 2, 3, 4}
B = {2, 4, 5, 6}
# Using ^ operator
print(A ^ B) # {1, 3, 5, 6}
# Using symmetric_difference() method
print(A.symmetric_difference(B)) # {1, 3, 5, 6}
Symmetric difference returns elements exclusive to each set in a new set.
We can also use symmetric_difference_update() to modify the original set:
A = {1, 2, 3}
B = {2, 4, 5}
A.symmetric_difference_update(B)
print(A)
# Output: {1, 3, 4, 5}
This updates A with elements exclusive to either A or B but not both.
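The definition above can be checked directly against the other set operations. This short sketch verifies that the symmetric difference equals the union minus the intersection, and also the union of the two one-sided differences.

```python
A = {1, 2, 3, 4}
B = {2, 4, 5, 6}

sym = A ^ B
print(sym == (A | B) - (A & B))  # True
print(sym == (A - B) | (B - A))  # True

# Unlike plain difference, the operation is symmetric
print((A ^ B) == (B ^ A))  # True
```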
Checking Subsets and Supersets
To check if a set is a subset or superset of another set, use the <= and >= operators:
A = {1, 2}
B = {1, 2, 3}
print(A <= B) # True
print(B <= A) # False
print(B >= A) # True
print(A >= B) # False
A <= B checks if A is a subset of B while B >= A checks if B is a superset of A.
We can also use the issuperset() and issubset() methods:
{1, 2}.issubset({1, 2, 3}) # True
{1, 2, 3}.issuperset({1, 2}) # True
Unlike the operators, these methods accept any iterable as their argument, so the other collection does not have to be converted to a set first.
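In addition to <= and >=, Python supports the strict forms < and >, which test for a proper subset or superset (the two sets must not be equal). The sketch below contrasts them and shows the methods being called with a list and a range.

```python
A = {1, 2}
B = {1, 2, 3}

print(A < B)  # True: A is a proper subset of B
print(A <= A)  # True: every set is a subset of itself
print(A < A)  # False: a set is not a proper subset of itself

# The methods accept any iterable, not just sets
print(A.issubset([1, 2, 3]))  # True
print(B.issuperset(range(1, 3)))  # True
```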
Set Methods Summary
Here is a quick summary of common Python set methods:
Method | Description |
---|---|
add() | Adds an element to the set |
clear() | Removes all elements from the set |
copy() | Creates a shallow copy of the set |
difference() | Returns difference between two sets |
difference_update() | Removes elements present in another set |
discard() | Removes an element if present, no error |
intersection() | Returns intersection of two sets |
intersection_update() | Keeps only common elements |
isdisjoint() | Returns True if two sets have no overlap |
issubset() | Returns True if set is subset of another set |
issuperset() | Returns True if set is superset of another set |
pop() | Removes and returns an arbitrary element, raises KeyError if the set is empty |
remove() | Removes specified element, raises KeyError if not present |
symmetric_difference() | Returns elements exclusive to each set |
symmetric_difference_update() | Keeps elements exclusive to either set |
union() | Returns union of two sets |
update() | Adds elements from another set or any other iterable |
This summarizes the common methods to modify sets and compare them mathematically.
Set Membership Testing
Sets provide fast membership testing with the in and not in operators. For example:
numbers = {1, 2, 3}
print(1 in numbers) # True
print(4 not in numbers) # True
This allows quick checking if an element is contained in a set without searching the entire data structure. Sets utilize hashing under the hood to enable fast membership tests.
Membership tests can be used alongside set comparisons such as subset checks:
A = {1, 2}
B = {1, 2, 3}
print(A <= B) # True
print(4 not in B) # True
Overall, membership testing makes sets ideal for fast data searches compared to lists or tuples.
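To get a rough feel for the difference, the sketch below times the same membership test against a list and a set using the standard timeit module. The sizes and repeat count are arbitrary choices, and the absolute timings will vary by machine, so no specific numbers are shown.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # last element: the worst case for a linear scan of the list

# Each call repeats the membership test 200 times and returns the total seconds
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.5f}s  set: {set_time:.5f}s")
```

The list lookup scans elements one at a time, while the set lookup hashes the value and jumps to its slot, so the gap widens as the collection grows.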
Unhashable Elements Cannot be Stored in Sets
Sets in Python can only contain hashable elements. A hashable object has a hash value that never changes during its lifetime. Immutable built-in objects like integers, floats, and strings are hashable, and a tuple is hashable as long as every element it contains is hashable.
However, mutable built-in containers like lists, dictionaries, and sets themselves are unhashable. Because their contents can change, they do not provide a stable hash value, which makes them unsuitable as set elements.
For example:
# Works fine
set1 = {1, 2.0, 'Hello', (3,4)}
# Raises TypeError
set2 = {[1,2], {3: 4}}
Attempting to add a list or dict to a set raises TypeError. Sets can only contain hashable objects.
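When the data starts out as lists or sets, one common workaround is to convert each item to an immutable equivalent first: tuple for lists and the built-in frozenset for sets. The sketch below uses made-up data and also shows that a tuple containing a list is still unhashable.

```python
# A tuple is only hashable if everything inside it is hashable
try:
    {(1, [2, 3])}  # tuple containing a list
except TypeError as exc:
    print("TypeError:", exc)  # TypeError: unhashable type: 'list'

# Lists can be converted to tuples before being stored in a set
points = {tuple(p) for p in [[1, 2], [3, 4], [1, 2]]}
print(points == {(1, 2), (3, 4)})  # True

# frozenset is an immutable, hashable set, so it can be an element of another set
groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))  # 2: the first two frozensets are equal
```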
Sets vs Lists and Tuples
- Sets are unordered with no duplicate elements while lists and tuples maintain insertion order and allow duplicates.
- Sets provide fast membership testing and element removal. Membership tests on lists and tuples scan the elements one by one, and removing an element from a list is similarly linear (tuples cannot be modified at all).
- Sets generally take more memory than lists or tuples holding the same elements, because they maintain a hash table to support fast lookups.
- Sets are mutable unlike tuples but like lists. Elements can be added and removed freely.
- Sets can only contain hashable elements, whereas lists and tuples can hold any object.
So in summary, use sets when element order doesn’t matter, duplicates need to be removed, and fast membership testing is required. Otherwise, lists or tuples may be preferable.
Conclusion
In this comprehensive guide, we explored how to create sets in Python using curly braces {} and the set() constructor. We covered the characteristics of sets, operations like unions, intersections, differences and symmetric differences. We also looked at common set methods like add, remove, copy, clear etc. and how to do membership testing with sets.
Sets are a powerful Python data type that can optimize code by removing duplicates and providing fast lookups. By mastering set operations and methods, you can write cleaner and more efficient Python code. The concepts and examples provided in this guide should help you feel confident in using sets for a wide range of applications.
Some scenarios where sets shine include removing duplicate elements from a sequence, performing mathematical set operations like intersections and differences, and doing fast membership testing. Sets have many uses across a wide variety of problem domains.