The Python standard library is a vast repository of reusable code and modules that are available for Python programmers to use right out of the box. This extensive collection of modules covers a wide array of functionality like text processing, data types, system calls, networking, concurrency, image manipulation, archiving, file formats and much more. Mastering the standard library is an important step towards becoming an efficient and proficient Python developer.
In this comprehensive guide, we will explore some of the most essential and useful modules that are part of Python’s standard library. We will look at what they are used for along with examples to illustrate their key features.
Table of Contents
- Overview of the Python Standard Library
- System Modules
- Text Processing Modules
- Data Types Modules
- File and Directory Access Modules
- Data Compression and Archiving
- Cryptographic Services
- Internet Protocols and Support
- Multimedia Modules
- Database Access
- Web Programming Modules
- Concurrency and Parallelism
- Interprocess Communication
- Graphical User Interface Modules
- Testing Modules
Overview of the Python Standard Library
The Python standard library contains a set of modules that provide access to system functionality and pre-built components which can be integrated into Python programs. This avoids having to reinvent the wheel or rewrite commonly used code from scratch.
Some highlights of the Python standard library:
- It is stable, well documented and cross-platform compatible.
- Modules are written in C or Python for performance and extensibility.
- New modules are added in major Python releases while deprecated ones are removed.
- Modules like os, sys and datetime are indispensable for most Python programmers.
- Domain specific modules like html, http and zipfile are also included.
- External libraries can integrate seamlessly with standard library modules.
The standard library modules are well organized into functional categories which we will explore next.
System Modules
System modules interact with the interpreter and provide system-specific functionalities. Some commonly used system modules are:
The os Module
The os module provides functions to interact with the operating system. It can be used to read environment variables, create processes, change directories, and much more.
import os
# Get current working directory
current_dir = os.getcwd()
# List all files and directories
files = os.listdir('/User')
# Change directory
os.chdir('/User/Documents')
# Create directory
os.mkdir('newdir')
# Remove directory
os.rmdir('newdir')
The sys Module
The sys module provides system-specific information and functions. It can be used to get command line arguments, system platform details, exit the interpreter and more.
import sys
# Command line arguments
print(sys.argv)
# System platform
print(sys.platform)
# Python version
print(sys.version)
# Exit the interpreter
sys.exit()
The getopt Module
The getopt module helps in parsing command line arguments and options. It supports short and long option strings.
import getopt
import sys
# 'ho:v' declares -h and -v as flags and -o as an option that takes a value
opts, args = getopt.getopt(sys.argv[1:], 'ho:v', ['help', 'output='])
for opt, arg in opts:
    if opt in ('-h', '--help'):
        print('Display help')
    elif opt in ('-o', '--output'):
        print('Output file:', arg)
    elif opt == '-v':
        print('Version')
This makes it easy to write command-line interface programs by handling options.
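Because getopt accepts any list of strings, the parsing logic can be wrapped in a function and exercised without touching sys.argv. The following is a minimal sketch; the function name parse_args, the settings dictionary and the sample arguments are illustrative assumptions, not part of the getopt API.

```python
import getopt


def parse_args(argv):
    """Parse a list of command line tokens (hypothetical helper)."""
    settings = {"help": False, "output": None, "version": False}
    try:
        opts, positional = getopt.getopt(argv, "ho:v", ["help", "output="])
    except getopt.GetoptError as err:
        # Raised for unknown options or a missing option value
        settings["error"] = str(err)
        return settings, []
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            settings["help"] = True
        elif opt in ("-o", "--output"):
            settings["output"] = arg
        elif opt == "-v":
            settings["version"] = True
    return settings, positional


# Sample invocation with made-up arguments
settings, positional = parse_args(["-o", "report.txt", "-v", "input.csv"])
print(settings)
print(positional)
```

Anything left over after the options, such as input.csv here, is returned as a positional argument.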
Text Processing Modules
Python has powerful built-in text processing capabilities. Let’s go through some useful text handling modules.
The re Module
This module provides regular expression support in Python. It can be used for search, pattern matching and string manipulations.
import re
# Search for pattern
regex = r"aza"
if re.search(regex, "plaza"):
    print("Match found")
# Substitute text (avoid naming the variable str, which would shadow the built-in type)
text = "The quick brown fox"
print(re.sub('quick', 'lazy', text))  # The lazy brown fox
# Extract number from string
regex = r"(\d{4})"
match = re.search(regex, "ID is 1234")
print(match.group(1))  # 1234
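For patterns that are reused or that capture several fields, a compiled pattern with named groups tends to be easier to read. This is a small sketch; the sample log text and the names LOG_LINE and ENTRY are made up for illustration.

```python
import re

# Made-up sample text containing two log entries
LOG_LINE = "2019-05-12 ERROR disk full; 2019-05-13 INFO rebooted"

# Compile once, reuse many times; (?P<name>...) defines a named group
ENTRY = re.compile(r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+)")

# finditer yields one match object per occurrence
entries = [m.groupdict() for m in ENTRY.finditer(LOG_LINE)]
print(entries)

# sub also accepts a function that builds each replacement
shortened = ENTRY.sub(lambda m: m.group("level").lower(), LOG_LINE)
print(shortened)
```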
The string Module
The string module contains many useful constants and classes for string manipulation.
import string
# Useful string constants
print(string.ascii_letters)
print(string.digits)
# Template class for string substitutions
t = string.Template("$name is $years old")
print(t.substitute(name="John", years=20))
The textwrap Module
The textwrap module helps format text by wrapping paragraphs, adding indentation and so on.
import textwrap
text = "This is a very long text that needs to be wrapped properly."
# Wrap text
wrapped = textwrap.fill(text, width=50)
print(wrapped)
# Add indentation
wrapped = textwrap.indent(wrapped, '> ')
print(wrapped)
This allows modifying text easily for various display widths.
Data Types Modules
Python comes with many modules that provide advanced data types to hold complex data.
The datetime Module
The datetime module provides classes for working with dates and times.
import datetime
# Current date and time
now = datetime.datetime.now()
print(now)
# Date object
date = datetime.date(2019, 5, 12)
print(date.year)
# Time object
time = datetime.time(11, 34, 56)
print(time.minute)
# Arithmetic operations
from datetime import timedelta
print(now + timedelta(days=2))
This makes handling and manipulating dates much easier and more intuitive.
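Two other common tasks are measuring the gap between dates and converting between strings and datetime objects. The sketch below uses arbitrary sample dates and illustrative variable names.

```python
from datetime import date, datetime, timedelta

# Subtracting two dates yields a timedelta
start = date(2019, 5, 12)
end = date(2019, 6, 1)
gap = end - start
print(gap.days)

# Adding a timedelta moves a date forward
one_week_later = start + timedelta(weeks=1)
print(one_week_later)

# strptime parses text into a datetime; strftime formats it back
parsed = datetime.strptime("2019-05-12 11:34", "%Y-%m-%d %H:%M")
formatted = parsed.strftime("%Y/%m/%d %H:%M")
print(formatted)
```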
The math Module
The math module provides mathematical functions and constants.
import math
# Trigonometric functions
print(math.sin(2))
print(math.cos(math.pi))
# Logarithms
print(math.log(1024,2))
# Powers and rounding
print(math.pow(2,3))
print(math.ceil(3.7))
print(math.floor(3.7))
This module saves having to rewrite commonly used math functions.
The random Module
The random module generates pseudo-random numbers which are useful for simulations, games and randomized algorithms.
import random
# Random float: 0.0 to 1.0
print(random.random())
# Random integer: between a-b
print(random.randint(1,10))
# Choice from sequence
print(random.choice([1,5,7]))
# Shuffle a list
nums = [1,5,7,4]
random.shuffle(nums)
print(nums)
This provides easy generation of random data for various use cases.
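Since the numbers are pseudo-random, a fixed seed reproduces the same sequence, which is handy for tests and repeatable simulations. Here is a minimal sketch; the function name sample_rolls and the seed value are arbitrary choices for illustration.

```python
import random


def sample_rolls(seed, count=5):
    """Return `count` dice rolls from a generator seeded with `seed`."""
    # A separate Random instance leaves the module-level generator untouched
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(count)]


first = sample_rolls(42)
second = sample_rolls(42)
print(first)
print(first == second)  # the same seed gives the same sequence
```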
The statistics Module
The statistics module provides functions for mathematical statistics operations on numeric data.
import statistics
data = [2.5, 3.7, 5.2, 1.6]
# Mean
print(statistics.mean(data))
# Median
print(statistics.median(data))
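# Mode (when no value repeats, Python 3.8+ returns the first value; earlier versions raise StatisticsError)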
print(statistics.mode(data))
# Standard deviation
print(statistics.stdev(data))
This makes statistical analysis easy without having to write long statistical functions.
File and Directory Access Modules
Python provides many modules to work with files and directories on your file system.
The pathlib Module
The pathlib module provides an object-oriented approach to work with files and directories.
from pathlib import Path
# Folder Path
folder = Path("/User/Documents")
print(folder.exists())
# File Path
file = Path("data.txt")
# Write file first (creates or overwrites it), so the read below cannot fail on a missing file
file.write_text("Sample text")
# Read file
print(file.read_text())
This provides a cleaner way to handle file system paths.
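Path objects can also be combined with the / operator and searched recursively. The following sketch works inside a temporary directory so it leaves nothing behind; the folder and file names are invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)

    # The / operator joins path segments
    notes = base / "reports" / "notes.txt"

    # Create missing parent folders before writing
    notes.parent.mkdir(parents=True, exist_ok=True)
    notes.write_text("Sample text", encoding="utf-8")
    content = notes.read_text(encoding="utf-8")

    # rglob searches the directory tree recursively
    found = sorted(p.name for p in base.rglob("*.txt"))

    # Path components are available as attributes
    stem, suffix = notes.stem, notes.suffix

print(content, found, stem, suffix)
```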
The os.path Module
The os.path module contains functions to work with file and directory paths.
import os.path
# Join paths
print(os.path.join('User','test.txt'))
# Check if exists
print(os.path.exists('file.txt'))
# File size
print(os.path.getsize('file.txt'))
# File modification time
print(os.path.getmtime('file.txt'))
This helps in handling file system paths in a portable manner.
The glob Module
The glob module finds files and directories matching a pattern.
import glob
# Find files ending in .txt
print(glob.glob('*.txt'))
# Find Python files
print(glob.glob('*.py'))
# Search subdirectories too (the '**' pattern only recurses when recursive=True)
print(glob.glob('**/*.py', recursive=True))
This provides Unix-like pathname expansion capabilities.
The shutil Module
The shutil module offers high-level file operations like copying files, directories and permissions.
import shutil
# Copy file
shutil.copy('src.txt', 'dst.txt')
# Copy directory
shutil.copytree('src', 'dst')
# Move file
shutil.move('test.txt','newdir')
This provides an easy way to automate frequent file operations.
Data Compression and Archiving
Python has modules that support creation and unpacking of archive files.
The zlib Module
The zlib module provides data compression and decompression using the zlib library.
import zlib
# Compress
data = b"This will be compressed"
comp_data = zlib.compress(data)
# Decompress
original_data = zlib.decompress(comp_data)
This allows compressing and decompressing bytes data.
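To see the effect of compression, it helps to compare sizes on repetitive input and confirm the round trip. This is a short sketch with made-up sample data.

```python
import zlib

# Repetitive data compresses well
data = b"This will be compressed. " * 40

# The second argument is the compression level (9 = smallest output)
packed = zlib.compress(data, 9)
restored = zlib.decompress(packed)

# crc32 gives a quick integrity checksum
checksum = zlib.crc32(data)

print(len(data), len(packed))
print(restored == data)
```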
The gzip Module
The gzip module supports gzip file compression and decompression.
import gzip
# Open compressed gzip file
with gzip.open('file.txt.gz', 'rb') as f:
    file_content = f.read()
# Write compressed gzip file
with gzip.open('comp_file.txt.gz', 'wb') as f:
    f.write(b'Hello World')
This provides a handy way to work with gzip compressed files.
The zipfile Module
The zipfile module provides support for zip archive files.
import zipfile
# Read zip archive
with zipfile.ZipFile('file.zip', 'r') as z:
    # List the archive members
    print(z.namelist())
    # Read one member as bytes
    data = z.read('test.txt')
# Write zip archive
with zipfile.ZipFile('new.zip', 'w') as z:
    z.write('file.txt')
This makes it easy to read and write .zip archives in Python.
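An archive does not have to live on disk. As a sketch, the example below builds a zip file in memory with io.BytesIO and reads it back; the member names and contents are invented for illustration.

```python
import io
import zipfile

buffer = io.BytesIO()

# Write two members straight from strings, with DEFLATE compression
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("docs/readme.txt", "Hello from the archive")
    archive.writestr("data.csv", "id,name\n1,John\n")

# Reopen the same buffer for reading
with zipfile.ZipFile(buffer, "r") as archive:
    names = archive.namelist()
    readme = archive.read("docs/readme.txt").decode("utf-8")

print(names)
print(readme)
```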
Cryptographic Services
Python has modules that support cryptographic services for securing data.
The hashlib Module
The hashlib module implements hash algorithms that generate fixed-size digests from arbitrary data.
import hashlib
# MD5 hash (suitable for checksums only; MD5 is no longer considered secure against collisions)
hash_md5 = hashlib.md5()
hash_md5.update(b"Hello")
print(hash_md5.hexdigest())
# SHA256 hash
hash_sha256 = hashlib.sha256()
hash_sha256.update(b"secure text")
print(hash_sha256.hexdigest())
This makes generating cryptographic hashes easy.
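Large inputs can be hashed in chunks, because repeated update() calls are equivalent to hashing the concatenated data. The helper name sha256_of_stream and the chunk size below are assumptions made for this sketch.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=8192):
    """Hash a binary stream piece by piece (hypothetical helper)."""
    digest = hashlib.sha256()
    # iter() with a sentinel keeps reading until read() returns b""
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


payload = b"secure text" * 10_000
streamed = sha256_of_stream(io.BytesIO(payload))
one_shot = hashlib.sha256(payload).hexdigest()
print(streamed == one_shot)
```

The same function works on a file opened in binary mode, so the whole file never has to be held in memory.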
The secrets Module
The secrets module is used to generate cryptographically strong random numbers suitable for security applications.
import secrets
# Generate secure token
token = secrets.token_hex(16)
print(token)
# Generate secure random number
rand = secrets.randbelow(10)
print(rand)
This is preferred over the random module for security or cryptography use cases.
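As a sketch of typical uses, the example below builds a random password, creates a URL-safe token and compares two values in constant time. The password length and variable names are arbitrary choices.

```python
import secrets
import string

# Pick each character with a cryptographically strong generator
alphabet = string.ascii_letters + string.digits
password = "".join(secrets.choice(alphabet) for _ in range(12))

# URL-safe token, e.g. for a password reset link
reset_token = secrets.token_urlsafe(32)

# Constant-time comparison helps avoid timing attacks
matches = secrets.compare_digest(reset_token, reset_token)

print(len(password), matches)
```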
Internet Protocols and Support
Python has good support for most internet protocols and standards.
The urllib Module
The urllib module allows making HTTP requests and opening remote URLs.
from urllib import request
# HTTP GET request
resp = request.urlopen("https://httpbin.org/get")
print(resp.read())
# Custom headers
headers = {'User-Agent': 'Python3'}
req = request.Request("https://httpbin.org/get", headers=headers)
resp = request.urlopen(req)
print(resp.headers)
This provides a simple way to access resources across the web.
The json Module
The json module provides JSON support with functions to encode and decode data.
import json
# Parse JSON
data = '{"name": "John", "age": 30}'
user = json.loads(data)
print(user)
# Convert to JSON
cars = [{'make':'bmw', 'model':320}, {'make':'audi', 'model':550}]
car_data = json.dumps(cars)
print(car_data)
This makes conversion to and from JSON easy when working with web APIs.
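Two details are worth seeing in action: Python values such as True and None map to JSON true and null, and invalid input raises json.JSONDecodeError. The record below is made-up sample data.

```python
import json

record = {"name": "John", "age": 30, "tags": ["admin", "dev"],
          "active": True, "manager": None}

# indent and sort_keys produce stable, human-readable output
encoded = json.dumps(record, indent=2, sort_keys=True)
print(encoded)

# Decoding restores an equal Python object
decoded = json.loads(encoded)

# JSON requires double quotes, so this input is rejected
try:
    json.loads("{'name': 'John'}")
    error = None
except json.JSONDecodeError as exc:
    error = exc.msg
print(error)
```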
The smtplib Module
The smtplib module provides SMTP client functionality to send emails.
import smtplib
# Connect to SMTP server
smtp = smtplib.SMTP('smtp.mailserver.com', 587)
# Port 587 normally expects the connection to be upgraded to TLS before sending
smtp.starttls()
# Send email: headers first, then a blank line, then the body
email_msg = """\
From: [email protected]
To: [email protected]
Subject: Email from Python

Hello,
This is a test email sent from Python!
"""
smtp.sendmail('[email protected]', '[email protected]', email_msg)
smtp.quit()
This offers a simple way to automate sending emails from Python scripts.
The poplib Module
The poplib module implements POP3 client functionality to access email from a POP server.
import poplib
# Connect to POP3 server
pop_conn = poplib.POP3_SSL('pop.mailserver.com')
pop_conn.user('username')
pop_conn.pass_('password')
# Get message stats: (message count, mailbox size in bytes)
count, size = pop_conn.stat()
# Print each message line by line
for i in range(count):
    for line in pop_conn.retr(i + 1)[1]:
        print(line)
# Close connection
pop_conn.quit()
This provides a way to easily fetch email messages using POP3 in Python.
Multimedia Modules
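Python can work with multimedia files like images, audio and video. The standard library's own support is limited (the imghdr module below), while Pillow and pygame are third-party packages that must be installed separately.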
The imghdr Module
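The imghdr module determines the type of an image file based on the magic number in its contents. Note that imghdr was deprecated in Python 3.11 and removed in Python 3.13, so this example only runs on older versions.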
import imghdr
file_type = imghdr.what('/User/image.jpg')
print(file_type) # jpeg
This can identify and validate different image file formats.
The PIL Module
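The Python Imaging Library (PIL) provides image processing capabilities. It is not part of the standard library: it is maintained today as the third-party Pillow package, which is installed with pip and still imported as PIL.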
from PIL import Image
# Open image
img = Image.open('image.jpg')
# Image information
print(img.format, img.size, img.mode)
# Thumbnail
img.thumbnail((128, 128))
img.save('thumbnail.jpg')
This makes manipulation like resizing, cropping, filtering easy to implement programmatically.
The pygame Module
The pygame module enables game development and multimedia applications. It is a third-party package installed with pip, not part of the standard library.
import pygame
# Initialize pygame
pygame.init()
# Create game screen
screen = pygame.display.set_mode((600,400))
# Load image
image = pygame.image.load('img.png')
# Play sound
sound = pygame.mixer.Sound('sound.wav')
sound.play()
This provides useful functionality for building games, playing audio etc.
Database Access
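Python interfaces with databases using dedicated modules. Of the three covered here, only sqlite3 ships with Python; the MySQL and PostgreSQL drivers are third-party packages.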
The sqlite3 Module
The sqlite3 module enables working with SQLite databases which are serverless and embedded into the end program.
import sqlite3
# Connect to SQLite database
conn = sqlite3.connect('database.db')
# Execute SQL query
cursor = conn.execute("SELECT * FROM users")
# Fetch data
for row in cursor:
    print(row)
# Insert data
conn.execute("INSERT INTO users VALUES ('John', 25)")
# Save changes
conn.commit()
# Close connection
conn.close()
This provides a light-weight data storage option that comes built into Python.
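A self-contained way to experiment is an in-memory database, which needs no file and no existing tables. The sketch below also uses ? placeholders, which let sqlite3 handle quoting and protect against SQL injection; the table layout and rows are invented sample data.

```python
import sqlite3

# ":memory:" creates a temporary database that vanishes on close
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")

# executemany inserts one row per tuple; ? marks a bound parameter
people = [("John", 25), ("Alice", 31), ("Bob", 19)]
conn.executemany("INSERT INTO users VALUES (?, ?)", people)
conn.commit()

# Parameters are passed separately from the SQL text
adults = conn.execute(
    "SELECT name FROM users WHERE age >= ? ORDER BY name", (21,)
).fetchall()
print(adults)

conn.close()
```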
The mysql-connector Module
The mysql-connector module allows accessing a MySQL database. It is a third-party driver, installed with pip as mysql-connector-python, and not part of the standard library.
import mysql.connector
# Connect to MySQL database
conn = mysql.connector.connect(
    host="localhost",
    user="root",
    password="password",
    database="sales"
)
# Query database
cursor = conn.cursor()
cursor.execute("SELECT * FROM customer")
# Print data
for row in cursor:
    print(row)
# Close connection
conn.close()
This makes interaction with MySQL servers easy and straightforward.
The psycopg2 Module
The psycopg2 module enables working with PostgreSQL databases. Like the MySQL driver, it is a third-party package installed with pip.
import psycopg2
# Connect to PostgreSQL
conn = psycopg2.connect(
    host="localhost",
    database="testdb",
    user="postgres",
    password="secret"
)
# Execute query
cursor = conn.cursor()
cursor.execute("SELECT version()")
# Fetch row
data = cursor.fetchone()
print(data)
# Close connection
conn.close()
This provides an efficient way to connect to Postgres from Python.
Web Programming Modules
The standard library includes a few modules that help with web-related tasks.
The cgi Module
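The cgi module facilitates development of Python scripts that run behind a web server as CGI programs. Note that cgi was deprecated in Python 3.11 and removed in Python 3.13, so this example only runs on older versions.
import cgi
import html
# Get form data
form = cgi.FieldStorage()
# Fall back to a default when the field is missing, and escape user input before echoing it
name = html.escape(form.getvalue('name', 'World'))
print("Content-Type: text/html")
print()
print("<h1> Hello " + name + "</h1>")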
This allows processing and responding to HTTP requests from web forms.
The webbrowser Module
The webbrowser module provides functionality to open web browsers from Python.
import webbrowser
# Open browser to Google
webbrowser.open('https://google.com')
# Open multiple pages
webbrowser.open_new_tab('https://docs.python.org')
webbrowser.open_new('https://github.com')
This facilitates opening web pages programmatically from Python scripts.
The urllib.parse Module
The urllib.parse module provides handy functions to handle URLs.
from urllib.parse import urlparse
url = 'http://netloc/path;parameters?query=args#fragment'
# Parse URL components
result = urlparse(url)
print(result)
# Combine components into URL
from urllib.parse import urlunparse
data = ['http','netloc','path','parameters','query','fragment']
print(urlunparse(data))
This simplifies dissecting and constructing URLs when working with web apps.
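Query strings are a frequent source of bugs when handled by hand, so it is usually safer to let urllib.parse decode and encode them. This sketch uses a made-up example.com URL.

```python
from urllib.parse import parse_qs, urlencode, urlparse

url = "https://example.com/search?q=python+stdlib&page=2#results"
parts = urlparse(url)

# parse_qs decodes the query into a dict of lists
query = parse_qs(parts.query)
print(query)

# urlencode escapes values when building a new query string
next_query = urlencode({"q": "python stdlib", "page": 3})

# The parse result is a named tuple, so _replace swaps components
next_url = parts._replace(query=next_query, fragment="").geturl()
print(next_url)
```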
Concurrency and Parallelism
Python provides constructs to implement concurrent and parallel execution of code.
The threading Module
The threading module supports spawning multiple threads to run code concurrently within a process.
import threading
# Thread class
class PrintThread(threading.Thread):
    def run(self):
        # The name attribute replaces the deprecated getName() method
        print(threading.current_thread().name)
# Create threads
t1 = PrintThread(name="Thread-1")
t2 = PrintThread(name="Thread-2")
# Start threads
t1.start()
t2.start()
# Wait for completion
t1.join()
t2.join()
This allows basic concurrent operations using threads.
The multiprocessing Module
The multiprocessing module provides an API similar to threading but with multiple processes instead of threads.
from multiprocessing import Process, current_process
# Process class
class PrintProcess(Process):
    def run(self):
        # Report the name of the process this code is running in
        print(current_process().name)
# The guard stops child processes from re-running this block on platforms that start a fresh interpreter
if __name__ == "__main__":
    # Create processes
    p1 = PrintProcess(name="Process-1")
    p2 = PrintProcess(name="Process-2")
    # Run processes
    p1.start()
    p2.start()
    # Wait to finish
    p1.join()
    p2.join()
This allows CPU-bound work to run in parallel across cores, since each process has its own interpreter.
The concurrent.futures Module
The concurrent.futures module provides high level abstractions like thread pools and process pools for concurrent operations.
from concurrent.futures import ThreadPoolExecutor
# Function to be executed
def print_thread(name):
    print(name)
# Create thread pool
pool = ThreadPoolExecutor(max_workers=2)
# Submit task to pool
pool.submit(print_thread, name="Thread1")
pool.submit(print_thread, name="Thread2")
# Shut down pool
pool.shutdown()
This module offers a clean interface for managing concurrency via pools of threads and processes.
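Executors also return results, which the simplest usage does not show. The sketch below collects return values with both map() and submit(); the function word_count and the sample documents are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def word_count(text):
    """Count whitespace-separated words (sample task)."""
    return len(text.split())


documents = ["the quick brown fox", "hello world", "batteries included"]

# The with-statement shuts the pool down automatically
with ThreadPoolExecutor(max_workers=2) as pool:
    # map returns results in input order
    counts = list(pool.map(word_count, documents))

    # submit returns a Future; as_completed yields them as they finish
    futures = {pool.submit(word_count, doc): doc for doc in documents}
    by_doc = {futures[f]: f.result() for f in as_completed(futures)}

print(counts)
print(by_doc)
```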
Interprocess Communication
Python has modules that allow separate processes to communicate with each other.
The multiprocessing.managers Module
The multiprocessing.managers module supports sharing objects between processes to coordinate their actions.
from multiprocessing import Process, Manager
# Processes can access shared object
def update_dict(d):
    d['key'] = 'value'
if __name__ == "__main__":
    # Manager to create shared objects (created inside the guard so child processes do not start their own)
    manager = Manager()
    shared_dict = manager.dict()
    p = Process(target=update_dict, args=(shared_dict,))
    p.start()
    p.join()
    # Print updated value
    print(shared_dict['key'])
This enables simple synchronization between processes.
The multiprocessing.queues Module
The multiprocessing.queues module provides queue data structures to pass messages safely between processes.
from multiprocessing import Process, Queue
# Function to handle queue
def handle_queue(q):
    while True:
        msg = q.get()
        # None is used here as a sentinel that tells the worker to stop
        if msg is None:
            break
        print(msg)
# Main process
if __name__ == "__main__":
    q = Queue()
    p = Process(target=handle_queue, args=(q,))
    p.start()
    q.put('Hello')
    q.put('World')
    # Signal the worker to exit so join() does not block forever
    q.put(None)
    # Wait for process
    p.join()
Queues offer reliable communication between processes executing independently.
The socket Module
The socket module provides access to the BSD socket interface for inter-process communication over a network interface.
import socket
# Create TCP socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Bind and listen for connections
s.bind(('localhost', 5000))
s.listen(1)
while True:
    # Wait for client connection
    conn, addr = s.accept()
    # Read message
    data = conn.recv(1024)
    print('Received:', data.decode())
    # Send reply (sendall keeps sending until the whole message is written)
    conn.sendall(b'Acknowledged')
    # Close connection
    conn.close()
Sockets allow bidirectional client-server communication between processes, whether on the same machine or across a network.
Graphical User Interface Modules
Python supports creating desktop GUIs using bindings to popular cross-platform widget toolkits.
Tkinter Module
The tkinter module is Python’s default GUI framework that wraps Tcl/Tk widgets.
import tkinter as tk
# Root window
root = tk.Tk()
# Widgets
btn = tk.Button(root, text="Click Me")
lbl = tk.Label(root, text="Hello GUI")
# Layout
btn.pack()
lbl.pack()
# Start GUI loop
root.mainloop()
Tkinter provides a simple and intuitive way to build GUI apps that run on Windows, Mac and Linux.
PyQt Module
PyQt provides Python bindings for the Qt application framework and tools for building UIs. It is a third-party package installed with pip, not part of the standard library.
from PyQt5.QtWidgets import QApplication, QLabel
app = QApplication([])
label = QLabel('Hello PyQt')
label.show()
app.exec_()
Qt based tools like PyQt and PySide are popular choices for building responsive UIs.
Testing Modules
Python comes with a few modules to support writing tests for programs.
The unittest Module
The unittest module provides a unit testing framework to implement test cases and test suites.
import unittest
# Test case
class TestStringMethods(unittest.TestCase):
    def test_upper(self):
        self.assertEqual('abc'.upper(), 'ABC')
    def test_lower(self):
        self.assertEqual('ABC'.lower(), 'abc')
if __name__ == '__main__':
    unittest.main()
This allows automated testing during development with support for test fixtures, assertions and mocking.
The doctest Module
The doctest module allows creating test cases by extracting snippets from documentation strings in source code.
# Named add rather than sum to avoid shadowing the built-in sum()
def add(a, b):
    """
    >>> add(2, 3)
    5
    >>> add(5, 7)
    12
    """
    return a + b
if __name__ == "__main__":
    import doctest
    doctest.testmod()
Doctests ensure code examples in documentation remain valid and up to date.
This covers some of the most useful modules available in Python’s standard library, along with a few popular third-party packages. There are many more standard modules for specialized tasks such as networking, logging, container data types and command-line parsing. Check the Python documentation for a complete list of standard library modules.
The standard library truly makes Python a batteries included language with reusable modules to quickly build applications, scripts, frameworks and tools. Learning to leverage the standard library effectively can boost productivity and reduce reinvention of the wheel!