In-Depth Guide to Python's Standard Library Modules

Updated: at 04:23 AM

The Python standard library is a large collection of modules that ships with every Python installation and can be used without installing anything extra. It covers text processing, data types, operating system interfaces, networking, concurrency, archiving, file formats and much more. Knowing the standard library well is an important step towards becoming an efficient and proficient Python developer.

In this comprehensive guide, we will explore some of the most essential and useful modules that are part of Python’s standard library. We will look at what they are used for along with examples to illustrate their key features.

Overview of the Python Standard Library

The Python standard library contains a set of modules that provide access to system functionality and pre-built components which can be integrated into Python programs. This avoids having to reinvent the wheel or rewrite commonly used code from scratch.

Some highlights of the Python standard library:

The standard library modules are well organized into functional categories which we will explore next.

System Modules

System modules interact with the interpreter and provide system-specific functionalities. Some commonly used system modules are:

The os Module

The os module provides functions to interact with the operating system. It can be used to read environment variables, create processes, change directories, and much more.

import os

# Get current working directory
current_dir = os.getcwd()

# List all files and directories
files = os.listdir('/User')

# Change directory
os.chdir('/User/Documents')

# Create directory
os.mkdir('newdir')

# Remove directory
os.rmdir('newdir')

The sys Module

The sys module provides system-specific information and functions. It can be used to get command line arguments, system platform details, exit the interpreter and more.

import sys

# Command line arguments
print(sys.argv)

# System platform
print(sys.platform)

# Python version
print(sys.version)

# Exit the interpreter
sys.exit()

The getopt Module

The getopt module helps in parsing command line arguments and options. It supports short and long option strings.

import getopt
import sys

# 'ho:v' defines -h, -o <value> and -v; the list defines --help and --output=<value>
opts, args = getopt.getopt(sys.argv[1:], 'ho:v', ['help', 'output='])

for opt, arg in opts:
  if opt in ('-h', '--help'):
    print('Display help')
  elif opt in ('-o', '--output'):
    print('Output file:', arg)
  elif opt == '-v':
    print('Version')

This makes it easy to write command-line interface programs by handling options.
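The option handling described above can also be tried without a real command line by passing an explicit argument list. The following is a minimal sketch; the helper name `parse_cli`, the settings dictionary and the sample arguments are assumptions made for illustration, not part of getopt itself.

```python
import getopt

def parse_cli(argv):
    """Hypothetical helper: turn an argument list into a settings dict."""
    settings = {"help": False, "output": None, "version": False, "rest": []}
    try:
        # Same option definitions as the example above
        opts, rest = getopt.getopt(argv, "ho:v", ["help", "output="])
    except getopt.GetoptError as err:
        # Unknown options or missing values end up here
        settings["error"] = str(err)
        return settings
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            settings["help"] = True
        elif opt in ("-o", "--output"):
            settings["output"] = arg
        elif opt == "-v":
            settings["version"] = True
    settings["rest"] = rest  # positional arguments left after the options
    return settings

result = parse_cli(["-v", "--output", "report.txt", "input.csv"])
print(result)
```

Keeping the parsing in a function that accepts a list makes it straightforward to exercise from tests, since nothing depends on sys.argv.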

Text Processing Modules

Python has powerful built-in text processing capabilities. Let’s go through some useful text handling modules.

The re Module

This module provides regular expression support in Python. It can be used for search, pattern matching and string manipulations.

import re

# Search for pattern
regex = r"aza"
if re.search(regex, "plaza"):
  print("Match found")

# Substitute text (avoid naming the variable str, which would shadow the built-in type)
text = "The quick brown fox"
print(re.sub('quick', 'lazy', text))  # The lazy brown fox

# Extract number from string
regex = r"(\d{4})"
matches = re.search(regex, "ID is 1234")
print(matches.group(1)) #1234

The string Module

The string module contains many useful constants and classes for string manipulation.

import string

# Useful string constants
print(string.ascii_letters)
print(string.digits)

# Template class for string substitutions
t = string.Template("$name is $years old")
print(t.substitute(name="John", years=20))

The textwrap Module

The textwrap module formats text by wrapping paragraphs to a given width, adding indentation and so on.

import textwrap

text = "This is a very long text that needs to be wrapped properly."

# Wrap text
wrapped = textwrap.fill(text, width=50)
print(wrapped)

# Add indentation
wrapped = textwrap.indent(wrapped, '> ')
print(wrapped)

This allows modifying text easily for various display widths.
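Beyond fill and indent, textwrap also offers dedent, wrap and shorten. The sketch below is illustrative only; the sample string and the chosen widths are assumptions.

```python
import textwrap

raw = """
    Indented text copied from source code
    often carries leading whitespace.
"""

# Remove the common leading whitespace, then trim the surrounding blank lines
clean = textwrap.dedent(raw).strip()

# wrap() returns a list of lines instead of one joined string
lines = textwrap.wrap(clean, width=30)

# shorten() collapses whitespace and truncates to fit the width
summary = textwrap.shorten(clean, width=25, placeholder="...")

print(lines)
print(summary)
```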

Data Types Modules

Python comes with many modules that provide advanced data types to hold complex data.

The datetime Module

The datetime module provides classes for working with dates and times.

import datetime

# Current date and time
now = datetime.datetime.now()
print(now)

# Date object
date = datetime.date(2019, 5, 12)
print(date.year)

# Time object
time = datetime.time(11, 34, 56)
print(time.minute)

# Arithmetic operations
from datetime import timedelta
print(now + timedelta(days=2))

This makes handling and manipulating dates much easier and more intuitive.
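Formatting and parsing are two further common date tasks. As a small sketch (the timestamps and format strings are arbitrary choices for illustration), strftime turns a datetime into text, strptime does the reverse, and subtracting two datetimes yields a timedelta:

```python
from datetime import datetime, timedelta

stamp = datetime(2019, 5, 12, 11, 34, 56)

# Format a datetime as text
formatted = stamp.strftime("%Y-%m-%d %H:%M")

# Parse text back into a datetime (the time defaults to midnight)
parsed = datetime.strptime("2019-05-14", "%Y-%m-%d")

# Subtracting two datetimes gives a timedelta
gap = parsed - stamp

print(formatted, parsed.date(), gap)
```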

The math Module

The math module provides mathematical functions and constants.

import math

# Trigonometric functions
print(math.sin(2))
print(math.cos(math.pi))

# Logarithms
print(math.log(1024,2))

# Powers and rounding
print(math.pow(2,3))
print(math.ceil(3.7))
print(math.floor(3.7))

This module saves having to rewrite commonly used math functions.

The random Module

The random module generates pseudo-random numbers which are useful for simulations, games and randomized algorithms.

import random

# Random float in the range [0.0, 1.0)
print(random.random())

# Random integer between 1 and 10, both ends included
print(random.randint(1,10))

# Choice from sequence
print(random.choice([1,5,7]))

# Shuffle a list
nums = [1,5,7,4]
random.shuffle(nums)
print(nums)

This provides easy generation of random data for various use cases.
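For simulations and tests it often helps to make the random sequence repeatable. One way to do that, sketched here with an arbitrary seed value of 42, is to create separate random.Random instances with the same seed:

```python
import random

# Two generators created with the same seed produce the same sequence
rng_a = random.Random(42)
rng_b = random.Random(42)

rolls_a = [rng_a.randint(1, 6) for _ in range(5)]
rolls_b = [rng_b.randint(1, 6) for _ in range(5)]

# sample() picks unique elements without replacement
sample = rng_a.sample(range(100), k=3)

print(rolls_a, rolls_b, sample)
```

Using dedicated instances instead of the module-level functions keeps the seeded sequence isolated from other code that also draws random numbers.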

The statistics Module

The statistics module provides functions for mathematical statistics operations on numeric data.

import statistics

data = [2.5, 3.7, 5.2, 1.6]

# Mean
print(statistics.mean(data))

# Median
print(statistics.median(data))

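# Mode (no value repeats here, so Python 3.8+ returns the first value; older versions raise StatisticsError)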
print(statistics.mode(data))

# Standard deviation
print(statistics.stdev(data))

This makes statistical analysis easy without having to write long statistical functions.

File and Directory Access Modules

Python provides many modules to work with files and directories on your file system.

The pathlib Module

The pathlib module provides an object-oriented approach to work with files and directories.

from pathlib import Path

# Folder Path
folder = Path("/User/Documents")
print(folder.exists())

# File Path
file = Path("data.txt")

# Write file (done first so that the read below does not fail on a missing file)
file.write_text("Sample text")

# Read file
print(file.read_text())

This provides a cleaner way to handle file system paths.
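Paths can also be combined with the / operator and searched recursively. The sketch below works inside a temporary directory so that it leaves nothing behind; the directory and file names are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)

    # Build a nested path with the / operator
    notes = base / "docs" / "notes.txt"
    notes.parent.mkdir(parents=True, exist_ok=True)

    # Write, then read back
    notes.write_text("Sample text", encoding="utf-8")
    content = notes.read_text(encoding="utf-8")

    # Recursively find all .txt files below base
    found = sorted(p.name for p in base.rglob("*.txt"))

    # Path components are available as attributes
    suffix, stem = notes.suffix, notes.stem

print(content, found, suffix, stem)
```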

The os.path Module

The os.path module contains functions to work with file and directory paths.

import os.path

# Join paths
print(os.path.join('User','test.txt'))

# Check if exists
print(os.path.exists('file.txt'))

# File size
print(os.path.getsize('file.txt'))

# File modification time
print(os.path.getmtime('file.txt'))

This helps in handling file system paths in a portable manner.

The glob Module

The glob module finds files and directories matching a pattern.

import glob

# Find files ending in .txt
print(glob.glob('*.txt'))

# Find Python files
print(glob.glob('*.py'))

# Recursive search: with recursive=True, '**' matches any number of subdirectories
print(glob.glob('**/*.py', recursive=True))

This provides Unix-like pathname expansion capabilities.

The shutil Module

The shutil module offers high-level file operations such as copying and moving files and whole directory trees.

import shutil

# Copy file
shutil.copy('src.txt', 'dst.txt')

# Copy directory
shutil.copytree('src', 'dst')

# Move file
shutil.move('test.txt','newdir')

This provides an easy way to automate frequent file operations.

Data Compression and Archiving

Python has modules that support creation and unpacking of archive files.

The zlib Module

The zlib module provides data compression and decompression using the zlib library.

import zlib

# Compress
data = b"This will be compressed"
comp_data = zlib.compress(data)

# Decompress
original_data = zlib.decompress(comp_data)

This allows compressing and decompressing bytes data.
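A quick round trip shows that decompression restores the original bytes and how much space repetitive data can save. This is only a sketch; the payload and the compression level 9 are arbitrary choices.

```python
import zlib

# Repetitive data compresses well
payload = b"standard library " * 200

packed = zlib.compress(payload, 9)   # 9 = highest compression level
restored = zlib.decompress(packed)

ratio = len(packed) / len(payload)
checksum = zlib.crc32(payload)       # CRC-32 checksum, handy for integrity checks

print(len(payload), len(packed), round(ratio, 3), checksum)
```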

The gzip Module

The gzip module supports gzip file compression and decompression.

import gzip

# Open compressed gzip file
with gzip.open('file.txt.gz', 'rb') as f:
  file_content = f.read()

# Write compressed gzip file
with gzip.open('comp_file.txt.gz', 'wb') as f:
  f.write(b'Hello World')

This provides a handy way to work with gzip compressed files.

The zipfile Module

The zipfile module provides support for zip archive files.

import zipfile

# Read zip archive
with zipfile.ZipFile('file.zip', 'r') as z:
  print(z.namelist())         # names of the archive members
  print(z.read('test.txt'))   # contents of one member, as bytes

# Write zip archive
with zipfile.ZipFile('new.zip','w') as z:
  z.write('file.txt')

This makes it easy to read and write .zip archives in Python.

Cryptographic Services

Python has modules that support cryptographic services for securing data.

The hashlib Module

The hashlib module implements common hash algorithms that generate fixed-size digests from arbitrary data.

import hashlib

# MD5 hash (fine for checksums, but no longer considered secure against collisions)
hash_md5 = hashlib.md5()
hash_md5.update(b"Hello")
print(hash_md5.hexdigest())

# SHA256 hash
hash_sha256 = hashlib.sha256()
hash_sha256.update(b"secure text")
print(hash_sha256.hexdigest())

This makes generating cryptographic hashes easy.
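Because update can be called repeatedly, large inputs can be hashed piece by piece instead of being loaded into memory at once. The sketch below uses a small list of byte chunks as a stand-in for data read from a file or socket.

```python
import hashlib

# Stand-in for chunks read from a file or network stream
chunks = [b"Hello, ", b"standard ", b"library"]

hasher = hashlib.sha256()
for chunk in chunks:
    hasher.update(chunk)   # feed the data incrementally
incremental = hasher.hexdigest()

# Hashing the joined data in one call gives the same digest
one_shot = hashlib.sha256(b"".join(chunks)).hexdigest()

print(incremental == one_shot)
```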

The secrets Module

The secrets module is used to generate cryptographically strong random numbers suitable for security applications.

import secrets

# Generate secure token
token = secrets.token_hex(16)
print(token)

# Generate secure random number
rand = secrets.randbelow(10)
print(rand)

This is preferred over the random module for security or cryptography use cases.

Internet Protocols and Support

Python has good support for most internet protocols and standards.

The urllib Module

The urllib module allows making HTTP requests and opening remote URLs.

from urllib import request

# HTTP GET request
resp = request.urlopen("https://httpbin.org/get")
print(resp.read())

# Custom headers
headers = {'User-Agent': 'Python3'}
req = request.Request("https://httpbin.org/get", headers=headers)
resp = request.urlopen(req)
print(resp.headers)

This provides a simple way to access resources across the web.

The json Module

The json module provides JSON support with functions to encode and decode data.

import json

# Parse JSON
data = '{"name": "John", "age": 30}'
user = json.loads(data)
print(user)

# Convert to JSON
cars = [{'make':'bmw', 'model':320}, {'make':'audi', 'model':550}]
car_data = json.dumps(cars)
print(car_data)

This makes conversion to and from JSON easy when working with web APIs.
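A round trip through dumps and loads shows which Python types survive the conversion. In this short sketch, the indent and sort_keys arguments only affect how the text is laid out, and a tuple comes back as a list because JSON has no tuple type.

```python
import json

cars = [{"make": "bmw", "model": 320}, {"make": "audi", "model": 550}]

# Pretty-print with stable key order
text = json.dumps(cars, indent=2, sort_keys=True)
restored = json.loads(text)

# Tuples are encoded as JSON arrays and decode as lists
point = json.loads(json.dumps({"xy": (1, 2)}))

print(text)
print(point)
```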

The smtplib Module

The smtplib module provides SMTP client functionality to send emails.

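import smtplib

# Connect to SMTP server
smtp = smtplib.SMTP('smtp.mailserver.com', 587)

# Port 587 normally expects STARTTLS and a login; the credentials are placeholders
smtp.starttls()
smtp.login('username', 'password')

# Send email. The message must begin with its headers, so the string
# starts right after the opening quotes. The addresses are placeholders.
email_msg = """\
From: sender@example.com
To: recipient@example.com
Subject: Email from Python

Hello,
This is a test email sent from Python!
"""
smtp.sendmail('sender@example.com', 'recipient@example.com', email_msg)
smtp.quit()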

This offers a simple way to automate sending emails from Python scripts.

The poplib Module

The poplib module implements POP3 client functionality to access email from a POP server.

import poplib

# Connect to POP3 server
pop_conn = poplib.POP3_SSL('pop.mailserver.com')
pop_conn.user('username')
pop_conn.pass_('password')

# Get message stats: (message count, mailbox size in bytes)
count, size = pop_conn.stat()

# Print each message line by line (lines are returned as bytes)
for i in range(count):
  for line in pop_conn.retr(i + 1)[1]:
    print(line.decode('utf-8', errors='replace'))

# Close connection
pop_conn.quit()

This provides a way to easily fetch email messages using POP3 in Python.

Multimedia Modules

The standard library itself offers only basic multimedia support. Image processing and game development rely on third-party packages that are installed separately with pip.

The imghdr Module

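The imghdr module determines the type of an image file from the magic number at the start of its contents. It was deprecated in Python 3.11 and removed in Python 3.13, so the example below only runs on older versions.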

import imghdr

file_type = imghdr.what('/User/image.jpg')
print(file_type) # jpeg

This can identify and validate different image file formats.

The PIL Module

The Python Imaging Library (PIL), available today as the third-party Pillow package, provides image processing capabilities. It is not part of the standard library.

from PIL import Image

# Open image
img = Image.open('image.jpg')

# Image information
print(img.format, img.size, img.mode)

# Thumbnail
img.thumbnail((128, 128))
img.save('thumbnail.jpg')

This makes manipulation like resizing, cropping, filtering easy to implement programmatically.

The pygame Module

The pygame package, also third-party rather than part of the standard library, enables game development and multimedia applications.

import pygame

# Initialize pygame
pygame.init()

# Create game screen
screen = pygame.display.set_mode((600,400))

# Load image
image = pygame.image.load('img.png')

# Play sound
sound = pygame.mixer.Sound('sound.wav')
sound.play()

This provides useful functionality for building games, playing audio etc.

Database Access

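Python interfaces with databases using dedicated modules. Of the modules below, only sqlite3 ships with the standard library; the MySQL and PostgreSQL drivers are third-party packages.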

The sqlite3 Module

The sqlite3 module enables working with SQLite databases which are serverless and embedded into the end program.

import sqlite3

# Connect to SQLite database
conn = sqlite3.connect('database.db')

# Execute SQL query
cursor = conn.execute("SELECT * FROM users")

# Fetch data
for row in cursor:
  print(row)

# Insert data
conn.execute("INSERT INTO users VALUES ('John', 25)")

# Save changes
conn.commit()

# Close connection
conn.close()

This provides a light-weight data storage option that comes built into Python.
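To try sqlite3 without an existing database file, an in-memory database can be used. The sketch below also passes values through ? placeholders instead of building SQL strings by hand, which avoids quoting problems and SQL injection. The table layout and sample rows are assumptions for illustration.

```python
import sqlite3

# ":memory:" creates a temporary database that lives only in RAM
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")

# Insert several rows using ? placeholders
people = [("John", 25), ("Asha", 31)]
conn.executemany("INSERT INTO users VALUES (?, ?)", people)
conn.commit()

# Query with a parameter; the parameters are passed as a tuple
rows = conn.execute(
    "SELECT name, age FROM users WHERE age > ? ORDER BY name", (20,)
).fetchall()

conn.close()
print(rows)
```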

The mysql-connector Module

The third-party mysql-connector-python package, imported as mysql.connector, allows accessing a MySQL database.

import mysql.connector

# Connect to MySQL database
conn = mysql.connector.connect(
  host="localhost",
  user="root",
  password="password",
  database="sales"
)

# Query database
cursor = conn.cursor()
cursor.execute("SELECT * FROM customer")

# Print data
for row in cursor:
  print(row)

# Close connection
conn.close()

This makes interaction with MySQL servers easy and straightforward.

The psycopg2 Module

The third-party psycopg2 package enables working with PostgreSQL databases.

import psycopg2

# Connect to PostgreSQL
conn = psycopg2.connect(
  host="localhost",
  database="testdb",
  user="postgres",
  password="secret"
)

# Execute query
cursor = conn.cursor()
cursor.execute("SELECT version()")

# Fetch row
data = cursor.fetchone()
print(data)

# Close connection
conn.close()

This provides an efficient way to connect to Postgres from Python.

Web Programming Modules

The standard library includes several modules that help with web-related tasks.

The cgi Module

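The cgi module supports writing Python scripts that a web server runs through the Common Gateway Interface. It was deprecated in Python 3.11 and removed in Python 3.13, so the example below only runs on older versions.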

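import cgi
import html

# Get form data, falling back to a default when the field is missing
form = cgi.FieldStorage()
name = form.getvalue('name', 'World')

print("Content-Type: text/html")
print()
# Escape user input before embedding it in HTML
print("<h1> Hello " + html.escape(name) + "</h1>")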

This allows processing and responding to HTTP requests from web forms.

The webbrowser Module

The webbrowser module provides functionality to open web browsers from Python.

import webbrowser

# Open browser to Google
webbrowser.open('https://google.com')

# Open multiple pages
webbrowser.open_new_tab('https://docs.python.org')
webbrowser.open_new('https://github.com')

This makes it possible to open web pages programmatically from Python scripts.

The urllib.parse Module

The urllib.parse module provides handy functions to handle URLs.

from urllib.parse import urlparse

url = 'http://netloc/path;parameters?query=args#fragment'

# Parse URL components
result = urlparse(url)
print(result)

# Combine components into URL
from urllib.parse import urlunparse
data = ['http','netloc','path','parameters','query','fragment']
print(urlunparse(data))

This simplifies dissecting and constructing URLs when working with web apps.
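Query strings and relative links are handled by the same module. The following sketch uses made-up example.com URLs to show parse_qs, urlencode and urljoin.

```python
from urllib.parse import urlparse, parse_qs, urlencode, urljoin

parts = urlparse("https://example.com/search?q=python&page=2#results")

# parse_qs returns a dict that maps each key to a list of values
query = parse_qs(parts.query)

# urlencode builds a query string and escapes special characters
encoded = urlencode({"q": "standard library", "page": 3})

# urljoin resolves a relative link against a base URL
joined = urljoin("https://example.com/docs/index.html", "tutorial.html")

print(parts.netloc, query, encoded, joined)
```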

Concurrency and Parallelism

Python provides constructs to implement concurrent and parallel execution of code.

The threading Module

The threading module supports spawning multiple threads to run code concurrently within a process.

import threading

# Thread class
class PrintThread(threading.Thread):
  def run(self):
    print(threading.current_thread().name)

# Create threads
t1 = PrintThread(name="Thread-1")
t2 = PrintThread(name="Thread-2")

# Start threads
t1.start()
t2.start()

# Wait for completion
t1.join()
t2.join()

This allows basic concurrent operations using threads.

The multiprocessing Module

The multiprocessing module provides API similar to threading but with multiple processes instead of threads.

from multiprocessing import Process, current_process

# Process class
class PrintProcess(Process):
  def run(self):
    print(current_process().name)

# The guard is required on platforms that start child processes by spawning
if __name__ == "__main__":
  # Create processes
  p1 = PrintProcess(name="Process-1")
  p2 = PrintProcess(name="Process-2")

  # Run processes
  p1.start()
  p2.start()

  # Wait to finish
  p1.join()
  p2.join()

Because each process runs its own interpreter, this lets CPU-bound work run in parallel across multiple cores.

The concurrent.futures Module

The concurrent.futures module provides high level abstractions like thread pools and process pools for concurrent operations.

from concurrent.futures import ThreadPoolExecutor

# Function to be executed
def print_thread(name):
  print(name)

# Create thread pool
pool = ThreadPoolExecutor(max_workers=2)

# Submit task to pool
pool.submit(print_thread, name="Thread1")
pool.submit(print_thread, name="Thread2")

# Shut down pool
pool.shutdown()

This module offers a clean interface for managing concurrency via pools of threads and processes.
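Tasks submitted to a pool usually return values. As a short sketch (the square function is a made-up example task), submit returns a Future whose result can be collected later, map applies a function to many inputs in order, and the with statement shuts the pool down automatically:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical task used for illustration
def square(n):
    return n * n

# Leaving the with block waits for pending tasks and shuts the pool down
with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(square, 7)              # schedule a single call
    squares = list(pool.map(square, range(5)))   # results come back in input order
    single = future.result()                     # block until the call has finished

print(single, squares)
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor keeps the same interface, although process pools need the `if __name__ == "__main__":` guard on platforms that spawn child processes.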

Interprocess Communication

Python has modules that allow separate processes to communicate with each other.

The multiprocessing.managers Module

The multiprocessing.managers module supports sharing objects between processes to coordinate their actions.

from multiprocessing import Process, Manager

# Processes can access shared object
def update_dict(d):
  d['key'] = 'value'

if __name__ == "__main__":
  # Manager to create shared objects. It is created inside the guard so that
  # child processes do not start a manager of their own on import.
  with Manager() as manager:
    shared_dict = manager.dict()

    p = Process(target=update_dict, args=(shared_dict,))
    p.start()
    p.join()

    # Print updated value
    print(shared_dict['key'])

This enables simple synchronization between processes.

The multiprocessing.queues Module

The multiprocessing.queues module provides queue data structures to pass messages safely between processes.

from multiprocessing import Process, Queue

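# Function to handle queue; it stops when it receives the None sentinel
def handle_queue(q):
  while True:
    msg = q.get()
    if msg is None:
      break
    print(msg)

# Main process
if __name__ == "__main__":
  q = Queue()
  p = Process(target=handle_queue, args=(q,))

  p.start()
  q.put('Hello')
  q.put('World')
  q.put(None)  # tell the worker to exit, otherwise join() would wait forever

  # Wait for process
  p.join()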

Queues offer reliable communication between processes executing independently.

The socket Module

The socket module provides access to the BSD socket interface, which lets processes communicate on the same machine or across a network.

import socket

# Create TCP socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Bind and listen for connections
s.bind(('localhost', 5000))
s.listen(1)

while True:
  # Wait for client connection
  conn, addr = s.accept()

  # Read message
  data = conn.recv(1024)
  print('Received:', data.decode())

  # Send reply
  conn.send(b'Acknowledged')

  # Close connection
  conn.close()

Sockets allow bidirectional client-server communication between processes.

Graphical User Interface Modules

Python supports creating desktop GUIs using bindings to popular cross-platform widget toolkits. Tkinter ships with the standard library, while Qt bindings such as PyQt are installed separately.

Tkinter Module

The tkinter module is Python’s default GUI framework that wraps Tcl/Tk widgets.

import tkinter as tk

# Root window
root = tk.Tk()

# Widgets
btn = tk.Button(root, text="Click Me")
lbl = tk.Label(root, text="Hello GUI")

# Layout
btn.pack()
lbl.pack()

# Start GUI loop
root.mainloop()

Tkinter provides a simple and intuitive way to build GUI apps that run on Windows, macOS and Linux.

PyQt Module

PyQt is a third-party package that provides Python bindings for the Qt application framework and tools for building UIs.

from PyQt5.QtWidgets import QApplication, QLabel

app = QApplication([])
label = QLabel('Hello PyQt')

label.show()
app.exec_()

Qt based tools like PyQt and PySide are popular choices for building responsive UIs.

Testing Modules

Python comes with a few modules to support writing tests for programs.

The unittest Module

The unittest module provides a unit testing framework to implement test cases and test suites.

import unittest

# Test case
class TestStringMethods(unittest.TestCase):

  def test_upper(self):
    self.assertEqual('abc'.upper(), 'ABC')

  def test_lower(self):
    self.assertEqual('ABC'.lower(), 'abc')

if __name__ == '__main__':
  unittest.main()

This allows automated testing during development with support for test fixtures, assertions and mocking.

The doctest Module

The doctest module allows creating test cases by extracting snippets from documentation strings in source code.

def add(a, b):
  """
  >>> add(2, 3)
  5

  >>> add(5, 7)
  12
  """

  return a + b

if __name__ == "__main__":
  import doctest
  doctest.testmod()

Doctests ensure code examples in documentation remain valid and up to date.

This covers some of the most useful modules available in Python’s standard library, along with a few popular third-party packages that are often used next to it. The standard library has many more modules for areas such as networking and container data types, while web frameworks and scientific computing are served by third-party packages. Check the Python documentation for a complete list of standard library modules.

The standard library is what makes Python a "batteries included" language, with reusable modules for quickly building applications, scripts, frameworks and tools. Learning to use it effectively can boost productivity and reduce reinvention of the wheel.