File input/output (I/O) is an essential aspect of programming in any language. Python provides many useful built-in functions and methods to handle files and streamline common file operations. Mastering file I/O skills allows developers to read and write files for data processing, storage, and analysis.
This guide offers practical exercises and examples to help Python programmers of all skill levels gain hands-on experience with file handling. We will cover key concepts like reading and writing files, handling file-related exceptions, parsing and processing text file data, creating data logs, and more. Real-world applications of file I/O in data science, machine learning, and software development are also discussed.
Reading and Writing Files in Python
The core Python library provides several functions to work with files. Let’s look at some practical examples of reading and writing files using Python.
Opening and Closing Files
To start file operations in Python, we first need to open the file. This is done using the built-in open() function.
file = open('data.txt', 'r')
The open() function takes the file path and mode as parameters. Here, ‘r’ mode opens the file for reading. Other common modes include ‘w’ for writing (which truncates an existing file), ‘a’ for appending, and ‘r+’ for reading and writing.
It’s good practice to close the file once operations are completed using the close() method.
file.close()
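If an error occurs between open() and close(), the close() call is never reached. One way to guard against that is a try/finally block. The following is a minimal sketch, assuming a throwaway file in a temporary directory (the path and contents are illustrative only):

```python
import os
import tempfile

# Illustrative setup: work inside a temporary directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'data.txt')

file = open(path, 'w')
try:
    file.write('hello\n')
finally:
    # Runs whether or not write() raised, so the file is always closed.
    file.close()

print(file.closed)  # True
```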
Reading File Contents
There are several methods available to read a file’s contents:
- read() - Returns the entire contents as a string
- readline() - Reads a single line at a time
- readlines() - Returns a list of lines in the file
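file = open('data.txt', 'r')

text = file.read()        # read entire file
print(text)

file.seek(0)              # rewind: read() left the position at the end
line = file.readline()    # read first line
print(line)

file.seek(0)              # rewind again before reading all lines
lines = file.readlines()  # read all lines into list
for line in lines:
    print(line, end='')   # each line already ends with '\n'

file.close()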
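read() also accepts an optional size argument, which is useful when a file is too large to load at once. The sketch below reads a small sample file in fixed-size chunks; the 8-character chunk size and the file contents are arbitrary choices for illustration:

```python
import os
import tempfile

# Illustrative setup: a small sample file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'data.txt')
file = open(path, 'w')
file.write('first\nsecond\nthird\n')
file.close()

file = open(path, 'r')
chunks = []
while True:
    chunk = file.read(8)   # read at most 8 characters
    if not chunk:          # an empty string signals end of file
        break
    chunks.append(chunk)
file.close()

print(chunks)  # ['first\nse', 'cond\nthi', 'rd\n']
```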
Writing to Files
To write to files, we need to open them in write ‘w’, append ‘a’ or read/write ‘r+’ mode.
- write() - Writes a string to the file
- writelines() - Writes a list of strings to the file (without adding line separators)
file = open('data.txt','w')
file.write('This is some text\n')  # end the line so the next write starts on a new one
lines = ['First line\n','Second line\n']
file.writelines(lines)
file.close()
This overwrites the existing file contents. To append instead, open the file with ‘a’ mode.
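The difference between the two modes can be seen in a short experiment. This sketch assumes a scratch file in a temporary directory; the file name and contents are made up for the example:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

file = open(path, 'w')   # creates the file
file.write('one\n')
file.close()

file = open(path, 'w')   # 'w' truncates: 'one' is discarded
file.write('two\n')
file.close()

file = open(path, 'a')   # 'a' keeps existing content and adds to the end
file.write('three\n')
file.close()

file = open(path)
content = file.read()
file.close()

print(content)
```

After these three writes the file holds only the lines 'two' and 'three'.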
With Statement
It is advisable to use the with statement when working with file objects. It closes the file automatically when the block ends, even if an exception is raised inside it.
with open('data.txt', 'r') as file:
    data = file.read()
    # file handling code
# file is automatically closed
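The cleanup also happens when the block exits because of an error. The sketch below forces a ValueError inside the block and then checks the file's closed attribute; the sample file and its contents are assumptions for the demonstration:

```python
import os
import tempfile

# Illustrative setup: the second line cannot be converted to an integer.
path = os.path.join(tempfile.mkdtemp(), 'data.txt')
with open(path, 'w') as setup_file:
    setup_file.write('42\nnot a number\n')

try:
    with open(path) as file:
        for line in file:
            int(line)      # raises ValueError on the second line
except ValueError:
    pass

print(file.closed)  # True: the with block closed the file despite the error
```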
Handling File I/O Exceptions
Like any other operations in Python, file handling is prone to exceptions. Let’s look at some common file I/O exceptions and ways to handle them.
FileNotFoundError
This exception is raised when a file passed to the open() function does not exist at the specified path. We can catch this and print a user-friendly error message:
try:
    file = open('data.txt')
except FileNotFoundError:
    print('Error! File not found.')
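Alternatively, we can check whether the file exists using the os.path module before trying to open it. Note that the file can still disappear between the check and the open() call, so the try/except approach is more robust.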
import os

if os.path.exists('data.txt'):
    file = open('data.txt')
else:
    print('File does not exist.')
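One caveat: os.path.exists() also returns True for directories, so a positive result does not guarantee that open() will succeed. A possible refinement is to use os.path.isfile() and os.path.isdir() instead. In this sketch, describe() is a hypothetical helper and the paths are created only for the example:

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, 'data.txt')
open(file_path, 'w').close()   # create an empty file

def describe(path):
    """Classify a path before attempting to open it (hypothetical helper)."""
    if os.path.isfile(path):
        return 'file'
    if os.path.isdir(path):
        return 'directory'
    return 'missing'

print(describe(file_path))                      # file
print(describe(tmp_dir))                        # directory
print(describe(os.path.join(tmp_dir, 'nope')))  # missing
```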
PermissionError
A PermissionError occurs when trying to access a file without adequate permissions. We can catch and handle it as:
try:
    file = open('protected_data.txt')
except PermissionError:
    print('You do not have permission to access this file')
IsADirectoryError
This is raised when you try to open a directory instead of a file.
try:
    file = open('foldername')
except IsADirectoryError:
    print('You tried to open a directory')
Handling Other Exceptions
We can also generically handle any exceptions using the base Exception class:
try:
    file = open('data.txt')
    # file operations
except Exception as e:
    print('An error occurred: ', e)
This allows us to catch any issues when working with files and print the specific error message.
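The specific exceptions covered above can be combined in one try block, ordered from most specific to most general. All three are subclasses of OSError, so a final OSError clause catches the remaining I/O failures. In this sketch, read_text() is a hypothetical helper that returns a (text, error) pair; on some platforms, such as Windows, opening a directory may surface as PermissionError rather than IsADirectoryError.

```python
import os
import tempfile

def read_text(path):
    """Return (text, error_message); exactly one of them is None (hypothetical helper)."""
    try:
        with open(path) as file:
            return file.read(), None
    except FileNotFoundError:
        return None, 'file not found'
    except IsADirectoryError:
        return None, 'path is a directory'
    except PermissionError:
        return None, 'permission denied'
    except OSError as e:
        # Any other I/O failure, e.g. a device error.
        return None, f'other I/O error: {e}'

# Illustrative usage with a temporary directory.
tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, 'data.txt')
with open(good_path, 'w') as file:
    file.write('hello')

print(read_text(good_path))                         # ('hello', None)
print(read_text(os.path.join(tmp_dir, 'missing')))  # (None, 'file not found')
```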
Parsing and Processing Data from Text Files
Text files like CSV, TSV, JSON, or log files often contain useful structured data. Let’s see how to parse and process such data in Python using some practical exercises.
Reading CSV Files
Comma-separated values (CSV) files store tabular data separated by commas or other delimiters. The csv module provides functionality to parse CSV data.
import csv

# newline='' lets the csv module handle line endings itself
with open('data.csv', newline='') as file:
    csv_reader = csv.reader(file, delimiter=',')
    for row in csv_reader:
        print(row)
This prints each row as a list. We can index columns by name instead of position using a dictionary reader:
import csv

with open('data.csv', newline='') as file:
    csv_reader = csv.DictReader(file)  # uses the first row as column names
    for row in csv_reader:
        print(row['Name'], row['Email'])
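Every value produced by the csv module is a string, so numeric columns need an explicit conversion before any arithmetic. The sketch below uses an in-memory io.StringIO object in place of a real file; the rows and the Age column are invented for the example:

```python
import csv
import io

# Stand-in for a real CSV file; the data is made up.
sample = io.StringIO(
    'Name,Email,Age\n'
    'John,john@example.com,20\n'
    'Mark,mark@example.com,25\n'
)

reader = csv.DictReader(sample)
rows = list(reader)

ages = [int(row['Age']) for row in rows]   # values arrive as strings
average_age = sum(ages) / len(ages)

print(reader.fieldnames)  # ['Name', 'Email', 'Age']
print(average_age)        # 22.5
```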
Processing JSON Data
JSON (JavaScript Object Notation) is a common lightweight data format. Let’s see how to parse a JSON file using the json module.
import json

with open('data.json') as file:
    data = json.load(file)

print(data['users'][0]['name'])  # access nested data
We can also convert a Python dict to JSON and write it to a file.
import json

data = {
    'name': 'John',
    'age': 30
}

with open('data.json', 'w') as file:
    json.dump(data, file)
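A round trip is a quick way to confirm that the data survives writing and reading, and malformed input can be handled by catching json.JSONDecodeError. The following is a sketch under assumed sample data; parse_or_none() is a hypothetical helper, and the indent argument is optional:

```python
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.json')
data = {'name': 'John', 'age': 30, 'languages': ['Python', 'SQL']}

with open(path, 'w') as file:
    json.dump(data, file, indent=2)   # indent makes the file human-readable

with open(path) as file:
    loaded = json.load(file)

def parse_or_none(text):
    """Return parsed JSON, or None if the text is malformed (hypothetical helper)."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(loaded == data)              # True
print(parse_or_none('{"name": '))  # None
```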
Extracting Data from Log Files
Log files store timestamped application messages. They can be parsed using regular expressions to extract relevant data.
import re
import datetime

with open('app.log') as file:
    for line in file:
        # Raw string so that \[ and \] reach the regex engine unchanged
        match = re.search(r'\[(.+?)\] - (.*)', line)
        if match:
            timestamp = datetime.datetime.strptime(match.group(1), '%Y-%m-%d %H:%M:%S')
            print(timestamp, '- ', match.group(2))
This parses each log line to extract the timestamp and message data.
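Real log files often contain lines that do not match the expected pattern or carry an unparseable timestamp. One possible way to make the parser tolerant is to wrap it in a function that returns None for such lines. In this sketch, parse_log_line() is a hypothetical helper and the sample lines are invented:

```python
import re
from datetime import datetime

# Same pattern as above, compiled once for reuse.
LOG_PATTERN = re.compile(r'\[(.+?)\] - (.*)')

def parse_log_line(line):
    """Return (timestamp, message), or None if the line cannot be parsed."""
    match = LOG_PATTERN.search(line)
    if not match:
        return None
    try:
        timestamp = datetime.strptime(match.group(1), '%Y-%m-%d %H:%M:%S')
    except ValueError:
        return None   # brackets were present but did not hold a valid timestamp
    return timestamp, match.group(2)

sample_lines = [
    '[2024-01-15 10:30:00] - Server started',
    'malformed line without a timestamp',
    '[not a date] - Bad timestamp',
]
parsed = [parse_log_line(line) for line in sample_lines]
print(parsed[0])  # (datetime.datetime(2024, 1, 15, 10, 30), 'Server started')
```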
Processing Text Files
For simple text processing, we can read line by line and use string methods:
with open('data.txt') as file:
    for line in file:
        line = line.strip()
        if 'error' in line.lower():
            print(line)
This strips leading and trailing whitespace from each line and prints the lines that contain ‘error’ in any letter case.
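The same line-by-line approach extends to simple statistics such as word counts. The sketch below uses collections.Counter on an in-memory sample; the text and the punctuation stripped from each word are assumptions for the example:

```python
import io
from collections import Counter

# Stand-in for a real text file; the content is made up.
sample = io.StringIO(
    'Error: disk full\n'
    'all good\n'
    'ERROR: disk full again\n'
)

counts = Counter()
for line in sample:
    for word in line.lower().split():
        counts[word.strip(':,.')] += 1   # drop trailing punctuation

print(counts['error'])  # 2
print(counts['disk'])   # 2
```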
Writing Data to Files
Let’s look at some examples of writing data to files in different formats.
Writing to CSV
We can use the csv module to write dictionaries to a CSV file:
import csv

data = [
    {'name': 'John', 'age': 20},
    {'name': 'Mark', 'age': 25}
]

# newline='' prevents blank rows between records on Windows
with open('output.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=['name', 'age'])
    writer.writeheader()
    writer.writerows(data)
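Dictionaries do not always have exactly the keys listed in fieldnames. DictWriter accepts a restval argument to fill in missing keys and an extrasaction argument to control what happens with unexpected ones. The sketch below writes to an in-memory buffer instead of a file; the rows and the 'unknown' placeholder are illustrative choices:

```python
import csv
import io

rows = [
    {'name': 'John', 'age': 20},
    {'name': 'Mark'},                            # 'age' is missing
    {'name': 'Ann', 'age': 31, 'city': 'Oslo'},  # 'city' is not a known field
]

buffer = io.StringIO()   # stand-in for a file opened with newline=''
writer = csv.DictWriter(
    buffer,
    fieldnames=['name', 'age'],
    restval='unknown',        # value written for missing keys
    extrasaction='ignore',    # silently drop keys not in fieldnames
)
writer.writeheader()
writer.writerows(rows)

output_lines = buffer.getvalue().splitlines()
print(output_lines)  # ['name,age', 'John,20', 'Mark,unknown', 'Ann,31']
```

With the default extrasaction='raise', the third row would raise a ValueError instead.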
Writing JSON Data
To write Python data structures as JSON we use json.dump():
import json

data = {
    'users': [
        {'name': 'John', 'age': 20},
        {'name': 'Mark', 'age': 25}
    ]
}

with open('data.json', 'w') as file:
    json.dump(data, file)
Creating Log Files
We can create a simple log file by appending timestamped messages:
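from datetime import datetime

with open('logs.txt', 'a') as file:
    # Format the timestamp without microseconds so it can be parsed
    # back with '%Y-%m-%d %H:%M:%S'
    message = f'[{datetime.now():%Y-%m-%d %H:%M:%S}] - Info message'
    file.write(message + '\n')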
This will create a basic log file with timestamp and messages.
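For repeated use, the append logic can be wrapped in a small function. In this sketch, append_log() and its level parameter are hypothetical; the timestamp format is chosen to match the '%Y-%m-%d %H:%M:%S' pattern used in the log-parsing example earlier in this guide:

```python
import os
import tempfile
from datetime import datetime

def append_log(path, level, message):
    """Append one timestamped line to the log file (hypothetical helper)."""
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    with open(path, 'a') as file:
        file.write(f'[{timestamp}] - {level}: {message}\n')

# Illustrative usage with a log file in a temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), 'logs.txt')
append_log(log_path, 'INFO', 'Service started')
append_log(log_path, 'ERROR', 'Connection lost')

with open(log_path) as file:
    log_lines = file.read().splitlines()

print(len(log_lines))  # 2
```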
File I/O Applications and Use Cases
There are many real-world applications and use cases where file handling in Python provides an efficient solution.
Data Analysis Pipeline
In a typical data science pipeline, data is ingested from sources such as CSV files, JSON files, or databases, then processed and modeled, and the results are exported to files for reporting. Python file I/O supports data access at each stage.
Log Processing
Server, application and device logs in text formats can be parsed in Python for monitoring, auditing and debugging. Log data helps identify issues and improves analytics.
Data Pipelines
For production data pipelines, file I/O allows batch processing of large datasets for ETL, migration or integration across databases, data warehouses and other systems.
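Batch processing usually means reading a bounded number of records at a time rather than loading the whole file. One way to sketch this is a generator built on itertools.islice; read_batches() is a hypothetical helper, and the in-memory sample stands in for a large file:

```python
import io
from itertools import islice

def read_batches(file, batch_size):
    """Yield lists of up to batch_size lines, newline removed (hypothetical helper)."""
    while True:
        # islice pulls the next batch_size lines from the file iterator.
        batch = [line.rstrip('\n') for line in islice(file, batch_size)]
        if not batch:
            return
        yield batch

# Stand-in for a large file: five made-up records.
sample = io.StringIO(''.join(f'record {i}\n' for i in range(5)))
batches = list(read_batches(sample, 2))

print(batches)
# [['record 0', 'record 1'], ['record 2', 'record 3'], ['record 4']]
```

Because only one batch is held in memory at a time, the same loop works on a file object returned by open().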
Configuration Files
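Application configuration data such as credentials, URLs, and options is often stored in JSON, YAML, or INI files. Python’s standard library reads JSON and INI files directly through the json and configparser modules; YAML typically requires a third-party package.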
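As a small illustration for the INI case, the standard-library configparser module can parse sections and typed values. The section names, keys, and values below are invented for the example, and read_string() is used so the sketch needs no file on disk (config.read() takes a file path instead):

```python
import configparser

# Made-up configuration text; a real script would typically call
# config.read() with the path to its own INI file.
ini_text = """
[database]
host = localhost
port = 5432

[app]
debug = true
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

host = config['database']['host']          # values are strings by default
port = config.getint('database', 'port')   # typed accessor for integers
debug = config.getboolean('app', 'debug')  # accepts true/false, yes/no, on/off, 1/0

print(host, port, debug)  # localhost 5432 True
```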
Automation Scripts
Sysadmin tasks such as network configuration, server monitoring, and database administration can be automated using Python scripts that read and update system text files.
Testing and Simulation
File I/O enables creating test data, injecting it into systems and validating results by reading output files. This assists in testing real-world scenarios.
Backup and Archival
Large amounts of log data or database backups can be compressed and archived to files for long-term storage using Python’s file handling capabilities.
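A compressed archive of a single file can be produced with the standard-library gzip and shutil modules. The sketch below assumes a made-up log file in a temporary directory and copies it into a .gz archive without loading it fully into memory:

```python
import gzip
import os
import shutil
import tempfile

# Illustrative setup: a repetitive log file that compresses well.
tmp_dir = tempfile.mkdtemp()
source = os.path.join(tmp_dir, 'app.log')
archive = source + '.gz'
with open(source, 'w') as file:
    file.write('log line\n' * 1000)

# Stream the bytes from the source file into the gzip archive.
with open(source, 'rb') as src, gzip.open(archive, 'wb') as dst:
    shutil.copyfileobj(src, dst)

# Read the archive back in text mode to confirm the contents.
with gzip.open(archive, 'rt') as file:
    restored = file.read()

print(os.path.getsize(archive) < os.path.getsize(source))  # True
```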
Content Creation
Structured content stored in Markdown, HTML, JSON and other text formats can be dynamically generated using Python scripts for blogs, documentation sites and more.
By mastering Python file I/O skills through hands-on exercises, developers can build robust data-driven solutions for the diverse real-world use cases discussed above.
Summary
Effective file handling is indispensable for any Python programmer. This guide provided a collection of practical exercises to help you learn key file I/O concepts like:
- Opening, reading and writing files using Python core functions
- Handling common file-related exceptions
- Parsing and processing structured data from CSV, JSON and text files
- Writing data to files in different formats like CSV and JSON
- Creating log files and extracting data from them
- Real-world applications and use cases of file I/O like data pipelines, testing, automation, content creation and more
These skills allow you to read, extract, transform and store data from a variety of file formats. Mastering file I/O opens up many possibilities for building data-driven solutions using Python. The exercises covered serve as a solid foundation to apply file handling techniques to tackle real-world problems.