An interpreted language like Python differs from a compiled language in how it is executed by the computer. This guide will provide a comprehensive overview of interpreted languages, with a focus on Python. We will compare and contrast compiled vs. interpreted languages, explain how the Python interpreter works, dive into the pros and cons of interpreted languages, and provide example code snippets along the way.
Introduction
Python is considered a high-level, general-purpose programming language that is interpreted, dynamically typed, and multi-paradigm. But what exactly does it mean for Python to be an interpreted language?
In short, an interpreted language like Python runs source code through an interpreter and does not require a separate build step that produces a machine-code executable. The standard Python interpreter first compiles a source file to bytecode and then executes that bytecode on a virtual machine, all within a single command. This differs from compiled languages like C, C++, or Rust, whose source must be compiled ahead of time into machine code before it can run.
Understanding how interpreted languages work provides deeper insight into Python as a language. Let’s explore key differences between compiled and interpreted languages, examine how the Python interpreter functions, weigh pros and cons, and illustrate with code examples.
Compiled vs. Interpreted Languages
The main difference between compiled and interpreted languages is when the translation to machine code happens relative to execution.
Compiled Languages
In compiled languages like C, C++, Rust, Go, and Swift, the source code is passed through a compiler first. The compiler translates the high-level source code into low-level machine code all at once and outputs executable files.
These pre-compiled executable files can then be run on any machine with a matching CPU architecture and operating system. The compilation step only needs to happen once per target platform.
For example, consider this simple “hello world” program in C:
// helloworld.c
#include <stdio.h>

int main() {
    printf("Hello World!");
    return 0;
}
To run this C program, we first use a compiler like gcc to compile the helloworld.c file:
$ gcc helloworld.c -o helloworld
This compiles the source code into a binary executable file helloworld. We can then execute the compiled program anytime:
$ ./helloworld
Hello World!
The compilation step produces a standalone executable that can be run on any compatible machine without the compiler or the source code.
Interpreted Languages
In interpreted languages like Python, JavaScript, Ruby, and PHP, the source code is passed to an interpreter instead of a compiler.
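The interpreter reads the source code and executes it directly. There is no separate compilation step for the developer to run, and no machine-code executable is produced. Many interpreters, including Python’s, translate the source into an intermediate bytecode internally before executing it.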
For example, let’s look at an equivalent “hello world” program in Python:
# helloworld.py
print("Hello World!")
We can execute this Python code right away using the python interpreter without any compilation:
$ python helloworld.py
Hello World!
Behind the scenes, the interpreter compiles the file into bytecode and immediately executes it. Because this happens automatically on every run, the development process is faster and more flexible: we don’t need a separate build step before testing small changes.
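A small experiment can show when the translation to bytecode happens. In this minimal sketch, the two-line program and the `<demo>` filename are made up for illustration, while `compile` and `exec` are Python built-ins. A syntax error on the second line prevents the first line from printing, which suggests that CPython compiles the whole source before executing any of it.

```python
# Hypothetical two-line program: line 1 is valid, line 2 has a syntax error.
source = 'print("first line ran")\nsecond = = 2\n'

ran = False
try:
    # compile() translates the entire source to bytecode up front.
    code_obj = compile(source, "<demo>", "exec")
    exec(code_obj)
    ran = True
except SyntaxError as err:
    # Raised during compilation, so the first print() never executes.
    print(f"SyntaxError on line {err.lineno}; nothing was executed")
```

Running a file with `python` behaves the same way: a syntax error anywhere in the file is reported before any of its statements run.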
The tradeoff is that users need a compatible interpreter installed in order to run the distributed source code. By default there are no standalone executable binaries, as there are with compiled languages.
How the Python Interpreter Works
When you run a Python program, the source code is fed into the Python interpreter. The interpreter then performs several steps:
- Parsing: The interpreter breaks the source code into tokens and builds an abstract syntax tree (AST) from them. Comments and insignificant whitespace are discarded; indentation is kept because it defines block structure.
- Compilation: The interpreter compiles the AST into Python bytecode, which is a lower-level, platform-independent representation.
- Execution: The Python virtual machine (PVM) executes the bytecode one instruction at a time. As it encounters definitions, it stores them in memory for later use.
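The steps above can be approximated from within Python itself. This is only a sketch: CPython performs these stages internally in C, and the standard-library `tokenize` and `ast` modules are used here as stand-ins. The one-line program and the `<sketch>` filename are invented for the example.

```python
import ast
import io
import tokenize

source = "total = 2 + 3\n"

# 1. Tokenize: split the source text into tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
token_strings = [tok.string for tok in tokens if tok.string.strip()]

# 2. Parse: build an abstract syntax tree from the source.
tree = ast.parse(source, filename="<sketch>")

# 3. Compile: turn the tree into a code object holding bytecode.
code_obj = compile(tree, filename="<sketch>", mode="exec")

# 4. Execute: run the bytecode in a fresh namespace.
namespace = {}
exec(code_obj, namespace)

print(token_strings)                                  # ['total', '=', '2', '+', '3']
print(type(tree).__name__, type(code_obj).__name__)   # Module code
print(namespace["total"])                             # 5
```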
To illustrate, let’s walk through a simple Python program:
# multiply.py
def multiply(a, b):
    return a * b

print(multiply(3, 5))
When executing this code, the interpreter will:
- Parse the source code into tokens like def, multiply, (, ), return, etc., and build a syntax tree from them.
- Compile the syntax tree into Python bytecode, which might look something like this (exact opcode names vary between Python versions):

# body of multiply
LOAD_FAST        # load first arg 'a'
LOAD_FAST        # load second arg 'b'
BINARY_MULTIPLY  # multiply top two stack items
RETURN_VALUE     # return value from function

# module level
LOAD_CONST       # load the compiled body of 'multiply'
MAKE_FUNCTION    # create the function object
STORE_NAME       # bind it to the name 'multiply'
LOAD_NAME        # load builtin 'print'
LOAD_NAME        # load name 'multiply'
LOAD_CONST       # load constant 3
LOAD_CONST       # load constant 5
CALL_FUNCTION    # call multiply with two args
CALL_FUNCTION    # call print with one arg
POP_TOP          # discard the return value of print

- The PVM will then execute these bytecode instructions to print the result 15.
The key takeaway is that Python compiles source code to bytecode automatically at run time and then interprets that bytecode, so no separate compilation step is needed beforehand.
Pros and Cons of Interpreted Languages
Interpreted languages like Python provide advantages as well as disadvantages compared to compiled languages:
Pros:
- No separate compilation step allows for very fast development cycles. Code can be tested immediately after changes.
- Source code is platform-independent. The same Python program can run on Windows, Mac, Linux, etc., wherever a Python interpreter is available.
- Interpreted code is highly portable since it does not rely on specific machine instructions.
- Running code through an interpreter makes debugging easier and allows interactive testing.
- The interpreter sits between the program and the hardware, so it can report many errors, such as out-of-range indexing, as exceptions instead of letting them corrupt memory.
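The interactive testing mentioned above can be sketched with the standard-library `code` module, which exposes the machinery behind the interactive prompt. The `session` dictionary and the statements fed in are made up for this example.

```python
import code

# A fresh namespace stands in for an interactive session's variables.
session = {}
interp = code.InteractiveInterpreter(locals=session)

# A complete statement is compiled and run immediately;
# runsource() returns False because no more input is needed.
needs_more = interp.runsource("answer = 6 * 7")
print(needs_more, session["answer"])

# An unfinished block is not run; runsource() returns True
# to signal that the prompt should ask for another line.
needs_more_def = interp.runsource("def half_written():")
print(needs_more_def)
```

Each statement takes effect as soon as it is complete, which is what makes it practical to try out changes one step at a time.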
Cons:
- Interpretation adds runtime overhead compared to compiled machine code, which runs directly on the CPU.
- Interpreted programs are generally slower than compiled equivalents, especially for math/compute-heavy code.
- Execution always requires the interpreter, unlike standalone executable binaries from compiled code.
- Interpreters for dynamically typed languages must do work at run time, such as type checking, that statically typed compiled languages resolve at compile time.
- Performance depends heavily on the quality and optimization of the interpreter itself.
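The point about runtime type checking can be illustrated with a short sketch. The function name `add_label` is hypothetical; the behavior shown is standard Python.

```python
def add_label(count):
    # Mixing int and str is only detected when this line actually runs.
    return count + " items"

# Defining the function succeeds; no type check happens at this point.
print("function defined without error")

caught = None
try:
    add_label(3)
except TypeError as err:
    # The interpreter checks operand types at run time.
    caught = type(err).__name__
    print(f"caught at run time: {caught}")
```

A statically typed compiled language would typically reject the equivalent code at compile time, before the program could start.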
Whether compiled or interpreted is better depends on the specific use case, performance requirements, and tradeoffs needed. Python makes an excellent choice for an interpreted, high-productivity general purpose language.
Python Interpreter Implementations
Two widely used Python interpreter implementations are:
- CPython - The reference implementation, used by default on Windows, Mac, Linux, etc. It is written in C and is the most compatible with Python packages.
- PyPy - An alternative interpreter written in RPython, a restricted subset of Python. It features a JIT compiler for faster execution speed at the cost of some compatibility issues.
Most Python users will interact with CPython, the standard implementation. But PyPy illustrates that even an interpreted language can utilize some compilation techniques like JIT to boost performance.
Examples and Practical Applications
Let’s now look at some code examples to see how we can apply concepts covered in this guide.
Checking Interpreter Version
We can check which Python interpreter is being used and its version:
import sys
print(sys.version)
print(sys.executable)
Example output:
3.8.2 (default, Feb 24 2020, 21:24:31)
[GCC 9.2.1 20191025]
/usr/local/bin/python3
This indicates we are running Python 3.8.2 at the specified executable path. The GCC build string is typical of CPython, whereas PyPy includes its own name in sys.version.
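To identify the implementation explicitly instead of inferring it from the version string, the standard library offers two options. This is a small sketch; the variable names are chosen for illustration.

```python
import platform
import sys

# Name of the running implementation, e.g. "CPython" or "PyPy".
impl_name = platform.python_implementation()

# The same information in lowercase, plus the implementation's own version.
impl = sys.implementation
print(impl_name, impl.name)
print(".".join(str(part) for part in impl.version[:3]))
```

On CPython the implementation version matches the language version, while on PyPy the two generally differ because PyPy has its own release numbering.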
Interpreter-Specific Features
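Some code depends on the platform or on a particular interpreter instead of on the Python language itself. For example, ctypes can call functions in a shared C library:
import ctypes
# Load the C standard library (this file name is specific to Linux with glibc)
libc = ctypes.CDLL('libc.so.6')
# Call the C function getpid(), which returns the current process ID
print(libc.getpid())
Note that getpid belongs to the C library, not to CPython’s C API, and ctypes is available on PyPy as well, so this snippet is platform-specific but not CPython-specific. The code that is truly tied to CPython is C extension modules written against its C API; PyPy supports these only through a compatibility layer, which is one source of the compatibility issues mentioned earlier.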
Bytecode Inspection
We can inspect the bytecode generated by the interpreter using the dis module:
import dis
def multiply(a, b):
    return a * b
print(multiply(6, 7))
dis.dis(multiply)
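This prints 42, followed by the bytecode instructions compiled from our function. The listing below matches older releases such as Python 3.8; newer versions use different opcode names, for example BINARY_OP in place of BINARY_MULTIPLY since Python 3.11:
  2           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 BINARY_MULTIPLY
              6 RETURN_VALUE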
Inspecting bytecode can help better understand what the Python interpreter is doing under the hood.
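Beyond the disassembly, the compiled function carries a code object whose attributes can be inspected directly. The sketch below reuses the `multiply` function from above; the exact opcode names it prints depend on the Python version, so no output is shown for that line.

```python
import dis

def multiply(a, b):
    return a * b

code_obj = multiply.__code__

print(code_obj.co_name)        # function name stored in the code object
print(code_obj.co_varnames)    # local variable names, arguments first
print(code_obj.co_argcount)    # number of positional arguments
print(type(code_obj.co_code))  # the raw bytecode is a bytes object

# Opcode names as a plain list; these vary between Python versions.
opnames = [ins.opname for ins in dis.get_instructions(multiply)]
print(opnames)
```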
Performance Testing
We can time the same snippet under different interpreters to compare performance:
import sys
import timeit
pycode = "a = 1; b = 2; c = a + b"
# Run this script once under each interpreter (for example CPython and PyPy);
# sys.implementation.name reports which one is executing it.
elapsed = timeit.timeit(pycode, number=100000)
print(f"{sys.implementation.name} time: {elapsed}")
On my system, CPython took 0.07s while PyPy only took 0.02s to run 100,000 iterations of the code, which reflects the speed boost from PyPy’s JIT compiler.
Building a Custom Interpreter or Compiler
The ast module lets us parse Python code into an abstract syntax tree (AST). We could use this AST to build our own custom interpreter, code analyzer, optimizer, or even a Python compiler.
import ast
code = """
def fib(n):
    if n <= 1:
        return n
    else:
        return fib(n-1) + fib(n-2)
"""
# Parse into an AST
tree = ast.parse(code)
# Output AST in human readable format
print(ast.dump(tree))
This AST output could be translated into bytecode or machine code by walking through the tree nodes. The sky is the limit for developing custom Python execution tools!
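As a taste of what a custom interpreter built on the AST might look like, here is a minimal sketch that evaluates arithmetic expressions by walking the tree. The `evaluate` function and the `BINARY_OPS` table are hypothetical names for this example, and only numbers and the four basic operators are supported.

```python
import ast
import operator

# Map AST operator node types to the functions that implement them.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate(node):
    """Recursively evaluate a small subset of Python expressions."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        left = evaluate(node.left)
        right = evaluate(node.right)
        return BINARY_OPS[type(node.op)](left, right)
    raise ValueError(f"unsupported syntax: {type(node).__name__}")

tree = ast.parse("2 + 3 * (10 - 4)", mode="eval")
print(evaluate(tree))  # 20
```

Anything outside the supported subset, such as names or function calls, raises ValueError, which keeps the sketch small and makes its limits explicit.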
Conclusion
Interpreted languages offer many advantages for general purpose programming situations where developer productivity and fast iteration are important. Python is an excellent interpreted language choice due to its high-level syntax, dynamic typing, extensive libraries, and vast community support.
Understanding how the Python interpreter works provides deeper insight into the language. We looked at differences between compiled and interpreted programs, broke down the Python interpreter steps, examined pros and cons, inspected bytecode, and measured performance.
While compilation and interpretation have tradeoffs, both are useful paradigms. Python makes an outstanding interpreted language for writing clear, concise, and maintainable code across fields like web development, data science, machine learning, and beyond. Whether building prototypes, processing big data, automating workflows, or deploying to production, Python has cemented itself as a ubiquitous interpreted language.