
A Comprehensive Guide to Contrasting Matplotlib with Other Python Plotting Libraries

Updated: at 03:26 AM

Data visualization is an integral part of data analysis and machine learning. Being able to create meaningful plots and charts from data allows analysts to easily interpret trends, patterns, and relationships in the data. Python has emerged as one of the most popular programming languages for data analysis due to its extensive ecosystem of data science libraries.

When it comes to data visualization and plotting in Python, Matplotlib is undoubtedly the most widely used library. Created by John Hunter in the early 2000s, Matplotlib provides a MATLAB-style plotting framework that enables users to generate publication-quality figures and plots with just a few lines of code. However, over the past decade, several new specialized plotting libraries have been developed as alternatives to Matplotlib, each with its own strengths and weaknesses.

In this comprehensive guide, we will contrast Matplotlib with three of the most popular alternative Python plotting libraries - Seaborn, Plotly, and Bokeh. We will examine the key differences between Matplotlib and these libraries in terms of usage, syntax, features, performance, and use cases. By the end of this guide, you will have a clear understanding of the capabilities of each library and when you may want to use one over the others for your data visualization needs.

Matplotlib Overview

Matplotlib is the grandfather of Python plotting libraries. It provides a comprehensive API for generating a wide variety of 2D plots, charts, and graphs that can be tweaked endlessly to customize the visual output.

Some of the major features of Matplotlib include:

Here is a simple example of creating a line plot with Matplotlib:

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.title('Simple Line Plot')
plt.grid()
plt.show()


This generates a clean, labeled sine wave plot with a grid in just a few lines of code.
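The pyplot calls above act on an implicit "current" figure. Matplotlib also has an object-oriented interface, which is usually where the fine-grained customization happens. The following is a minimal sketch of a similar plot written that way. The Agg backend and the in-memory buffer are assumptions made here so the example runs without a display; they are not part of the original example.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is opened
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)

# Create the figure and axes explicitly instead of relying on pyplot state
fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, np.sin(x), color="tab:blue", linestyle="-", label="sin(x)")
ax.plot(x, np.cos(x), color="tab:orange", linestyle="--", label="cos(x)")

# Each element is customized through a method on the Axes object
ax.set_xlabel("x")
ax.set_ylabel("value")
ax.set_title("Sine and Cosine")
ax.legend(loc="upper right")
ax.grid(True, alpha=0.3)

# Render to an in-memory PNG; pass a filename instead to write to disk
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
png_bytes = buffer.getvalue()
```

Holding explicit `fig` and `ax` objects tends to pay off once a figure has several subplots, because every call states which axes it modifies.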

Seaborn

Seaborn is a statistical data visualization library built on top of Matplotlib. Created by Michael Waskom in 2012, Seaborn provides a high-level API for creating attractive statistical graphics with Python. Some major features of Seaborn include:

Here is an example of a distplot created with Seaborn, showcasing its styling:

import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset('tips')
sns.distplot(tips['total_bill'], kde=False, bins=20)

plt.xlabel('Total Bill')
plt.ylabel('Frequency')
plt.title('Distribution of Total Bill')

plt.show()

Update (05/17/22): The distplot function is deprecated in newer versions of Seaborn, which recommends using either displot or histplot instead. Both offer similar functionality, with slight differences between them. See the Seaborn documentation for more details.

displot is a figure-level function that supersedes distplot and provides access to several approaches for visualizing the univariate or bivariate distribution of data. histplot is an axes-level function for plotting histograms; because it draws onto a single Matplotlib axes, it is easier to customize an individual plot or place it inside a larger figure.

Let’s update our Python code using the histplot function:

import seaborn as sns
import matplotlib.pyplot as plt

tips = sns.load_dataset('tips')
sns.histplot(tips['total_bill'], kde=False, bins=20)

plt.xlabel('Total Bill')
plt.ylabel('Frequency')
plt.title('Distribution of Total Bill')

plt.show()


The Seaborn plotting API is designed to work intuitively with pandas and NumPy data structures like DataFrames and arrays. It allows users to quickly visualize statistical relationships in data.
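To illustrate the DataFrame-oriented style, the sketch below passes a DataFrame through the `data` argument and refers to columns by name. The DataFrame is synthetic and its column names are illustrative assumptions loosely modeled on the tips dataset.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data; column names are illustrative only
rng = np.random.default_rng(1)
n = 100
df = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, size=n),
    "time": rng.choice(["Lunch", "Dinner"], size=n),
})
df["tip"] = 0.15 * df["total_bill"] + rng.normal(0, 1, size=n)

# Columns are referenced by name; hue maps a categorical column to color
ax = sns.scatterplot(data=df, x="total_bill", y="tip", hue="time")
```

Seaborn takes the axis labels and legend entries from the column names and category values, so no manual labeling is needed for a first look at the data.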

Plotly

Plotly is an interactive, browser-based charting library for Python. Some of its major features include:

Here is an example of an interactive scatter plot matrix created with Plotly Express:

import plotly.express as px

df = px.data.iris()
fig = px.scatter_matrix(df, dimensions=["sepal_width", "sepal_length", "petal_width", "petal_length"], color="species")
fig.show()


This generates a matrix of scatter plots visualizing the multivariate iris flower dataset.

Plotly’s interactive charts allow deeper exploration of data relationships.

Bokeh

Bokeh is a Python library for creating interactive data visualizations and dashboards in web browsers. Its key features include:

Here is a simple example of an interactive sine wave plot created with Bokeh:

from bokeh.plotting import figure, output_file, show
import numpy as np

# Write the plot to a standalone HTML file
output_file("sine.html")

# Sample the sine function over two full periods
x = np.arange(-2 * np.pi, 2 * np.pi, 0.1)
y = np.sin(x)

p = figure(title="Sine Wave Example")
p.line(x, y)

# Open the generated HTML file in a browser
show(p)


This generates an interactive plot that can be panned, zoomed, and saved.

Bokeh allows building rich interactive data apps and dashboards for the web.
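The pan, zoom, and save controls come from Bokeh's toolbar, which can be configured per figure. The following is a minimal sketch, assuming a recent Bokeh release; the specific tool selection and the hover tooltip are choices made for this example, not something the original plot used.

```python
import numpy as np
from bokeh.embed import file_html
from bokeh.models import HoverTool
from bokeh.plotting import figure
from bokeh.resources import CDN

x = np.arange(-2 * np.pi, 2 * np.pi, 0.1)
y = np.sin(x)

# Choose the toolbar explicitly rather than relying on the defaults
p = figure(
    title="Sine Wave with Explicit Tools",
    tools="pan,wheel_zoom,box_zoom,reset,save",
)
p.line(x, y, line_width=2)

# Show the cursor's data coordinates ($x and $y are Bokeh special fields)
p.add_tools(HoverTool(tooltips=[("x", "$x"), ("y", "$y")]))

# Build a standalone HTML document as a string instead of opening a browser
html = file_html(p, resources=CDN, title="Sine Wave")
```

Calling `show(p)` instead of `file_html` opens the plot directly, as in the earlier example.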

Key Differences

Now that we have looked at some examples of using Matplotlib, Seaborn, Plotly, and Bokeh, let us examine some of the key differences between these Python data visualization libraries:

1. Syntax and Ease of Use

2. Level of Control

3. Plot Customization

4. Data Structures

5. Visual Aesthetics

6. Plot Types and Functionality

7. Interactivity

8. Large Datasets and Performance

9. Environment and Sharing

When to Use Each Library

Based on their various capabilities, here are some recommendations on when you may want to use Matplotlib, Seaborn, Plotly or Bokeh for your visualization needs:

Of course, you can often use these libraries together to take advantage of their complementary strengths. For example, you may use Matplotlib for initial analysis and prototyping, then switch to Bokeh or Plotly to build interactive web-based visualizations. You can also use Seaborn on top of Matplotlib to improve the styling of statistical plots.
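Because Seaborn draws onto Matplotlib axes, the two mix freely within one figure. A small sketch of that combination follows, assuming Seaborn 0.11 or later (for `set_theme`) and synthetic data invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Seaborn's theme also restyles plain Matplotlib plots
sns.set_theme(style="whitegrid")

# Synthetic data: a noisy linear relationship
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=80)
y = 2.0 * x + rng.normal(0, 3, size=80)

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(10, 4))

# Left: Seaborn draws a scatter plot with a fitted regression line
sns.regplot(x=x, y=y, ax=ax_left)
ax_left.set_title("Seaborn regplot")

# Right: plain Matplotlib, inheriting the Seaborn theme
ax_right.hist(y, bins=15)
ax_right.set_title("Matplotlib hist")

fig.tight_layout()
```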

Knowing the key features and differences between these libraries will allow you to pick the right tool for your data visualization needs. The Python data science ecosystem provides a wealth of options for both exploratory analysis and production-quality graphics.

Example Usage Scenarios

To further illustrate when you may want to use each library, let’s look at some real-world examples and usage scenarios:

Matplotlib Usage

Desmond is a research scientist studying wind turbine data to build predictive maintenance models. For his publications, he needs to create high-quality line and scatter plots showing turbine sensor metrics over time. Matplotlib is a strong choice here because Desmond can fully customize colors, labels, legends, and styling to prepare publication-ready figures.
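A figure for this scenario might be prepared roughly as follows. This is only a sketch: the sensor readings, units, figure size, and style values are all hypothetical placeholders, and a real journal would dictate its own dimensions and fonts.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sensor readings standing in for real turbine data
hours = np.arange(0, 48)
temperature = 60 + 5 * np.sin(hours / 6)
vibration = 0.3 + 0.05 * np.cos(hours / 4)

# rc_context keeps the style overrides local to this figure
style = {"font.size": 9, "axes.linewidth": 0.8, "lines.linewidth": 1.2}
with plt.rc_context(style):
    fig, ax_temp = plt.subplots(figsize=(3.5, 2.5))  # size in inches
    ax_vib = ax_temp.twinx()  # second y-axis sharing the same x-axis

    line_temp, = ax_temp.plot(hours, temperature, color="tab:red",
                              label="Bearing temperature")
    line_vib, = ax_vib.plot(hours, vibration, color="tab:blue",
                            label="Vibration")

    ax_temp.set_xlabel("Time (h)")
    ax_temp.set_ylabel("Temperature (°C)")
    ax_vib.set_ylabel("Vibration (mm/s)")
    ax_temp.legend(handles=[line_temp, line_vib], loc="upper left",
                   frameon=False)

    # Vector output (PDF) scales cleanly in print; use a filename to save to disk
    buffer = io.BytesIO()
    fig.savefig(buffer, format="pdf", bbox_inches="tight")

pdf_bytes = buffer.getvalue()
```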

Seaborn Usage

Gladilyn is a data analyst exploring the statistical relationship between different attributes in a housing dataset to determine pricing trends. She wants to quickly generate some attractive histograms, heatmaps, and regression plots to understand the distributions and correlations. Seaborn allows Gladilyn to easily create these statistical visualizations with good defaults.
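As an illustration of this workflow, the sketch below draws a correlation heatmap. The housing data is synthetic and the column names and coefficients are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic housing data; names and relationships are hypothetical
rng = np.random.default_rng(3)
n = 200
area = rng.uniform(50, 250, size=n)
rooms = np.round(area / 40 + rng.normal(0, 0.5, size=n))
age = rng.uniform(0, 60, size=n)
price = 2000 * area - 500 * age + rng.normal(0, 20000, size=n)
housing = pd.DataFrame(
    {"area": area, "rooms": rooms, "age": age, "price": price}
)

# Pairwise Pearson correlations between the numeric columns
corr = housing.corr()

# Annotated heatmap with a diverging palette fixed to the [-1, 1] range
ax = sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1)
ax.set_title("Correlation between housing attributes")
```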

Plotly Usage

Junell is a data scientist who needs to analyze results from an A/B test conducted on a website. He has to present his findings to the product team in an interactive demo. Plotly helps Junell create linked bar charts showing conversion rates, funnel charts showing customer drop-off rates, and other web analytics plots to showcase in an interactive dashboard.

Bokeh Usage

Argi is developing a real-time data monitoring system for an IoT fleet management application. She needs to build a dynamic dashboard that shows streaming telemetry data such as vehicle sensor readings, geolocation, and diagnostics. Bokeh is a good fit, allowing Argi to stream data to interactive plots and maps that update in real time.
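The streaming part of such a dashboard typically centers on `ColumnDataSource.stream`. The sketch below shows only that piece, with made-up telemetry values and a hypothetical helper named `push_reading`; in a real application the updates would usually be driven by a Bokeh server callback rather than a plain loop.

```python
import math

from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# The plot reads from the data source, so updating the source updates the plot
source = ColumnDataSource(data={"t": [], "speed": []})

p = figure(title="Vehicle speed (hypothetical telemetry)")
p.line(x="t", y="speed", source=source)


def push_reading(t, speed, max_points=5):
    """Append one reading and keep only the most recent max_points rows."""
    source.stream({"t": [t], "speed": [speed]}, rollover=max_points)


# Simulate eight incoming readings; a live feed would replace this loop
for t in range(8):
    push_reading(t, 50 + 10 * math.sin(t))
```

The `rollover` argument caps the number of points retained, which keeps memory use and rendering cost bounded for a long-running feed.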

As you can see, each library is better suited to certain use cases, depending on its capabilities and the context of the problem at hand.

Conclusion

In this comprehensive guide, we explored Matplotlib and contrasted it with alternative Python plotting libraries - Seaborn, Plotly, and Bokeh. We looked at the key features of each library along with example usage. We also examined differences in syntax, functionality, customization options, interactivity, and performance. Finally, we covered real-world scenarios to help illustrate when you may want to select a particular visualization library.

Matplotlib remains an extremely versatile plotting package for Python and provides a solid foundation for other libraries like Seaborn to build upon. However, for statistical plotting, interactive visualization, and building data apps and dashboards, Seaborn, Plotly and Bokeh are compelling choices with their own strengths. As a data scientist, knowing how and when to use each of these Python data visualization tools will enable you to create meaningful graphics and derive the most value out of your data.