
Best Practices for Beautiful Matplotlib Visualizations in Python

Updated: at 05:15 AM

Effective data visualization is key to extracting insights from data. Matplotlib is one of the most popular Python libraries for plotting and data visualization. With it, you can create a wide variety of graphs, charts, histograms, heatmaps, and more. However, visually appealing plots require attention to principles of visual design, color theory, and perceptual science. This guide presents best practices and techniques for creating beautiful and engaging Matplotlib visualizations in Python.


Introduction

Matplotlib is a comprehensive 2D plotting library that produces publication-quality figures in Python. It provides an object-oriented API that allows you to build plots piece by piece programmatically. Matplotlib can generate simple plots with just a few commands as well as complex multi-plot grids with complete control over each axis.

To make the most out of Matplotlib, it is important to learn design principles that make visualizations easier to understand. Well-designed plots effectively convey key information, patterns, or insights from the data. Poorly designed visuals can mislead, hide vital data points, or simply bore the audience. By following best practices and techniques outlined in this guide, you can create beautiful, perceptually effective, and engaging Matplotlib visualizations.

Principles of Effective Visualization

Before diving into Matplotlib specifics, let’s review some key principles from visual design, color theory, and perception that form the foundation of good visualization:

Focus on Relevance

Only include plot elements that highlight something meaningful in the data. Decorative elements or unnecessary details distract from the core story and insights.

Eliminate Chartjunk

Avoid visual clutter like excessive grid lines, borders, legends, or redundant labels that don’t add value. Decluttered plots are easier to interpret.
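As a rough sketch of what decluttering can look like in practice, the snippet below hides the top and right spines and keeps only a faint horizontal grid. The data values are made up purely for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical data, used only to have something to draw.
x = [1, 2, 3, 4, 5]
y = [3, 5, 4, 6, 8]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(x, y, linewidth=2)

# Remove the box edges that carry no information.
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)

# Keep a light horizontal grid only, drawn behind the data.
ax.grid(axis='y', color='lightgrey', linewidth=0.8)
ax.set_axisbelow(True)
```

Whether a grid helps at all depends on how precisely readers need to read values off the plot, so treat the grid lines here as optional.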

Choose Appropriate Visual Encodings

Pick visual encodings like position, size, shape, and color that match the type of data being displayed. This makes patterns readily perceptible.

Facilitate Comparisons

Use consistent scales, common baselines, and alignment to make it easy to compare plotted quantities.
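One possible way to get a consistent scale across panels is the `sharey` argument of `plt.subplots`. The sketch below assumes two hypothetical regions with invented sales figures; the names `north` and `south` are illustrative only.

```python
import matplotlib.pyplot as plt

# Hypothetical quarterly figures for two regions.
quarters = ['Q1', 'Q2', 'Q3', 'Q4']
north = [12, 15, 14, 18]
south = [30, 28, 35, 40]

# sharey=True puts both panels on one y scale.
fig, (ax_n, ax_s) = plt.subplots(1, 2, figsize=(8, 4), sharey=True)
ax_n.bar(quarters, north)
ax_s.bar(quarters, south)
ax_n.set(title='North', ylabel='Sales')
ax_s.set(title='South')

# Make the common zero baseline explicit; the limit propagates to both panels.
ax_n.set_ylim(bottom=0)
```

Without the shared axis, each panel would autoscale separately and the two regions would look similar in size even though one sells roughly twice as much.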

Use Judicious Contrast

Leverage contrasting colors, sizes, positions, etc. to distinctly highlight key elements, differences, or trends.

Direct Attention

Draw focus to important plot areas with techniques like reduced opacity in unimportant regions.
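A minimal sketch of this idea, assuming five synthetic random-walk series of which one is arbitrarily chosen as the focus: the context lines are drawn in transparent grey and the focus line in a saturated color.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Hypothetical data: five random-walk series, one of which is the focus.
series = rng.normal(size=(50, 5)).cumsum(axis=0)
focus = 2  # index of the series to emphasize (arbitrary choice)

fig, ax = plt.subplots(figsize=(8, 4))
for i in range(series.shape[1]):
    if i == focus:
        # Saturated color, thicker line, drawn on top of the others.
        ax.plot(series[:, i], color='crimson', linewidth=2.5, zorder=3)
    else:
        # Context lines: grey and mostly transparent.
        ax.plot(series[:, i], color='grey', alpha=0.3, linewidth=1)
```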

Balance Aesthetics and Clarity

Make sensible aesthetic choices that enhance clarity rather than detract from it. Customize Matplotlib’s default parameters only when suitable.

Choose Color Palettes Wisely

Use appropriate categorical, sequential, or diverging color schemes based on data characteristics and visualization goals.

Accommodate Color Vision Deficiency

Around 4% of the population has color vision deficiencies. Ensure visuals remain interpretable when viewed in grayscale.
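One rough way to check grayscale behavior is to convert samples of a colormap to approximate brightness and see whether the brightness changes steadily. The helpers `approx_luminance` and `is_monotonic` below are hypothetical names written for this sketch, and the Rec. 601 luma weights are a simplified stand-in for a proper perceptual model.

```python
import numpy as np
import matplotlib.pyplot as plt

def approx_luminance(cmap_name, n=8):
    """Approximate grayscale brightness of n samples from a colormap."""
    rgb = plt.get_cmap(cmap_name)(np.linspace(0, 1, n))[:, :3]
    # Rec. 601 luma weights: a common, simplified grayscale conversion.
    return rgb @ np.array([0.299, 0.587, 0.114])

def is_monotonic(values):
    """True if the values strictly increase or strictly decrease."""
    steps = np.diff(values)
    return bool(np.all(steps > 0) or np.all(steps < 0))

for name in ['viridis', 'jet']:
    print(name, is_monotonic(approx_luminance(name)))
```

A colormap whose brightness rises and then falls, as rainbow-style maps tend to do, maps different data values to the same shade of grey. This check is only a proxy: grayscale legibility is not the same as a full simulation of color vision deficiency.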

By keeping these core principles in mind when working with Matplotlib, you can make informative plots that engage audiences. Now let’s look at how to put some of these ideas into practice.

Matplotlib Plot Customization

Matplotlib provides extensive control over all aspects of a plot through its configurations and parameters. Customizing the default Matplotlib styles and components appropriately can greatly improve clarity and aesthetics.

Set Figure and Axes Parameters

The Figure and Axes objects control overall plot area and coordinate space respectively. We can set properties like background color, size, resolution and layout:

import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 5), dpi=120, facecolor='lightgrey')

ax = fig.add_axes([0.15, 0.15, 0.8, 0.8],
                  facecolor='#eafff5',
                  xlim=(0, 100),
                  ylim=(-5, 105))

Customize Axis Ticks and Labels

Ticks and labels should be legible without cluttering the plot. We can configure them as follows:

import numpy as np

ax.tick_params(axis='both', direction='in', length=6, width=1.5,
               labelsize=14, pad=8)

ax.set_xticks(np.linspace(0, 100, 12))  # 12 tick positions, one per month label below
ax.set_yticks(np.arange(-5, 110, 10))

ax.set_xticklabels(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                    'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'],
                   rotation=30,
                   horizontalalignment='right')

Add Informative Plot Titles and Axis Labels

Descriptive titles and labels provide context for the visualization:

ax.set_title('Average Monthly Temperature in Boston',
             fontsize=18, pad=20)

ax.set_xlabel('Month', labelpad=20)
ax.set_ylabel('Temperature (°F)', labelpad=20)

Set Legend Parameters

Legends can be positioned outside plots and configured with:

legend = ax.legend(loc='upper center',
                   bbox_to_anchor=(0.5, 1.15),
                   ncol=3,
                   fontsize=12,
                   framealpha=1,
                   facecolor='w',
                   frameon=True)

Save High-Resolution Figure

Use:

fig.savefig('plot.png', bbox_inches='tight', dpi=300)

to export the figure to a pixel-dense PNG for publications or presentations.

By tweaking these and other figure elements methodically, you can customize Matplotlib’s defaults into more effective visualizations.
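When the same tweaks are needed on every figure, one option is to collect them in a dictionary and apply them with `plt.rc_context`, which restores the previous defaults when the block ends. The settings in `house_style` below are illustrative assumptions, not recommendations from this guide.

```python
import matplotlib.pyplot as plt

# Illustrative settings; the specific values are assumptions.
house_style = {
    'axes.spines.top': False,
    'axes.spines.right': False,
    'axes.titlesize': 16,
    'axes.labelsize': 12,
    'font.size': 11,
}

# The settings apply only inside the with-block and are restored afterwards.
with plt.rc_context(house_style):
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.plot([0, 1, 2], [1, 3, 2])
    ax.set(title='Styled with rc_context', xlabel='x', ylabel='y')
```

Keeping the style in one place means individual plotting functions do not need to repeat the same `tick_params` and spine calls.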

Choosing Color Schemes

Color is a key visual encoding in plots for distinguishing groups, highlighting trends, or attracting attention. Matplotlib has several built-in color palettes and tools for choosing colors.

Categorical Color Maps

For qualitative data like categories, use categorical color schemes with distinct hues:

import numpy as np
import matplotlib.pyplot as plt

categories = ['A', 'B', 'C', 'D']
# Integer indices select the palette's discrete colors directly, one per category
colors = plt.get_cmap('Set1')(np.arange(len(categories)))

Sequential Color Maps

Sequential schemes are useful for numeric data with a natural ordering, like magnitudes. We can use the full palette or only a subset of it:

values = [0.2, 0.5, 0.3, 0.8, 1.0]

viridis = plt.get_cmap('viridis')
colors = viridis(values)

# show part of colormap
viridis(np.linspace(0, 0.7, 8))

Diverging Color Maps

For data with a meaningful center point, like deviations, use diverging color schemes:

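div_cmap = plt.get_cmap('RdBu', 11)

normalized_values = [-1.0, -0.6, -0.3, -0.1,
                    0.0,
                    0.1, 0.3, 0.6, 1.0]

# Colormaps expect inputs in [0, 1], so rescale [-1, 1] first; 0.0 lands on the middle color
norm = plt.Normalize(vmin=-1, vmax=1)
colors = div_cmap(norm(normalized_values))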
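When the data are not symmetric around the center point, a plain linear rescaling no longer puts the center on the neutral color. As a sketch under that assumption, `matplotlib.colors.TwoSlopeNorm` can pin a chosen center value to the middle of the colormap. The `deviations` array below is invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import TwoSlopeNorm

# Hypothetical deviations that extend further above zero than below it.
deviations = np.array([[-2.0, -0.5, 0.0],
                       [1.0, 4.0, 10.0]])

# Map vcenter (zero) to the middle of the colormap despite the asymmetric range.
norm = TwoSlopeNorm(vmin=-2.0, vcenter=0.0, vmax=10.0)

fig, ax = plt.subplots(figsize=(5, 3))
image = ax.imshow(deviations, cmap='RdBu_r', norm=norm)
fig.colorbar(image, ax=ax, label='Deviation')
```

The trade-off is that the two halves of the color scale now cover different value ranges, so the colorbar should stay visible for readers to interpret intensities.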

Avoiding Poor Color Choices

Pick color palettes consciously, based on data characteristics and visual goals, rather than accepting arbitrary choices. Resources like ColorBrewer can help select suitable schemes.

Visualizing Different Data Types

Now let’s look at how to build various Matplotlib plot types for specific data characteristics using best practices covered so far:

Line Plots

Ideal for visualizing relationships and trends over a continuous variable. For example, stock prices over time:

import numpy as np
import matplotlib.pyplot as plt

x = np.arange('2020-01', '2020-12', dtype='datetime64[M]')
y = np.random.randn(len(x)).cumsum() + 15

fig, ax = plt.subplots(figsize=(10, 5))

ax.plot(x, y, linestyle='-', marker='o', linewidth=2, markersize=5)

ax.set(title="Stock Price Over Time",
       xlabel="Date",
       ylabel="Price ($)")
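Date axes often benefit from explicit tick placement and formatting. The sketch below is one way to do that with `matplotlib.dates`, using synthetic prices; the two-month tick interval and the `'%b %Y'` format are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Twelve monthly dates with synthetic prices.
months = np.arange('2020-01', '2021-01', dtype='datetime64[M]')
rng = np.random.default_rng(0)
prices = rng.normal(size=len(months)).cumsum() + 15

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(months, prices, marker='o')

# One tick every two months, labelled like "Jan 2020".
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=2))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))

# Slant the labels so neighbouring dates do not collide.
fig.autofmt_xdate()
```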

Bar Charts

Bars effectively compare categorical data. For instance, sales by product category:

import matplotlib.pyplot as plt

categories = ['A', 'B', 'C', 'D']
sales = [16000, 14000, 17500, 19500]

fig, ax = plt.subplots(figsize=(6, 5))

ax.bar(categories, sales, width=0.5, edgecolor="grey", linewidth=0.7)

ax.set(xlabel="Product Category",
       ylabel="Quarterly Sales",
       title="Sales by Product Category")
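With only a handful of bars, printing each value directly on the chart can replace the y-axis entirely. This sketch assumes Matplotlib 3.4 or newer, where `Axes.bar_label` is available, and reuses the sales figures from the example above.

```python
import matplotlib.pyplot as plt

categories = ['A', 'B', 'C', 'D']
sales = [16000, 14000, 17500, 19500]

fig, ax = plt.subplots(figsize=(6, 5))
bars = ax.bar(categories, sales, width=0.5)

# Write each value above its bar (Axes.bar_label needs Matplotlib 3.4+).
labels = ax.bar_label(bars, labels=[f'{v:,}' for v in sales], padding=3)

# With values printed directly, the y-axis ticks become redundant.
ax.set_yticks([])
ax.spines['left'].set_visible(False)
ax.set(xlabel='Product Category', title='Sales by Product Category')
```

Direct labels work best for a few bars; with many categories they tend to add clutter, and a labeled axis is the safer choice.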

Scatter Plots

Scatterplots reveal relationships and patterns between two numeric variables. We can visualize the correlation between quiz scores and exam grades as follows:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.normal(70, 15, 50)
# Exam grades depend on quiz scores plus noise, so the plot actually shows a correlation
y = 0.8 * x + np.random.normal(15, 10, 50)

fig, ax = plt.subplots(figsize=(6,6))

ax.scatter(x, y, s=75, c='steelblue',
           edgecolor='white', linewidth=1.5)

ax.set(xlabel='Quiz Scores', ylabel='Exam Grades',
       xlim=(0,100), ylim=(0,100))
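To make a relationship easier to read, a fitted trend line and the correlation coefficient can be added to the scatter plot. The following is a minimal sketch using `np.polyfit` and `np.corrcoef` on synthetic scores; the variable names `quiz` and `exam` and the noise parameters are assumptions made for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
# Synthetic scores: exam grades loosely follow quiz scores plus noise.
quiz = rng.uniform(40, 100, 50)
exam = 0.8 * quiz + rng.normal(10, 8, 50)

# Least-squares line and Pearson correlation coefficient.
slope, intercept = np.polyfit(quiz, exam, 1)
r = np.corrcoef(quiz, exam)[0, 1]

fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(quiz, exam, s=50, c='steelblue', edgecolor='white')

# Draw the fitted line across the observed range of quiz scores.
xs = np.array([quiz.min(), quiz.max()])
ax.plot(xs, slope * xs + intercept, color='darkorange',
        label=f'fit (r = {r:.2f})')
ax.legend(frameon=False)
ax.set(xlabel='Quiz Scores', ylabel='Exam Grades')
```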

Histograms

Histograms show frequency distributions of numeric data. We can plot an age distribution as:

import numpy as np
import matplotlib.pyplot as plt

ages = np.random.normal(45, 15, 500)

fig, ax = plt.subplots(figsize=(8, 5))

ax.hist(ages, bins=20, edgecolor='black', linewidth=1.2)

ax.set(xlabel='Age', ylabel='Frequency',
       title='Age Distribution')
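The number of bins strongly affects how a histogram reads. As one possible approach, `bins='auto'` lets NumPy choose the bin count from the data, and a reference line can mark the sample mean. The ages below are synthetic.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
ages = rng.normal(45, 15, 500)  # synthetic ages

fig, ax = plt.subplots(figsize=(8, 5))
# bins='auto' lets NumPy pick the bin count from the data.
counts, edges, _ = ax.hist(ages, bins='auto', color='steelblue',
                           edgecolor='white')

# Mark the sample mean as a reference line.
mean_age = ages.mean()
ax.axvline(mean_age, color='black', linestyle='--',
           label=f'mean = {mean_age:.1f}')
ax.legend(frameon=False)
ax.set(xlabel='Age', ylabel='Frequency', title='Age Distribution')
```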

Heatmaps

Heatmaps convey magnitude variations through color intensity. We can plot employee performance on a 5-point scale as:

import numpy as np
import matplotlib.pyplot as plt

data = np.random.randint(1, 6, size=(10, 5))
norm = plt.Normalize(1,5)

fig, ax = plt.subplots(figsize=(6, 5))

cmap = plt.get_cmap('Reds')
heat_map = ax.imshow(data, cmap=cmap, norm=norm)

cbar = fig.colorbar(heat_map)

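# data has 10 rows (y-axis) and 5 columns (x-axis); fix tick positions before labeling
ax.set_xticks(np.arange(data.shape[1]))
ax.set_xticklabels(range(1, data.shape[1] + 1))
ax.set_yticks(np.arange(data.shape[0]))
ax.set_yticklabels(range(1, data.shape[0] + 1))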
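For small grids, writing the value inside each cell saves readers from decoding colors against the colorbar. The sketch below uses a smaller synthetic grid than the example above; the threshold for switching to white text is an arbitrary choice.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
scores = rng.integers(1, 6, size=(4, 3))  # synthetic ratings from 1 to 5

fig, ax = plt.subplots(figsize=(4, 4))
image = ax.imshow(scores, cmap='Reds', vmin=1, vmax=5)
fig.colorbar(image, ax=ax, label='Rating')

# Print each value in its cell; switch to white text on dark cells.
for row in range(scores.shape[0]):
    for col in range(scores.shape[1]):
        value = scores[row, col]
        ax.text(col, row, str(value), ha='center', va='center',
                color='white' if value >= 4 else 'black')
```

Note that `imshow` places column indices on the x-axis and row indices on the y-axis, which is why `ax.text` takes `col` before `row`.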

By applying the principles covered above to the plot types your data calls for, you can build insightful and aesthetically pleasing visualizations.

Real-World Example

Let’s take an example of visualizing hotel booking data to illustrate several best practices in action.

We will create a line chart comparing the running total of daily bookings for hotels A, B, and C over two months:

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

np.random.seed(1)

hotels = ['A', 'B', 'C']
bookings = np.random.randint(20, 100, (60, 3)).cumsum(axis=0)
dates = np.arange('2020-01', '2020-03', dtype='datetime64[D]')

fig, ax = plt.subplots(figsize=(10, 5))

colors = plt.get_cmap('Dark2')(np.linspace(0, 1, 3))

for i in range(3):
    ax.plot(dates, bookings[:,i], marker='o',
            linestyle='-', linewidth=2,
            color=colors[i], label=hotels[i])

ax.set(title="Cumulative Hotel Bookings",
       xlabel="Date",
       ylabel="Cumulative Bookings")

ax.legend(loc="upper left", ncol=1, frameon=True,
          facecolor='lightgrey', framealpha=0.7,
          fontsize=12)

fig.savefig("hotel_bookings.png", dpi=300,
            bbox_inches = 'tight')

Figure: Hotel Bookings

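This code applies several principles: a qualitative palette (Dark2) gives each hotel a distinct color, the title and axis labels provide context, the legend identifies each line, and the figure is exported at 300 dpi with a tight bounding box.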

The resulting visualization summarizes daily bookings trends for the hotels efficiently and aesthetically.

Conclusion

Creating insightful and engaging data visualizations requires both Matplotlib coding skills and design expertise. By leveraging principles from visual perception, color theory, and design, you can build beautiful and effective Matplotlib plots that clearly convey key information.

The best practices outlined in this guide form a starting point for developing good visualization habits. As you gain more Matplotlib experience, also study real-world examples and learn from experts to further refine your approach. Aesthetics and functionality work together in great visualizations. Leveraging them thoughtfully will enable you to transform raw data into impactful data stories using Python’s powerful Matplotlib library.