
NumPy: Changing Array Shape with Flatten, Ravel, Reshape and Transpose

Updated: at 05:25 AM

NumPy is a fundamental package for scientific computing in Python that provides support for multi-dimensional arrays and matrices. One of the key features of NumPy is the ability to manipulate the shape and dimensions of arrays, allowing you to reshape and restructure array data for different applications, often without copying any data.

Some common array manipulation methods in NumPy include flatten(), ravel(), reshape(), and transpose(). Understanding how to leverage these functions to alter array shapes can help make your NumPy code more efficient and flexible.

In this guide, we will cover NumPy array basics, the difference between 1D and multi-dimensional arrays, flatten(), ravel(), reshape(), and transpose(), followed by practical examples and common errors.

NumPy Arrays

NumPy arrays are the main data structure used in the NumPy library. Unlike Python’s built-in lists, NumPy arrays are homogeneous, meaning all elements in the array must be of the same data type.

NumPy arrays also support vectorized operations that allow you to perform computations on entire arrays without writing explicit for-loops. This makes NumPy arrays much faster and more efficient for numerical and scientific computing tasks.
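To make the idea of vectorized operations concrete, here is a minimal sketch that squares every element twice: once with an explicit loop and once with a single array expression. The variable names are illustrative only.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# Explicit Python loop: one element at a time
squared_loop = np.array([v * v for v in values])

# Vectorized: the whole array in a single expression
squared_vec = values * values

print(squared_vec)
# [ 1.  4.  9. 16.]
print(np.array_equal(squared_loop, squared_vec))
# True
```

Both versions give the same result; the vectorized form pushes the loop into NumPy's compiled code.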

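Some key properties of NumPy arrays are ndim (the number of dimensions), shape (the length along each dimension), size (the total number of elements), and dtype (the element type).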

Here is a simple example to create a NumPy array:

import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr)

# Output: [1 2 3 4]

We pass in a Python list as input, and NumPy automatically creates a 1D array with homogeneous integer data type.
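The attributes that matter most for reshaping can be inspected directly on any array. The short sketch below uses a small 2D array chosen purely for illustration; the exact integer dtype depends on the platform.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.ndim)   # 2 (number of dimensions)
print(arr.shape)  # (2, 3) (rows, columns)
print(arr.size)   # 6 (total number of elements)
print(arr.dtype)  # an integer type such as int64, depending on the platform
```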

1D Arrays vs 2D+ Arrays

When working with NumPy, it’s important to understand the distinction between 1D and multi-dimensional arrays.

1D arrays are simple linear sequences of data. You can think of them like lists or vectors in linear algebra.

2D arrays are matrices with rows and columns. Higher-dimensional arrays add further axes, for example a stack of matrices in 3D.

Some examples:

# 1D array
arr = np.array([1, 2, 3])

# 2D array
arr = np.array([[1, 2, 3], [4, 5, 6]])

# 3D array
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

Why does this matter?

When using NumPy’s shaping functions like flatten(), reshape() etc., the result depends on how many dimensions the input has. For example, transposing a 1D array returns it unchanged, while transposing a 2D array swaps its rows and columns.

We’ll explore these differences in detail in the following sections.
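As a quick illustration of how dimensionality changes the outcome, the sketch below transposes a 1D and a 2D array and then turns the 1D array into an explicit column. The names a1, a2 and column are illustrative.

```python
import numpy as np

a1 = np.array([1, 2, 3])                  # 1D
a2 = np.array([[1, 2, 3], [4, 5, 6]])     # 2D

# Transposing a 1D array leaves its shape unchanged
print(a1.shape, a1.T.shape)  # (3,) (3,)

# Transposing a 2D array swaps rows and columns
print(a2.shape, a2.T.shape)  # (2, 3) (3, 2)

# To get a column from a 1D array, reshape it explicitly
column = a1.reshape(-1, 1)
print(column.shape)  # (3, 1)
```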

Flatten Multidimensional Arrays with flatten()

flatten() is an array method that collapses a multidimensional array into a 1D array. Consider this example:

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.flatten())

# Output: [1 2 3 4 5 6]

Our 2D array arr is flattened into a 1D array with all the elements concatenated.

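The key properties of flatten() are that it always returns a new 1D copy of the data, it accepts arrays of any dimensionality (including 1D), and it takes an optional order argument ('C' for row-major, the default, or 'F' for column-major).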

Here are some examples to demonstrate these properties:

# Flatten 2D array
arr = np.array([[1, 2, 3], [4, 5, 6]])
flattened = arr.flatten()

print(arr.shape) # (2, 3)
print(flattened.shape) # (6,)

# Flatten 3D array
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
flattened = arr.flatten()

print(flattened)
# [1 2 3 4 5 6 7 8]

# Flatten by column
arr = np.array([[1, 2, 3], [4, 5, 6]])
flattened = arr.flatten(order='F')

print(flattened)
# [1 4 2 5 3 6]

# Flattening a 1D array also works and returns a 1D copy
arr = np.array([1, 2, 3, 4])
flattened = arr.flatten()
print(flattened)
# [1 2 3 4]

As you can see, flatten() is very convenient for collapsing the multidimensional structure of NumPy arrays into a simple 1D sequence for further processing.
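One point worth checking for yourself is that flatten() returns a copy, so changes to the result do not reach the original array. A minimal sketch, using np.shares_memory to confirm that the two arrays do not overlap in memory:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
flat = arr.flatten()

flat[0] = 100                         # modify the flattened copy only
print(arr[0, 0])                      # 1 (the original is untouched)
print(np.shares_memory(arr, flat))    # False
```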

Ravel Multidimensional Arrays into 1D

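ravel() is similar to flatten() - it collapses a multidimensional array into a 1D array. The key difference is that ravel() returns a view of the original data whenever the memory layout allows it and only copies when it must, whereas flatten() always returns a copy. Modifying the result of ravel() can therefore change the original array.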

Let’s look at some examples:

# 2D array
arr = np.array([[1, 2, 3], [4, 5, 6]])
flattened = arr.ravel()

print(flattened)
# [1 2 3 4 5 6]

# Ravel 1D array
arr = np.array([1, 2, 3, 4])
flattened = arr.ravel()

print(flattened)
# [1 2 3 4]

# Check if view of original
arr = np.array([[1, 2, 3], [4, 5, 6]])
flattened = arr.ravel()

flattened[0] = 100
print(arr)
# [[100   2   3]
#  [  4   5   6]]

Both flatten() and ravel() work on arrays of any dimensionality. The difference is that ravel() returns a view where possible, which avoids unnecessary data duplication in memory, while flatten() always copies.
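ravel() cannot always return a view. The sketch below, with illustrative variable names, shows one case where it can (a C-contiguous array) and one where it has to copy (a transposed array, whose elements are no longer laid out in row-major order).

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

# Contiguous input: ravel() returns a view
view = arr.ravel()
print(np.shares_memory(arr, view))   # True

# Transposed input is not C-contiguous: ravel() must copy
t = arr.T
copied = t.ravel()
print(np.shares_memory(t, copied))   # False
print(copied)
# [0 3 1 4 2 5]
```

If you need a guarantee that the original is never modified, flatten() or an explicit copy() is the safer choice.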

Reshape Arrays

The reshape() method allows you to change the shape of a NumPy array without changing its data. It lays out the same elements, in the same order, in a new shape.

For example:

arr = np.array([1, 2, 3, 4, 5, 6])

reshaped = arr.reshape(2, 3)

print(reshaped)
# [[1 2 3]
#  [4 5 6]]

Here our 1D array with 6 elements is reshaped into a 2D array with 2 rows and 3 columns.

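Some key properties of the reshape() method: the new shape must contain exactly as many elements as the original, one dimension may be given as -1 so that NumPy infers its size, and the result is a view of the original data whenever possible (otherwise a copy).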

Let’s look at more examples:

arr = np.array([1, 2, 3, 4, 5, 6])

# Reshape into 2D
reshaped = arr.reshape(2, 3)

# Reshape into 3D
arr.reshape(2, 1, 3)

# Unknown dimension size
arr.reshape(3, -1)

# Flattened view
arr.reshape(-1)

# Reshape 1D into 2D
arr = np.array([1, 2, 3])
arr.reshape(1, 3)

# Invalid reshape
arr.reshape(2, 4) # ValueError

By leveraging reshape(), you can restructure your NumPy arrays for passing into machine learning models or formatting plots and visualizations without altering the underlying data.
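The following sketch checks two of the behaviours discussed in this section: how -1 is resolved to a concrete size, and that a reshaped contiguous array shares its data with the original. The array and variable names are illustrative.

```python
import numpy as np

arr = np.arange(12)

# -1 asks NumPy to infer the missing dimension from the total size
a = arr.reshape(3, -1)
b = arr.reshape(-1, 2, 3)
print(a.shape)  # (3, 4)
print(b.shape)  # (2, 2, 3)

# For a contiguous array, reshape returns a view:
# writing through the reshaped array changes the original
a[0, 0] = 99
print(arr[0])   # 99
```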

Transpose Array Dimensions

The transpose() function permutes the dimensions of a NumPy array. It returns a view of the original array with axes transposed.

For example, let’s transpose a 2D matrix:

arr = np.array([[1, 2, 3], [4, 5, 6]])

transposed = arr.transpose()

print(transposed)
# [[1 4]
#  [2 5]
#  [3 6]]

Here the row and column indices are swapped.

For multidimensional arrays, you can specify the permutation order as a tuple:

arr = np.random.rand(2, 3, 4)

transposed = arr.transpose((1, 0, 2))

This swaps the 0th and 1st axes, so the result has shape (3, 2, 4).

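Some properties of transpose(): called without arguments it reverses the order of the axes, it returns a view rather than a copy, the .T attribute is shorthand for the no-argument form, and it leaves a 1D array unchanged.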

Overall, transpose() gives you a convenient way to manipulate the dimensions of arrays for computational and visualization purposes.
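To see how axis permutation affects shapes and indexing, here is a small sketch on a 3D array. The array contents are arbitrary; only the shapes and index mapping matter.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# No arguments (or .T): the axis order is reversed
print(arr.transpose().shape)         # (4, 3, 2)
print(arr.T.shape)                   # (4, 3, 2)

# Explicit permutation: swap the first two axes, keep the last
t = arr.transpose(1, 0, 2)
print(t.shape)                       # (3, 2, 4)

# Element (i, j, k) of the result is element (j, i, k) of the original
print(t[2, 1, 3] == arr[1, 2, 3])    # True

# The result is a view on the same data
print(np.shares_memory(arr, t))      # True
```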

Practical Examples

Now let’s look at some practical examples of how you can leverage array shaping functions in real-world scenarios.

Preprocessing Machine Learning Data

When preparing data for machine learning models, you often need to manipulate array shapes:

# Feature data
X = np.random.rand(100, 28, 28)  # 100 samples of 28x28 images

# Reshape into vector
X = X.reshape(100, 784) # For MLP model

# Target data
y = np.random.randint(0, 10, 100) # Labels 0-9

# Reshape into matrix
y = y.reshape(100, 1) # For Keras matrix input

Here we reshape arrays into the required formats for feeding into neural network-based models.
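Hard-coding 100 and 784 works for this dataset, but the same step can be written without fixed sizes. The sketch below is one possible way to do that, assuming the first axis is the sample axis; X_flat and X_back are illustrative names.

```python
import numpy as np

X = np.random.rand(100, 28, 28)   # 100 samples of 28x28 images

n_samples = X.shape[0]

# Keep the sample axis, let NumPy infer the flattened feature length
X_flat = X.reshape(n_samples, -1)
print(X_flat.shape)               # (100, 784)

# The operation is reversible: restore the original image shape
X_back = X_flat.reshape(n_samples, 28, 28)
print(np.array_equal(X, X_back))  # True
```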

Plotting Multidimensional Data

Matplotlib plotting functions often require 1D arrays. You can use ravel() or flatten() to transform 2D data:

import matplotlib.pyplot as plt

arr = np.random.normal(0, 1, (50, 2)) # 2D data

# Plot each column
plt.plot(arr[:,0].flatten(), arr[:,1].flatten(), 'o')

plt.xlabel('x')
plt.ylabel('y')
plt.show()

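Each column slice such as arr[:, 0] is already a 1D array, so the flatten() calls here are optional and only make copies. flatten() or ravel() becomes necessary when the data you want to plot is genuinely multi-dimensional.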

Transposing Images

With image data in NumPy, you may need to transpose dimensions for visualization or to match expected model input formats:

from PIL import Image

img = Image.open('image.jpg')

arr = np.asarray(img)
print(arr.shape) # e.g. (400, 600, 3): height, width, channels

# Transpose for plotting
transposed = arr.transpose(1, 0, 2)

# Load transposed into PIL Image
img2 = Image.fromarray(transposed)
img2.show()

This swaps the height and width axes, which mirrors the image across its main diagonal.

As you can see, NumPy’s shaping functions really help simplify manipulating array data for your applications.

Common Errors and Solutions

Here are some common errors that can occur when reshaping arrays and how to fix them:

ValueError: total size of new array must be unchanged

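This occurs when trying to reshape into an incompatible shape. Recent NumPy versions word the message as “cannot reshape array of size N into shape (...)”. Double check that the product of the new dimensions equals the total number of elements.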
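One way to guard against this error is to compare element counts before reshaping, or to let NumPy infer one dimension with -1. A minimal sketch, where target_shape is an illustrative name:

```python
import math
import numpy as np

arr = np.arange(6)
target_shape = (2, 4)

# The product of the target dimensions must equal arr.size
compatible = math.prod(target_shape) == arr.size
print(compatible)  # False: 2 * 4 = 8, but arr has 6 elements

try:
    arr.reshape(target_shape)
except ValueError as exc:
    print("reshape failed:", exc)

# Letting NumPy infer the second dimension avoids the mismatch
fixed = arr.reshape(2, -1)
print(fixed.shape)  # (2, 3)
```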

AttributeError: ‘list’ object has no attribute ‘reshape’

Lists do not have a reshape() method. Make sure to convert the list to a NumPy array first before reshaping.
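A short sketch of the fix, using an illustrative list named data. np.asarray converts the list to an array; the np.reshape function also accepts a list directly and returns an array.

```python
import numpy as np

data = [1, 2, 3, 4, 5, 6]

# data.reshape(2, 3) would raise AttributeError, since lists have no reshape()

# Option 1: convert first, then call the method
arr = np.asarray(data).reshape(2, 3)
print(arr.shape)    # (2, 3)

# Option 2: the np.reshape function accepts array-like input
arr2 = np.reshape(data, (2, 3))
print(np.array_equal(arr, arr2))  # True
```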

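ValueError: Expected 2D array, got 1D array instead

This message comes from libraries such as scikit-learn that require 2D input, not from flatten(), which accepts arrays of any dimensionality. Use reshape(-1, 1) to turn a 1D array into a single column, or reshape(1, -1) for a single row.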
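When a library function reports that it expected a 2D array, a 1D array can be reshaped into a single column or a single row. The sketch below shows both forms on an illustrative array; which one is appropriate depends on whether the values are many samples of one feature or one sample with many features.

```python
import numpy as np

samples = np.array([1.5, 2.0, 3.5])

# Many samples, one feature each: a single column
as_column = samples.reshape(-1, 1)
print(as_column.shape)  # (3, 1)

# One sample with several features: a single row
as_row = samples.reshape(1, -1)
print(as_row.shape)     # (1, 3)
```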

ValueError: axes don’t match array

Occurs during transpose when the specified axes are invalid for the array dimensions. The axes argument must list every axis of the array exactly once, i.e. it must be a permutation of 0 to ndim - 1.
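The sketch below reproduces the error by passing too few axes for a 3D array and then shows a simple pre-check. The helper is_valid_axes is a hypothetical function written for this example, not part of NumPy, and it only handles non-negative axis numbers.

```python
import numpy as np

def is_valid_axes(axes, ndim):
    # Hypothetical helper: axes must be a permutation of 0..ndim-1
    return sorted(axes) == list(range(ndim))

arr = np.zeros((2, 3, 4))

print(is_valid_axes((0, 1), arr.ndim))     # False: axis 2 is missing
print(is_valid_axes((1, 0, 2), arr.ndim))  # True

try:
    arr.transpose(0, 1)   # only two axes given for a 3D array
except ValueError as exc:
    print("transpose failed:", exc)

print(arr.transpose(1, 0, 2).shape)        # (3, 2, 4)
```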

MemoryError

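Operations that copy data, such as flatten() or reshape() on a non-contiguous array, may fail on large arrays due to insufficient memory. transpose(), and ravel() or reshape() on contiguous arrays, return views and allocate no new data. Prefer views where possible, process the data in smaller chunks, or look at optimizing your overall memory usage.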
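To judge whether an operation will allocate new memory, it can help to check whether the result shares data with the input. A rough sketch, using a moderately sized array so it runs anywhere:

```python
import numpy as np

arr = np.zeros((1000, 1000))       # float64 by default
print(arr.nbytes)                  # 8000000 bytes

# Transposing only changes strides: no new data is allocated
t = arr.T
print(np.shares_memory(arr, t))    # True

# Flattening the transposed (non-contiguous) array forces a copy
flat = t.reshape(-1)
print(np.shares_memory(arr, flat)) # False
```

For arrays close to the size of available memory, that last step is where a MemoryError would be most likely to appear.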

Carefully handling errors and exceptions will help you debug shape manipulation issues more efficiently.

Conclusion

In this guide, we covered four essential methods in NumPy for manipulating array shapes: flatten(), ravel(), reshape(), and transpose().

Learning how to leverage these functions allows you to restructure arrays for computations and modeling without altering the underlying data.

We also looked at real-world examples like preparing data for machine learning and plotting, where reshaping arrays is extremely useful. Debugging common exceptions and errors will help you resolve problems faster.

NumPy’s array manipulation capabilities make it a versatile tool for data science, visualization, and scientific computing. Mastering array shaping unlocks more of its potential.

Hopefully this guide provided a comprehensive overview of changing array shapes in NumPy. Thanks for reading!