NumPy: Reshaping, Flattening and Transposing Arrays in Python

Updated: at 05:12 AM

NumPy is a fundamental Python package for scientific computing and data analysis. It provides support for multi-dimensional arrays and matrices along with a large library of high-level mathematical functions to operate on these arrays. Numerical operations on NumPy arrays are far more efficient than the same operations on basic Python lists.

One of the key features of NumPy is the ability to reshape, flatten and transpose arrays, in many cases without duplicating the data. These operations allow us to manipulate the structure and dimensions of arrays to suit our use cases. In this guide, we will learn how to use these techniques for effective data analysis and modeling in Python.

Overview

NumPy arrays have an inherent dimensionality defined during creation. The shape attribute returns a tuple with the length of each dimension. For example, a 3x4 array will have shape (3,4).
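As a quick illustration of the shape attribute, here is a minimal sketch; the array name grid is just a placeholder chosen for this example:

```python
import numpy as np

# A 3x4 array: 3 rows, 4 columns
grid = np.zeros((3, 4))

print(grid.shape)  # (3, 4)
print(grid.ndim)   # 2 (number of dimensions)
print(grid.size)   # 12 (total number of elements)
```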

Reshaping allows us to change the dimensions of the array without changing its data. We can convert a 1D array to 2D, or a 2D array to 3D. Whenever the memory layout allows it, NumPy makes this operation efficient by reinterpreting the underlying data buffer without duplication; otherwise it falls back to copying the data.

Flattening reduces the array into one single dimension. We can flatten any multidimensional array into a 1D array for certain computations or representations.

Transposing exchanges the rows and columns, providing a view of the original data mirrored across its main diagonal (not a rotation). Transposes allow us to reorder the axes for plotting, visualization or to align arrays for computational purposes.

Let’s look at how each operation works in detail with examples.

Reshaping Arrays

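The reshape method allows reshaping an array into a new shape with the same number of elements. It takes the new shape, either as a tuple or as separate integer arguments, and returns a view of the original array with the given shape whenever possible, or a copy when the memory layout does not allow a view.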

import numpy as np

arr = np.arange(8)
print(arr)
# [0 1 2 3 4 5 6 7]

arr = arr.reshape(4,2)
print(arr)
'''
[[0 1]
 [2 3]
 [4 5]
 [6 7]]
'''

We reshaped the 1D array into a 4x2 2D array. The number of elements is the same in both arrays.

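The reshaped array is a view of the original data here, so modifying one array also changes the other. To confirm this, we keep a reference to the original array:

orig_arr = np.arange(8)
arr = orig_arr.reshape(4, 2)

arr[0,0] = 100
print(arr)
'''
[[100   1]
 [  2   3]
 [  4   5]
 [  6   7]]
'''

print(orig_arr)
# [100   1   2   3   4   5   6   7] (changed, because arr shares its data)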

We can also infer one of the dimensions based on the length of the array:

arr = np.arange(15)
arr = arr.reshape(3, -1)
print(arr.shape) # (3, 5)

arr = arr.reshape(5, -1)
print(arr.shape) # (5, 3)

Multidimensional arrays can also be reshaped. For example, reshaping a 3D into a 2D array:

arr_3d = np.arange(24).reshape(2, 3, 4)

arr_2d = arr_3d.reshape(6, 4)
print(arr_2d.shape) # (6, 4)
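Whether reshape hands back a view or a copy depends on the memory layout of the input. The sketch below uses np.shares_memory to check this for two cases; the variable names are illustrative only:

```python
import numpy as np

base = np.arange(6)

# Contiguous data: reshape can reuse the same buffer
view = base.reshape(2, 3)
print(np.shares_memory(base, view))  # True

# A transposed array is no longer C-contiguous, so reshaping it
# to 1D has to copy the data into a new buffer
transposed = view.T
copied = transposed.reshape(6)
print(np.shares_memory(base, copied))  # False
```

If code relies on the result being independent of the input, calling .copy() explicitly is the safer choice.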

Reshape Exceptions

Reshaping raises a ValueError when the total number of elements differs between the shapes:

arr = np.arange(8)
arr = arr.reshape(3,3) # ValueError due to mismatch

We can also get a TypeError if the new shape is not a tuple of ints:

arr.reshape('abc') # TypeError due to invalid shape
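In practice we may want to handle these errors rather than let them stop the program. A minimal sketch, assuming we simply want to report which target shapes are invalid for an 8-element array:

```python
import numpy as np

arr = np.arange(8)

# Shapes that cannot work for 8 elements
candidate_shapes = [(3, 3), (-1, -1), (3, -1)]

caught = []
for shape in candidate_shapes:
    try:
        arr.reshape(shape)
    except ValueError as exc:
        # Element count mismatch, or more than one -1 in the shape
        caught.append(shape)
        print(shape, "->", exc)
```

All three shapes fail: 3x3 needs 9 elements, only one dimension can be inferred with -1, and 8 is not divisible by 3.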

Flattening Arrays

Flattening reduces an array of any dimensionality into a simple 1D array. We can use the flatten method to flatten an array:

arr_2d = np.array([[1,2], [3,4]])

flatten = arr_2d.flatten()
print(flatten) # [1 2 3 4]

The array is flattened row-wise into the 1D result.

For multidimensional arrays, each sub-array is appended to the result sequentially:

arr_3d = np.array([[[1,2],[3,4]], [[5,6],[7,8]]])
flattened = arr_3d.flatten()

print(flattened)
# [1 2 3 4 5 6 7 8]

The default order='C' flattens the array row-wise. We can also flatten column-wise with order='F':

arr = np.array([[1,2,3], [4,5,6]])

print(arr.flatten()) # [1 2 3 4 5 6]
print(arr.flatten(order='F')) # [1 4 2 5 3 6]
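One way to build intuition for order='F' is that it gives the same result as flattening the transpose row-wise. A small sketch to check this:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

col_major = arr.flatten(order='F')
via_transpose = arr.T.flatten()

print(col_major)                                 # [1 4 2 5 3 6]
print(np.array_equal(col_major, via_transpose))  # True
```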

The flatten method always returns a copy of the data in a new buffer, never a view. Updating the flattened array does not modify the original array:

arr_2d = np.zeros((2, 3))
flat_arr = arr_2d.flatten()

flat_arr[0] = 5
print(arr_2d)
# [[0. 0. 0.]
#  [0. 0. 0.]] (unchanged)

We can also use the ravel() method to flatten the array. The main difference is that ravel() returns a view of the original array whenever possible and copies only when it has to. So modifying the raveled array can change the original, whereas flatten always creates a copy.

arr_2d = np.zeros((2, 3))
raveled = arr_2d.ravel()

raveled[0] = 5
print(arr_2d)
# [[5. 0. 0.]
# [0. 0. 0.]]
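To check directly which of the two methods shares memory with the original array, we can use np.shares_memory. This is a small sketch with illustrative variable names:

```python
import numpy as np

arr_2d = np.zeros((2, 3))

flat_copy = arr_2d.flatten()  # always a copy
raveled = arr_2d.ravel()      # a view here, because arr_2d is contiguous

print(np.shares_memory(arr_2d, flat_copy))  # False
print(np.shares_memory(arr_2d, raveled))    # True

# ravel falls back to a copy when a view is not possible,
# for example on a transposed (non C-contiguous) array
raveled_t = arr_2d.T.ravel()
print(np.shares_memory(arr_2d, raveled_t))  # False
```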

Flatten Exceptions

The flatten method takes a single optional argument, order, which must be one of 'C', 'F', 'A' or 'K'. Providing an invalid order value will result in a ValueError:

arr = np.arange(6).reshape(2,3)
flattened = arr.flatten(order='G') # ValueError invalid order

Transposing Arrays

Transposing exchanges the rows and columns of a 2D array or swaps the axes for multidimensional arrays.

The transpose method transposes a matrix:

arr = np.arange(6).reshape(2,3)

print(arr)
'''
[[0 1 2]
 [3 4 5]]
'''

print(arr.transpose())
'''
[[0 3]
 [1 4]
 [2 5]]
'''

For a multidimensional array, we can specify the sequence of axis swapping as an input parameter:

arr = np.arange(24).reshape(2, 3, 4)

print(arr.transpose((1, 0, 2)).shape) # (3, 2, 4)
print(arr.transpose((2, 0, 1)).shape) # (4, 2, 3)
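The axes tuple can be read as "axis i of the result is axis axes[i] of the input". A short sketch that checks both the resulting shape and a single element under that reading:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)
axes = (2, 0, 1)
result = arr.transpose(axes)

# Each output dimension takes its length from the input axis it maps to
expected_shape = tuple(arr.shape[a] for a in axes)
print(result.shape == expected_shape)  # True

# With axes (2, 0, 1): result[i, j, k] is arr[j, k, i]
print(result[3, 1, 2] == arr[1, 2, 3])  # True
```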

Transposing doesn’t allocate any additional memory for the array data. It returns a view of the same buffer whose shape and strides are permuted according to the given axes.
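We can see this by inspecting the strides attribute, which gives the number of bytes to step along each axis. The sketch below fixes the dtype to int64 (8 bytes per element) so the stride values are predictable:

```python
import numpy as np

arr = np.arange(6, dtype=np.int64).reshape(2, 3)
t = arr.T

print(arr.shape, arr.strides)  # (2, 3) (24, 8)
print(t.shape, t.strides)      # (3, 2) (8, 24)

# Same buffer, only the shape and strides metadata differ
print(np.shares_memory(arr, t))  # True
```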

We can also access the property T as a shorthand for getting the transpose:

arr = np.ones((3,2))
print(arr.T)
# [[1. 1. 1.]
#  [1. 1. 1.]]
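One caveat worth knowing: transposing a 1D array changes nothing, because there is only one axis to permute. To get a column vector, we can reshape instead. A minimal sketch:

```python
import numpy as np

vec = np.arange(3)
print(vec.T.shape)  # (3,)

# Reshape to get an explicit column vector
column = vec.reshape(-1, 1)
print(column.shape)  # (3, 1)
```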

Transpose Exceptions

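The axis indices passed to transpose should be a valid permutation of the array’s axes, with exactly one entry per axis. Any repeats, out of bounds values or a wrong number of axes will raise an error:

arr = np.arange(6).reshape(2,3)

arr.transpose((0, 2)) # AxisError: axis 2 is out of bounds for a 2D array
arr.transpose((1, 1)) # ValueError: repeated axis in transpose
arr.transpose((1, 2, 0)) # ValueError: axes don't match array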

Real World Examples

Let’s look at some examples of how these techniques are applied in real-world scenarios:

Image Processing

Multidimensional arrays are commonly used in image processing. We often need to restructure the pixel arrays for filtering, visualization or compression algorithms:

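import matplotlib.pyplot as plt
import skimage.io

# Assumes a color image with shape (height, width, channels)
image = skimage.io.imread('image.jpg')

# Swap the height and width axes to display the image transposed
plt.imshow(image.transpose(1,0,2))

# Flatten into 1D for learning algorithm
image = image.flatten()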
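When working with a batch of images rather than a single one, we usually want one flat feature vector per image instead of one long vector for the whole batch. A minimal sketch, using a hypothetical batch of zero-filled images in place of real data:

```python
import numpy as np

# Hypothetical batch: 10 RGB images of 32x32 pixels
images = np.zeros((10, 32, 32, 3), dtype=np.uint8)

# Keep the batch axis and flatten everything else per image
features = images.reshape(len(images), -1)
print(features.shape)  # (10, 3072)
```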

Machine Learning

Reshaping data is often required to feed inputs to machine learning models in the required multidimensional format:

import tensorflow as tf

dataset = tf.keras.datasets.mnist

(X_train, y_train),(X_test, y_test) = dataset.load_data()

# Add a trailing channel axis: (n, 28, 28) -> (n, 28, 28, 1)
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)

# Start an empty model for the convolutional neural network (layers omitted)
model = tf.keras.models.Sequential()

Transposes help align multidimensional data, for example converting image batches from a channels-last layout (N, H, W, C) to a channels-first layout (N, C, H, W) for a CNN that expects it:

X_train = X_train.transpose(0, 3, 1, 2)
X_test = X_test.transpose(0, 3, 1, 2)
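The same axis permutation can be tried on a small placeholder batch without loading any dataset. The sketch below also shows the inverse permutation that restores the original layout:

```python
import numpy as np

# Placeholder batch in channels-last layout: (N, H, W, C)
batch = np.zeros((5, 28, 28, 1))

# Channels-first layout: (N, C, H, W)
channels_first = batch.transpose(0, 3, 1, 2)
print(channels_first.shape)  # (5, 1, 28, 28)

# Inverse permutation, back to (N, H, W, C)
restored = channels_first.transpose(0, 2, 3, 1)
print(restored.shape)  # (5, 28, 28, 1)
```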

Aggregate Statistics

Flattening can be useful to compute overall statistics for a multidimensional array:

stats = np.arange(24).reshape(4,3,2)

# Average of all array elements
print(stats.flatten().mean())

# Standard deviation
print(stats.flatten().std())
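Note that reductions such as mean and std already operate over all elements when no axis is given, so the flatten call (and the copy it makes) is optional. A short sketch comparing the two, plus a per-axis variant:

```python
import numpy as np

stats = np.arange(24).reshape(4, 3, 2)

# Same result with and without flattening
print(np.isclose(stats.mean(), stats.flatten().mean()))  # True
print(np.isclose(stats.std(), stats.flatten().std()))    # True

# Reducing along one axis keeps the remaining dimensions
print(stats.mean(axis=0).shape)  # (3, 2)
```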

Best Practices

Here are some recommendations for working with array reshaping, flattening and transposing:

Conclusion

In this guide, we looked at how to leverage NumPy’s reshaping, flattening and transposing to manipulate array dimensions for data analysis and modeling tasks in Python.

Key takeaways include:

With the foundation on how to reshape, flatten and transpose arrays in NumPy, you can leverage these techniques to structure and transform array data effectively for your Python projects.