NumPy: Tiling and Repeating Arrays

Updated: at 02:26 AM

NumPy is a popular Python library used for scientific computing and working with multidimensional array data. It provides powerful tools to efficiently perform operations on arrays, including the ability to tile and repeat array elements in various ways. Tiling and repeating arrays are useful techniques in data analysis, machine learning, image processing, and other domains. This guide will provide an in-depth look at tiling and repeating arrays using NumPy.

Overview of Tiling and Repeating Arrays

In this guide, tiling refers to splitting an array into smaller sub-arrays or tiles, for example splitting a 2D array into quadrants or a 1D array into chunks of a specified size. (NumPy's own numpy.tile function does the opposite: it builds a larger array by repeating an input, and is covered under repeating arrays.) Tiling lets you operate on blocks of an array independently, which helps when a large array has to be processed one chunk at a time.

Repeating arrays involves creating copies of array elements, either by duplicating the entire array or repeating individual elements. This is useful for expanding datasets, creating patterned data, or reformatting array shapes.

NumPy provides several functions to tile and repeat NumPy arrays including numpy.tile(), numpy.repeat(), numpy.broadcast_to(), and using advanced indexing with NumPy arrays.

The sections below cover each approach in turn, followed by some common use cases.

Tiling Arrays

Tiling splits an array into smaller equal-sized blocks or tiles. The split is done along an axis, dividing the dimensions of the specified axis into tiles of a given size.

NumPy provides several approaches to tile arrays:

Using numpy.split

The numpy.split function divides an array into multiple sub-arrays along a specified axis.

import numpy as np

arr = np.arange(9).reshape(3,3)

print(arr)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

rows = np.split(arr, 3, axis=0)
print(rows[0])
# [[0 1 2]]

cols = np.split(arr, 3, axis=1)
print(cols[0])
# [[0]
#  [3]
#  [6]]

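The numpy.split function takes the array to split, the number of equal sections (or a list of indices at which to split), and the axis to split along. It returns a list of sub-arrays. If an integer section count does not divide the axis length evenly, numpy.split raises a ValueError. This provides a simple way to tile an array into equal blocks.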
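As a small illustration of the two ways to call numpy.split, the sketch below uses a 10-element array (an arbitrary choice for this example) to show that an uneven section count is rejected, while a list of split indices is accepted.

```python
import numpy as np

arr = np.arange(10)

# An integer section count must divide the axis length exactly.
try:
    np.split(arr, 3)
    uneven_ok = True
except ValueError:
    uneven_ok = False

# A list of indices splits at those positions instead: [0:3], [3:7], [7:].
parts = np.split(arr, [3, 7])
part_lengths = [len(p) for p in parts]

print(uneven_ok, part_lengths)  # False [3, 4, 3]
```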

Using numpy.array_split

The numpy.array_split function works like numpy.split but does not require the axis length to be evenly divisible by the number of tiles:

import numpy as np

arr = np.arange(9).reshape(3, 3)

tiles = np.array_split(arr, 4)
print(tiles[0])
# [[0 1 2]]

tiles = np.array_split(arr, 4, axis=1)
print(tiles[0])
# [[0]
#  [3]
#  [6]]

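numpy.array_split divides the axis as evenly as possible: when the length is not divisible by the number of tiles, the leading tiles get one extra element. Here the axis has length 3, so asking for 4 tiles yields three tiles of length 1 and an empty fourth tile.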
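To make the sizing rule concrete, here is a minimal sketch that inspects the tile sizes numpy.array_split produces. The array lengths are arbitrary values picked for illustration.

```python
import numpy as np

arr = np.arange(10)

# 10 elements into 3 tiles: the first tile gets the extra element.
chunks = np.array_split(arr, 3)
sizes = [c.size for c in chunks]
print(sizes)  # [4, 3, 3]

# Asking for more tiles than rows leaves the trailing tile empty.
grid = np.arange(9).reshape(3, 3)
shapes = [t.shape for t in np.array_split(grid, 4)]
print(shapes)  # [(1, 3), (1, 3), (1, 3), (0, 3)]
```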

Using numpy.vsplit and numpy.hsplit

NumPy has convenience functions numpy.vsplit and numpy.hsplit to split along the vertical (axis 0) and horizontal (axis 1) directions respectively:

import numpy as np

arr = np.arange(16).reshape(4, 4)

rows = np.vsplit(arr, 2)
cols = np.hsplit(arr, 2)

This splits the 4x4 array into two 2x4 row blocks (rows) and two 4x2 column blocks (cols).
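The two functions can also be combined. The following sketch, one possible approach rather than the only one, splits the rows first and then the columns of each row band to obtain a 2x2 grid of 2x2 blocks.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)

# Split into two row bands, then split each band into two column blocks.
blocks = [np.hsplit(band, 2) for band in np.vsplit(arr, 2)]

top_left = blocks[0][0]
bottom_right = blocks[1][1]

print(top_left)
# [[0 1]
#  [4 5]]
print(bottom_right)
# [[10 11]
#  [14 15]]
```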

Using Advanced Indexing

NumPy slicing (strictly speaking basic rather than advanced indexing) provides another way to tile arrays by selecting ranges along each axis:

import numpy as np

arr = np.arange(16).reshape(4, 4)

# Tile into 4 quadrants (each 2x2)
quad1 = arr[:2, :2]   # top left
quad2 = arr[:2, 2:]   # top right
quad3 = arr[2:, :2]   # bottom left
quad4 = arr[2:, 2:]   # bottom right

# Tile into 4 columns (each 4x1)
col1 = arr[:, 0:1]
col2 = arr[:, 1:2]
col3 = arr[:, 2:3]
col4 = arr[:, 3:4]

This splits the array explicitly into sections by selecting slices. Slicing gives you complete control over the tile size and the axes to split on, and each slice is a view of the original array rather than a copy.
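Because slices are views, writing to a tile also changes the source array. The sketch below demonstrates this on a small example array and shows that calling .copy() gives an independent tile when that is what you need.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)

# A slice shares memory with the array it came from.
quad = arr[:2, :2]
shares = np.shares_memory(arr, quad)

# Writing through the view modifies the original array.
quad[0, 0] = 99
changed = arr[0, 0]

# An explicit copy is independent of the original.
independent = arr[2:, 2:].copy()
independent[0, 0] = -1
untouched = arr[2, 2]

print(shares, changed, untouched)  # True 99 10
```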

Using numpy.reshape

The numpy.reshape function can be used to tile arrays by reshaping them into higher dimensional versions:

import numpy as np

arr = np.arange(9)

tiled = np.reshape(arr, (3, 3))
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

Reshaping creates tiles by allocating elements from the flattened array into chunks along the new dimensions. This provides a simple tiling mechanism for 1D and 2D arrays.
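Reshaping can also produce 2D blocks, not just rows. As a sketch of one common idiom (assuming the array dimensions are exact multiples of the block size), a 4x4 array can be viewed as a 2x2 grid of 2x2 tiles by reshaping and swapping the middle axes.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)

# Axes after reshape: (block_row, row_in_block, block_col, col_in_block).
# Swapping axes 1 and 2 groups the two block indices first.
blocks = arr.reshape(2, 2, 2, 2).swapaxes(1, 2)

top_right = blocks[0, 1]
bottom_left = blocks[1, 0]

print(top_right)
# [[2 3]
#  [6 7]]
print(bottom_left)
# [[ 8  9]
#  [12 13]]
```

Here blocks[i, j] is the tile at block row i and block column j.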

Tiling Multidimensional Arrays

Tiling higher dimensional arrays follows similar patterns but with splitting or indexing along multiple axes:

import numpy as np

arr_3d = np.arange(8).reshape(2, 2, 2)

# Split along first axis into 2 tiles
tiles = np.array_split(arr_3d, 2)

# Split along second axis into 2 tiles
tiles = np.array_split(arr_3d, 2, axis=1)

# Using slicing on the last two axes
quad1 = arr_3d[:, :1, :1]
quad2 = arr_3d[:, 1:, :1]
quad3 = arr_3d[:, :1, 1:]
quad4 = arr_3d[:, 1:, 1:]

The core tiling techniques like numpy.split, numpy.reshape, and indexing/slicing extend to higher dimensions enabling multidimensional tiling.

Repeating Arrays

NumPy provides a few approaches to repeat or duplicate array elements:

Using numpy.repeat

The numpy.repeat function duplicates individual elements in an array. You specify the number of repetitions for each element:

import numpy as np

arr = np.arange(3)

repeated = np.repeat(arr, 3)
# [0 0 0 1 1 1 2 2 2]

This repeats each element in the 1D array 3 times.

numpy.repeat also works on higher-dimensional arrays:

arr_2d = np.array([[1, 2], [3, 4]])

repeated = np.repeat(arr_2d, 2, axis=1)
# [[1 1 2 2]
#  [3 3 4 4]]

Here it repeats the elements along the second axis, so each original column appears twice in a row.
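numpy.repeat also accepts a separate count per element, and it flattens the input when no axis is given. The values below are arbitrary and serve only to illustrate these behaviours.

```python
import numpy as np

arr = np.array([10, 20, 30])

# One repetition count per element; a count of 0 drops that element.
per_element = np.repeat(arr, [1, 0, 2])
print(per_element)  # [10 30 30]

arr_2d = np.array([[1, 2], [3, 4]])

# Without an axis, the input is flattened before repeating.
flat = np.repeat(arr_2d, 2)
print(flat)  # [1 1 2 2 3 3 4 4]

# With axis=0, whole rows are repeated.
rows = np.repeat(arr_2d, 2, axis=0)
print(rows)
# [[1 2]
#  [1 2]
#  [3 4]
#  [3 4]]
```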

Using numpy.tile

The numpy.tile function creates copies of the entire array. You specify a single repetition factor and it duplicates the whole array:

import numpy as np

arr = np.arange(3)

tiled = np.tile(arr, 2)
# [0 1 2 0 1 2]

For 2D arrays, you can provide a repetition factor for each axis:

arr_2d = np.array([[1, 2], [3, 4]])

tiled = np.tile(arr_2d, (2, 3))
# [[1 2 1 2 1 2]
#  [3 4 3 4 3 4]
#  [1 2 1 2 1 2]
#  [3 4 3 4 3 4]]

This repeats the whole array 2 times along axis 0 (rows) and 3 times along axis 1 (columns).
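When the repetition tuple is longer than the array's number of dimensions, numpy.tile adds leading axes. The short sketch below shows this and contrasts the ordering produced by numpy.tile with that of numpy.repeat.

```python
import numpy as np

arr = np.arange(3)

# reps has two entries, so the 1D input is treated as shape (1, 3)
# and stacked twice along the new first axis.
stacked = np.tile(arr, (2, 1))
print(stacked)
# [[0 1 2]
#  [0 1 2]]

# tile repeats the whole sequence; repeat repeats each element in place.
tiled = np.tile(arr, 2)
repeated = np.repeat(arr, 2)
print(tiled)     # [0 1 2 0 1 2]
print(repeated)  # [0 0 1 1 2 2]
```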

Using numpy.broadcast_to

The numpy.broadcast_to function expands an array to a new shape by virtually repeating it along new or length-1 axes, following NumPy's broadcasting rules:

import numpy as np

arr = np.arange(4)

bcast = np.broadcast_to(arr, (4, 4))
# [[0 1 2 3]
#  [0 1 2 3]
#  [0 1 2 3]
#  [0 1 2 3]]

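The rows appear repeated to fill the new shape (4, 4), but the result is a read-only view of the original data, so no elements are actually copied. numpy.broadcast_to is useful for expanding arrays to a compatible shape without the memory cost of real duplication.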
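The view semantics can be checked directly. This sketch inspects the flags and strides of a broadcast result and shows that a writable array requires an explicit copy.

```python
import numpy as np

arr = np.arange(4)
bcast = np.broadcast_to(arr, (4, 4))

# The result is a read-only view that shares memory with the input.
is_writeable = bcast.flags.writeable
shares = np.shares_memory(arr, bcast)

# A stride of 0 along axis 0 means every row points at the same data.
row_stride = bcast.strides[0]

# Writing fails; take a copy when a writable array is needed.
try:
    bcast[0, 0] = 5
    write_failed = False
except ValueError:
    write_failed = True

writable = bcast.copy()
writable[0, 0] = 5

print(is_writeable, shares, row_stride, write_failed)  # False True 0 True
```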

Using numpy.resize

The numpy.resize function returns a new array with the specified shape and repeats elements to fill it:

import numpy as np

arr = np.arange(6)

arr_resized = np.resize(arr, (3, 3))
# [[0 1 2]
#  [3 4 5]
#  [0 1 2]]

The flattened original is repeated as many times as needed, in row-major order, to fill the new shape.
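The same fill rule applies whether the target is larger or smaller than the input. In the sketch below (shapes chosen only for illustration), a larger target wraps around to the start of the data, and a smaller target simply truncates it.

```python
import numpy as np

arr = np.arange(6)

# Larger target: the data wraps around once the 6 values are used up.
larger = np.resize(arr, (2, 5))
print(larger)
# [[0 1 2 3 4]
#  [5 0 1 2 3]]

# Smaller target: only the first 4 values are kept.
smaller = np.resize(arr, (2, 2))
print(smaller)
# [[0 1]
#  [2 3]]
```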

Repeating Elements with Indexing

Array elements can also be repeated using NumPy indexing by constructing index arrays that duplicate indices:

import numpy as np

arr = np.arange(3)
indices = np.array([0, 0, 1, 2, 2, 2])

repeated = arr[indices]
# [0 0 1 2 2 2]

This selects elements from arr based on the duplicated index values in indices, resulting in repeated elements.
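The index array does not have to be written by hand. As a sketch, it can be generated from per-element counts with numpy.repeat, and the same idea extends to repeating whole rows of a 2D array. The variable names here are illustrative only.

```python
import numpy as np

arr = np.array([10, 20, 30])
counts = [2, 1, 3]

# Build the duplicated index array from the counts.
indices = np.repeat(np.arange(arr.size), counts)
print(indices)  # [0 0 1 2 2 2]

# Indexing with it matches calling np.repeat on the values directly.
via_index = arr[indices]
via_repeat = np.repeat(arr, counts)
print(via_index)  # [10 10 20 30 30 30]

# Repeating rows of a 2D array with a list of row indices.
table = np.array([[1, 2], [3, 4]])
rows = table[[0, 0, 1]]
print(rows)
# [[1 2]
#  [1 2]
#  [3 4]]
```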

Examples and Applications

Let’s look at some examples applying these techniques for tiling and repeating arrays with NumPy.

Tiling Images

Tiling images into blocks is useful for segmentation and processing parts of an image independently:

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

img = np.array(Image.open('image.jpg'))

# Tile into 9 blocks
tiles = np.array_split(img, 3, axis=0)
tiles = [np.array_split(x, 3, axis=1) for x in tiles]

# Display first tile
plt.imshow(tiles[0][0])
plt.title("Top-Left Tile")
plt.show()

Tiling via splitting enables operating on blocks of the image separately, like applying filters or detections to each tile.
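The following sketch shows what per-tile processing might look like without needing an image file. It assumes a synthetic random RGB array in place of a real image, computes the mean brightness of each tile as a stand-in for a real filter, and then stitches the tiles back together to confirm the split is lossless.

```python
import numpy as np

# Synthetic RGB image standing in for a real file (an assumption for this sketch).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(90, 120, 3), dtype=np.uint8)

# 3x3 grid of tiles: split rows into bands, then split each band into columns.
tiles = [np.array_split(band, 3, axis=1) for band in np.array_split(img, 3, axis=0)]

# Per-tile statistic, here the mean brightness of each tile.
tile_means = np.array([[t.mean() for t in row] for row in tiles])

# Stitch the tiles back together to confirm nothing was lost.
restored = np.concatenate([np.concatenate(row, axis=1) for row in tiles], axis=0)

print(tiles[0][0].shape, tile_means.shape, np.array_equal(restored, img))
# (30, 40, 3) (3, 3) True
```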

Repeating Array Elements

Repeating elements is useful for data augmentation. This example duplicates elements in a 1D array:

import numpy as np

data = np.array([1.5, 2.8, -3.4, 0.2])

# Repeat each element 3 times
aug_data = np.repeat(data, 3)

print(aug_data)
# [ 1.5  1.5  1.5  2.8  2.8  2.8 -3.4 -3.4 -3.4  0.2  0.2  0.2]

Duplicating data like this increases the number of samples, for example to oversample rare cases when training machine learning models, but it adds no new information by itself.

Tiling Multidimensional Arrays

Multidimensional data like time-series images or spectral data cubes can be tiled for block processing:

import numpy as np

vol_data = np.random.rand(10, 200, 300) # Time x W x H

# Split time axis into chunks
tiles = np.array_split(vol_data, 2)

print(tiles[0].shape)
# (5, 200, 300)

This tiles the 3D volume into smaller 3D chunks for parallel processing.
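As a sketch of chunk-wise processing, the example below accumulates a sum over each chunk and combines the partial results into a global mean. It runs serially on a small random volume (both are simplifications for illustration), but the same pattern applies when each chunk is handled by a separate worker.

```python
import numpy as np

rng = np.random.default_rng(0)
vol_data = rng.random((10, 20, 30))  # small stand-in for a larger volume

# Split the first axis into 4 chunks; sizes are as even as possible.
chunks = np.array_split(vol_data, 4)
chunk_lengths = [c.shape[0] for c in chunks]
print(chunk_lengths)  # [3, 3, 2, 2]

# Reduce each chunk independently, then combine the partial results.
partial_sums = [c.sum() for c in chunks]
chunked_mean = sum(partial_sums) / vol_data.size

matches = np.isclose(chunked_mean, vol_data.mean())
print(matches)  # True
```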

Repeating Array as Column Vector

This example turns a 1D array into a column vector and repeats that column to build a (3, 3) array:

import numpy as np

arr = np.array([1, 2, 3])

# Add a new axis to get a (3, 1) column, then repeat it 3 times along axis 1
arr_col = np.repeat(arr[:, np.newaxis], 3, axis=1)

print(arr_col)
# [[1 1 1]
#  [2 2 2]
#  [3 3 3]]

numpy.repeat repeats the column along the second axis, so every column of the result is a copy of the original array.
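For comparison, here is a sketch of three ways to build a (3, 3) array whose columns each hold the original values. The first two allocate new memory, while numpy.broadcast_to returns a read-only view.

```python
import numpy as np

arr = np.array([1, 2, 3])
col = arr[:, np.newaxis]  # shape (3, 1)

by_repeat = np.repeat(col, 3, axis=1)   # copies each element 3 times per row
by_tile = np.tile(col, (1, 3))          # copies the whole column 3 times
by_view = np.broadcast_to(col, (3, 3))  # read-only view, no data copied

print(by_repeat)
# [[1 1 1]
#  [2 2 2]
#  [3 3 3]]

same = np.array_equal(by_repeat, by_tile) and np.array_equal(by_tile, by_view)
print(same)  # True
```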

Performance Considerations

When working with large arrays in NumPy that don’t fit in memory, it is often better to use tiling approaches to process chunks of the array independently. This prevents slow disk swapping or thrashing.

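One performance consideration for tiling is that numpy.split, numpy.array_split, and slicing return views of the original array, so creating the tiles does not copy any data.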

For repeating arrays, take advantage of NumPy’s vectorized operations instead of slower Python loops. Use numpy.repeat, numpy.tile, or numpy.broadcast_to rather than manual repetition for better performance.

Also be mindful of memory usage when duplicating array elements to avoid exceeding available RAM.
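In many arithmetic cases the duplication can be skipped entirely, because NumPy broadcasts operands automatically. The sketch below, using small arbitrary arrays, compares an explicit numpy.tile version with the equivalent broadcast expression.

```python
import numpy as np

row = np.arange(3)                        # shape (3,)
col = np.arange(4)[:, np.newaxis] * 10    # shape (4, 1)

# Explicit duplication: both operands are materialized as (4, 3) arrays.
explicit = np.tile(row, (4, 1)) + np.tile(col, (1, 3))

# Implicit broadcasting: same result with no intermediate copies.
implicit = row + col

print(implicit.shape, np.array_equal(explicit, implicit))  # (4, 3) True
print(implicit[3])  # [30 31 32]
```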

Conclusion

This guide covered several techniques to tile and repeat arrays using NumPy, including splitting, advanced indexing, reshaping, repeating elements, and duplicating arrays.

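Key concepts include splitting arrays into blocks with numpy.split and numpy.array_split, selecting tiles with slicing, repeating individual elements with numpy.repeat, and duplicating whole arrays with numpy.tile.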

Tiling and repeating arrays with NumPy provides powerful, vectorized tools for preparing and manipulating array data for analysis and modeling. Check the NumPy documentation for more details on these functions.