An In-Depth Guide to NumPy ndarray Objects in Python

Updated: at 01:44 AM

NumPy’s ndarray is an N-dimensional array object that enables efficient numerical computing in Python. Because it is NumPy’s core data structure, understanding ndarrays is crucial for using the library effectively in scientific computing and data analysis. This guide provides an overview of NumPy ndarrays, including creation, attributes, indexing and slicing, element-wise operations, broadcasting, array manipulations, comparisons, input/output, and more.

Introduction to NumPy ndarrays

The NumPy ndarray (N-dimensional array object) is an efficient container for homogeneous data: every element shares a single data type. Arrays allow vectorized operations that are fast and concise compared with loops over Python lists and tuples.

Below is a simple example of creating a 1-dimensional NumPy array:

import numpy as np

arr = np.array([1, 2, 3])
print(arr)

# Output
[1 2 3]

NumPy arrays provide substantial performance and productivity benefits over Python lists when computing with numeric data.
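
As a rough illustration of that difference, the sketch below squares the same numbers once with a list comprehension and once with a single array expression. The variable names are made up for this example, and it shows only the difference in style, not a benchmark.

```python
import numpy as np

values = list(range(1, 6))          # plain Python list: [1, 2, 3, 4, 5]
numbers = np.array(values)

# List version: an explicit Python-level loop over every element
squared_list = [v * v for v in values]

# Array version: one vectorized expression, the loop runs in compiled code
squared_arr = numbers * numbers

print(squared_list)   # [1, 4, 9, 16, 25]
print(squared_arr)    # [ 1  4  9 16 25]
```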

Creating NumPy Arrays

There are several ways to create NumPy arrays:

From Python Lists

Convert Python lists and tuples directly into arrays:

import numpy as np

py_list = [1, 2, 3]
arr = np.array(py_list)

py_tuple = (4, 5, 6)
arr = np.array(py_tuple)

Multidimensional arrays can be created by passing nested Python sequences:

py_matrix = [[1, 2], [3, 4]]
arr = np.array(py_matrix)

# 2D array
print(arr)
[[1 2]
 [3 4]]

With NumPy Functions

Use NumPy functions like np.zeros, np.ones, np.full, np.arange, etc. to create arrays:

np.zeros(2) # 1D array of 2 zeros

np.ones((2, 3)) # 2D array with 2 x 3 ones

np.full((3, 2), 99) # 3x2 array filled with 99

np.arange(5) # 1D array of integers 0 through 4 (stop value excluded, like range)

np.linspace(0, 1, 5) # 1D array of 5 evenly spaced values from 0 to 1 (inclusive)
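
To make the results of these calls concrete, the following sketch stores each one in an illustrative variable (the names are assumptions, not part of the original snippets) and prints the shapes and values.

```python
import numpy as np

zeros = np.zeros(2)                # float64 by default
ones = np.ones((2, 3))
filled = np.full((3, 2), 99)
steps = np.arange(5)               # stop value 5 is excluded
grid = np.linspace(0, 1, 5)        # stop value 1 is included

print(zeros.shape, ones.shape, filled.shape)  # (2,) (2, 3) (3, 2)
print(steps)   # [0 1 2 3 4]
print(grid)    # [0.   0.25 0.5  0.75 1.  ]
```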

Reading Arrays From Disk

Build arrays from data in files using np.loadtxt, np.genfromtxt, etc:

arr = np.loadtxt('data.txt')

arr = np.genfromtxt('data.csv', delimiter=',')
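
Since data.txt and data.csv are not provided here, the sketch below uses in-memory text with made-up contents as stand-ins for those files. It shows one difference between the two readers: np.genfromtxt tolerates missing fields and fills them with nan.

```python
import io
import numpy as np

# In-memory stand-ins for data.txt and data.csv (contents are made up)
txt_file = io.StringIO("1 2 3\n4 5 6\n")
csv_file = io.StringIO("1.5,2.5,3.5\n4.5,,6.5\n")

from_txt = np.loadtxt(txt_file)                      # whitespace-delimited by default
from_csv = np.genfromtxt(csv_file, delimiter=",")    # the empty field becomes nan

print(from_txt.shape)            # (2, 3)
print(np.isnan(from_csv[1, 1]))  # True
```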

Array Attributes

NumPy arrays have various attributes that provide information about the data, such as shape, dtype, size, and ndim. For example:

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.shape) # (2, 3)
print(arr.dtype) # int64 (default integer type; int32 on some platforms)
print(arr.size) # 6
print(arr.ndim) # 2

Other attributes, such as itemsize (bytes per element) and nbytes (total bytes of array data), provide additional details.
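
A short sketch of those two attributes follows. The dtype is set explicitly to int32, an assumption made here so that the byte counts are the same on every platform.

```python
import numpy as np

# dtype is fixed explicitly so the byte counts do not depend on the platform
small = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

print(small.itemsize)  # 4  (bytes per int32 element)
print(small.nbytes)    # 24 (6 elements * 4 bytes)
print(small.nbytes == small.size * small.itemsize)  # True
```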

Array Indexing and Slicing

NumPy arrays can be indexed and sliced like Python lists, with the syntax extended to N dimensions:

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Indexing
print(arr[1, 2]) # 6

# Slicing
print(arr[:, 1]) # [2 5 8]
print(arr[1:3, :]) # 2D array of rows 1 and 2

Important! Basic slicing returns a view rather than a copy, so changes to the slice also modify the original array. Call .copy() on the slice when an independent copy is needed.
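
The view behavior can be checked directly. In this minimal sketch (variable names are illustrative), writing through a slice changes the source array, while writing to an explicit copy does not.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

row_view = grid[0, :]         # basic slice: shares memory with grid
row_view[0] = 100             # also changes grid[0, 0]
print(grid[0, 0])             # 100

row_copy = grid[1, :].copy()  # independent copy
row_copy[0] = -1              # grid is left untouched
print(grid[1, 0])             # 4

print(np.shares_memory(grid, row_view))  # True
print(np.shares_memory(grid, row_copy))  # False
```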

Arrays can also be indexed with a boolean mask built from a condition:

mask = arr > 5 # avoid the name "filter", which shadows a Python built-in
arr[mask] # [6 7 8 9]
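
Unlike basic slicing, selecting with a boolean mask returns a copy, while assigning through a mask modifies the array in place. The sketch below, with illustrative names, shows both cases.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
big = data > 5

selected = data[big]      # 1D copy of the matching elements
selected[0] = 0           # does not affect data
print(data[1, 2])         # 6

data[big] = 0             # assignment through the mask edits data in place
print(data)
# [[1 2 3]
#  [4 5 0]
#  [0 0 0]]
```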

Array Operations

NumPy makes array operations fast and convenient. Common mathematical operations are overloaded as vectorized element-wise operations:

arr = np.array([1, 2, 3])

arr + 2 # [3 4 5]
arr - 1 # [0 1 2]
arr * 10 # [10 20 30]
arr / 2 # [0.5 1.  1.5] (true division always returns floats)

Trigonometric, exponential, logarithmic, and other mathematical functions are provided as element-wise universal functions (ufuncs):

np.sin(arr)
np.log(arr)
np.abs(arr)

Matrix multiplication uses the @ operator:

matrix_a @ matrix_b
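
The operands matrix_a and matrix_b are not defined in the snippet above. The sketch below assumes two small 2x2 matrices with made-up values and contrasts the matrix product with the element-wise product.

```python
import numpy as np

# Hypothetical 2x2 operands; the names mirror the snippet above
matrix_a = np.array([[1, 2], [3, 4]])
matrix_b = np.array([[5, 6], [7, 8]])

product = matrix_a @ matrix_b      # matrix product (same as np.matmul)
elementwise = matrix_a * matrix_b  # element-wise product, for contrast

print(product)
# [[19 22]
#  [43 50]]
print(elementwise)
# [[ 5 12]
#  [21 32]]
```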

Benefits: no slow Python loops are needed. The work runs in compiled code over whole arrays.

Broadcasting

Broadcasting allows vectorized operations between arrays of different shapes. NumPy virtually stretches size-1 (or missing) dimensions so that the arrays “broadcast” to a common shape:

a = np.array([1, 2, 3]) # Shape (3,)

b = np.array([[10], [20], [30]]) # Shape (3, 1)

a + b # Shape (3, 3) with broadcasting

"""
[[11 12 13]
 [21 22 23]
 [31 32 33]]
"""

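Rules:

  1. Shapes are compared dimension by dimension from right to left (trailing dimensions first); an array with fewer dimensions is treated as if its shape were padded with 1s on the left.
  2. Two dimensions are compatible when they are equal or when one of them is 1; otherwise NumPy raises a ValueError.
  3. Size-1 dimensions are stretched virtually, so copies of the data are avoided where possible.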

Broadcasting prevents slow for-loops and enables fast vectorized calculations.
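
The shape rules can be checked with a few small arrays. This sketch uses illustrative names and shapes, and includes one incompatible pair to show the resulting ValueError.

```python
import numpy as np

row = np.array([1, 2, 3])              # shape (3,)
col = np.array([[10], [20], [30]])     # shape (3, 1)
table = np.ones((2, 3))                # shape (2, 3)

print((row + col).shape)    # (3, 3): (3,) is padded to (1, 3), then 1s stretch
print((row + table).shape)  # (2, 3): trailing dimensions match (3 == 3)

# Trailing dimensions 2 and 3 differ and neither is 1, so this fails
try:
    np.ones((3, 2)) + row
    compatible = True
except ValueError:
    compatible = False
print(compatible)           # False
```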

Array Manipulations

NumPy provides various manipulation methods like sorting, reshaping, joining, splitting, appending, etc:

arr = np.random.randint(10, size=6) # 1D array of 6 random integers in [0, 10)

arr.sort() # In-place sorting

arr = arr.reshape(2, 3) # Reshape to 2D, shape (2, 3)

arr = np.vstack([arr, arr]) # Stack vertically, shape (4, 3)

arr = np.hstack([arr, arr]) # Stack horizontally, shape (4, 6)

arr = np.append(arr, [11, 12]) # Flattens to 1D, then appends (26 elements)

lower, upper = np.split(arr, 2) # Split into 2 equal parts of 13 elements each

Other functions, such as np.concatenate, np.delete, and np.insert, provide more flexibility.
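
A brief sketch of these functions on a small, made-up array follows. It also shows np.split with a list argument, which splits at a given index rather than into equal parts.

```python
import numpy as np

base = np.array([1, 2, 3, 4])

joined = np.concatenate([base, [5, 6]])   # [1 2 3 4 5 6]
removed = np.delete(base, 1)              # drop index 1 -> [1 3 4]
inserted = np.insert(base, 2, 99)         # insert before index 2 -> [ 1  2 99  3  4]
head, tail = np.split(base, [2])          # split at index 2 -> [1 2] and [3 4]

# Each call returns a new array; base itself is unchanged
print(base)   # [1 2 3 4]
```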

Array Comparisons

Element-wise comparisons produce boolean arrays:

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 2, 3, 4])

arr1 == arr2 # [False  True  True  True]

arr1 < arr2 # [ True False False False]

arr1 != arr2 # [ True False False False]

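Comparison operators such as <, >, and == can be used directly on arrays.

Logical operators such as & (and) and | (or) are also overloaded for arrays and work element-wise on boolean arrays. Wrap each comparison in parentheses, because & and | bind more tightly than comparison operators.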
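
Combining conditions might look like the sketch below. The array and variable names are illustrative, and ~ (element-wise not) is included alongside & and | for completeness.

```python
import numpy as np

scores = np.array([1, 2, 3, 4, 5, 6])

in_range = (scores > 2) & (scores < 6)    # element-wise AND
at_edges = (scores == 1) | (scores == 6)  # element-wise OR
not_small = ~(scores < 3)                 # element-wise NOT

print(scores[in_range])    # [3 4 5]
print(scores[at_edges])    # [1 6]
print(scores[not_small])   # [3 4 5 6]
```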

Input and Output

Converting arrays to and from other formats:

To/from NumPy
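
One common route is NumPy's binary .npy format, written with np.save and read back with np.load. The file name and temporary directory in this sketch are assumptions chosen to keep it self-contained.

```python
import os
import tempfile
import numpy as np

original = np.arange(6).reshape(2, 3)

# Assumed file name; a temporary directory keeps the sketch self-contained
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.npy")
    np.save(path, original)        # binary .npy format keeps dtype and shape
    restored = np.load(path)

print(np.array_equal(original, restored))  # True
print(restored.dtype == original.dtype)    # True
```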

To Python
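
Converting back to built-in Python objects can be sketched with tolist for whole arrays and item for single elements. The array here is a made-up example.

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])

as_list = matrix.tolist()          # nested Python lists of Python ints
as_scalar = matrix[0, 1].item()    # single element as a Python int

print(as_list)                     # [[1, 2], [3, 4]]
print(type(as_scalar))             # <class 'int'>
```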

From CSV
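
For CSV text with a header row, np.loadtxt can skip the header with skiprows. The CSV content below is made up and held in memory as a stand-in for a real file.

```python
import io
import numpy as np

# Made-up CSV text with a header row, standing in for a file on disk
csv_text = "x,y\n1,10\n2,20\n3,30\n"

table = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)
x, y = table[:, 0], table[:, 1]

print(table.shape)   # (3, 2)
print(y)             # [10. 20. 30.]
```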

Display
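
How arrays are printed can be adjusted without changing the data. This sketch, using illustrative values, sets the precision temporarily with np.printoptions and builds a formatted string with np.array2string.

```python
import numpy as np

values = np.array([1 / 3, 2 / 3, 1.0])

print(values)                         # default: 8 digits of precision
with np.printoptions(precision=2):    # temporary setting, restored on exit
    print(values)                     # [0.33 0.67 1.  ]

text = np.array2string(values, precision=2, separator=", ")
print(text)                           # [0.33, 0.67, 1.  ]
```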

Advanced Topics

This provides an overview of ndarray basics. NumPy has many advanced features:

Summary

Key points about NumPy ndarrays:

ndarrays are essential for any Python programmer working with data, and NumPy is a must-have library for array programming. This guide covers the basics, but there is much more to explore!