# Manipulate data the MXNet way with `ndarray`

It’s impossible to get anything done if we can’t manipulate data. This has two parts: loading data and processing data once it’s inside the computer. This notebook is about the latter. So let’s start by introducing NDArrays, MXNet’s primary tool for storing and transforming data. If you’ve worked with NumPy before, you’ll notice that NDArrays are, by design, similar to NumPy’s multi-dimensional array. However, they confer a few key advantages. First, NDArrays support asynchronous computation on CPU, GPU, and distributed cloud architectures. Second, they provide support for automatic differentiation. These properties make NDArray well suited to machine learning, both for researchers and for engineers launching production systems.

## Getting started

In this chapter, we’ll get you going with the basic functionality. Don’t worry if you don’t understand any of the basic math, like element-wise operations or normal distributions. In the next two chapters we’ll take another pass at NDArray, teaching you both the math you’ll need and how to realize it in code.

To get started, let’s import `mxnet`. We’ll also import `ndarray` from `mxnet` for convenience, under its short alias `nd`. We’ll make a habit of setting a random seed so that you always get the same results that we do.

```
import mxnet as mx
from mxnet import nd
mx.random.seed(1)
```

Next, let’s see how to create an NDArray, without any values
initialized. Specifically, we’ll create a 2D array (also called a
*matrix*) with 3 rows and 4 columns.

```
x = nd.empty((3, 4))
print(x)
```

```
[[ -1.42576390e+27 4.56641131e-41 1.92930952e-37 0.00000000e+00]
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
[ 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]]
<NDArray 3x4 @cpu(0)>
```

The `empty` function just grabs some memory and hands us back a matrix without setting the values of any of its entries. This means that the entries can hold arbitrary values, including very large ones! But typically, we’ll want our matrices initialized. Commonly, we want a matrix of all zeros.

```
x = nd.zeros((3, 5))
x
```

```
[[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]
<NDArray 3x5 @cpu(0)>
```

Similarly, `ndarray` has a function to create a matrix of all ones.

```
x = nd.ones((3, 4))
x
```

```
[[ 1. 1. 1. 1.]
[ 1. 1. 1. 1.]
[ 1. 1. 1. 1.]]
<NDArray 3x4 @cpu(0)>
```

Often, we’ll want to create arrays whose values are sampled randomly. This is especially common when we intend to use the array as a parameter in a neural network. In this snippet, we initialize with values drawn from a standard normal distribution with zero mean and unit variance.

```
y = nd.random_normal(0, 1, shape=(3, 4))
y
```

```
[[-0.67765152 0.03629481 0.10073948 -0.49024421]
[ 0.57595438 -0.95017916 -0.3469252 0.03751944]
[-0.22134334 -0.72984636 -1.80471897 -2.04010558]]
<NDArray 3x4 @cpu(0)>
```

As in NumPy, the dimensions of each NDArray are accessible via the `.shape` attribute.

```
y.shape
```

```
(3, 4)
```

We can also query its size, which is equal to the product of the components of the shape. Together with the precision of the stored values, this tells us how much memory the array occupies.

```
y.size
```

```
12
```
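
As a rough sketch of that memory estimate, the plain-Python snippet below multiplies the element count by the bytes per element. It assumes 32-bit floats (4 bytes each), which is MXNet’s default `dtype`. The names `bytes_per_element` and `num_bytes` are illustrative and not part of the NDArray API, and the figure ignores any bookkeeping overhead.

```python
from math import prod

shape = (3, 4)             # same shape as y above
size = prod(shape)         # number of elements, matches y.size
bytes_per_element = 4      # assumption: 32-bit floats
num_bytes = size * bytes_per_element
print(size, num_bytes)     # prints: 12 48
```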

## Operations

NDArray supports a large number of standard mathematical operations, such as element-wise addition:

```
x + y
```

```
[[ 0.32234848 1.03629482 1.10073948 0.50975579]
[ 1.57595444 0.04982084 0.6530748 1.03751945]
[ 0.77865666 0.27015364 -0.80471897 -1.04010558]]
<NDArray 3x4 @cpu(0)>
```

Element-wise multiplication:

```
x * y
```

```
[[-0.67765152 0.03629481 0.10073948 -0.49024421]
[ 0.57595438 -0.95017916 -0.3469252 0.03751944]
[-0.22134334 -0.72984636 -1.80471897 -2.04010558]]
<NDArray 3x4 @cpu(0)>
```

And exponentiation:

```
nd.exp(y)
```

```
[[ 0.50780815 1.03696156 1.1059885 0.61247683]
[ 1.77882743 0.38667175 0.70685822 1.03823221]
[ 0.80144149 0.48198304 0.16452068 0.13001499]]
<NDArray 3x4 @cpu(0)>
```
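
For readers new to the term, element-wise means that an operation is applied independently at each position. Here is a minimal plain-Python sketch of that idea, using nested lists in place of NDArrays and the first row of `y` copied from the printed values above. The helper `elementwise` is hypothetical and not an MXNet function.

```python
import math

def elementwise(op, a, b):
    """Apply a binary op to matching entries of two equally shaped 2D lists."""
    return [[op(p, q) for p, q in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]

x_row = [[1.0, 1.0, 1.0, 1.0]]                                # first row of x (all ones)
y_row = [[-0.67765152, 0.03629481, 0.10073948, -0.49024421]]  # first row of y

added = elementwise(lambda p, q: p + q, x_row, y_row)       # each entry of y plus 1
multiplied = elementwise(lambda p, q: p * q, x_row, y_row)  # equals y, since x is all ones
exponentiated = [[math.exp(v) for v in row] for row in y_row]
print(added, multiplied, exponentiated, sep="\n")
```

Because `x` holds only ones, the sum is `y` shifted by one and the product is `y` itself, which is what the outputs above show.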

We can also grab a matrix’s transpose to compute a proper matrix-matrix product.

```
nd.dot(x, y.T)
```

```
[[-1.03086138 -0.68363053 -4.79601431]
[-1.03086138 -0.68363053 -4.79601431]
[-1.03086138 -0.68363053 -4.79601431]]
<NDArray 3x3 @cpu(0)>
```
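
To see where these numbers come from, recall that entry (i, j) of `nd.dot(x, y.T)` is the sum over k of `x[i][k] * y[j][k]`. Because `x` is all ones here, every row of the result is simply the row sums of `y`. The plain-Python check below follows that reading, with `y` copied from the printed values above. The helper `dot_with_transpose` is a hypothetical name for illustration, not an MXNet function.

```python
x = [[1.0] * 4 for _ in range(3)]
y = [[-0.67765152, 0.03629481, 0.10073948, -0.49024421],
     [0.57595438, -0.95017916, -0.3469252, 0.03751944],
     [-0.22134334, -0.72984636, -1.80471897, -2.04010558]]

def dot_with_transpose(a, b):
    """Return a times b-transposed: entry (i, j) is sum_k a[i][k] * b[j][k]."""
    return [[sum(p * q for p, q in zip(row_a, row_b)) for row_b in b]
            for row_a in a]

result = dot_with_transpose(x, y)
for row in result:
    print([round(v, 4) for v in row])
```

Each printed row should agree with the NDArray output above up to floating-point rounding.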

We’ll explain these operations and present even more operators in the linear algebra chapter. But for now, we’ll stick with the mechanics of working with NDArrays.

## In-place operations

In the previous examples, every time we ran an operation, we allocated new memory to host its results. For example, if we write `y = x + y`, the name `y` stops referring to the matrix it used to point to and instead points at the newly allocated memory. We can show this using Python’s `id()` function, which returns the identity of the object a variable refers to.

```
print('id(y):', id(y))
y = y + x
print('id(y):', id(y))
```

```
id(y): 139962637796688
id(y): 139962656686544
```

We can assign the result to a previously allocated array with slice notation, e.g., `result[:] = ...`.

```
z = nd.zeros_like(x)
print('id(z):', id(z))
z[:] = x + y
print('id(z):', id(z))
```

```
id(z): 139962637938984
id(z): 139962637938984
```

However, `x + y` here will still allocate a temporary buffer to store the result before copying it to `z`. To make better use of memory, we can perform operations in place, avoiding temporary buffers. To do this, we specify the `out` keyword argument, which every operator supports:

```
nd.elemwise_add(x, y, out=z)
```

```
[[ 1.32234848 2.03629494 2.10073948 1.50975585]
[ 2.57595444 1.0498209 1.65307474 2.03751945]
[ 1.77865672 1.27015364 0.19528103 -0.04010558]]
<NDArray 3x4 @cpu(0)>
```

If we’re not planning to re-use `x`, then we can assign the result to `x` itself. There are two ways to do this in MXNet:

1. By using slice notation, `x[:] = x op y`
2. By using the op-equals operators like `+=`

```
print('id(x):', id(x))
x += y  # in-place update: x keeps the same identity
print('id(x):', id(x))
```

```
id(x): 139962637796520
id(x): 139962637796520
```
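
The same distinction between rebinding a name and writing into existing memory can be tried without MXNet. The sketch below uses NumPy as a stand-in, which is an assumption on our part: NumPy must be installed, and the `out=` argument of `np.add` plays the role that `out` plays in `nd.elemwise_add`. It is an analogy, not MXNet code, and the flag names are illustrative.

```python
import numpy as np

x = np.ones((3, 4), dtype=np.float32)
y = np.full((3, 4), 2.0, dtype=np.float32)

before = id(y)
y = y + x                  # allocates a new array and rebinds the name y
rebinding_changed_id = id(y) != before

z = np.zeros_like(x)
before = id(z)
z[:] = x + y               # writes into z, but x + y still creates a temporary
np.add(x, y, out=z)        # writes the sum directly into z, no temporary
slice_kept_id = id(z) == before

before = id(x)
x += y                     # in-place update of x
inplace_kept_id = id(x) == before

print(rebinding_changed_id, slice_kept_id, inplace_kept_id)  # prints: True True True
```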
