At this point we’ve already covered quite a lot of ground. We know how to manipulate data and labels. We know how to construct flexible models capable of expressing plausible hypotheses. We know how to fit those models to our dataset. We know of loss functions to use for classification and for regression, and we know how to minimize those losses with respect to our models’ parameters. We even know how to write our own neural network layers in gluon.

But even with all this knowledge, we’re not ready to build a real machine learning system. That’s because we haven’t yet covered how to save and load models. In reality, we often train a model on one device and then want to run it to make predictions on many devices simultaneously. In order for our models to persist beyond the execution of a single Python script, we need mechanisms to save and load NDArrays, gluon Parameters, and models themselves.

In [1]:

from __future__ import print_function
import mxnet as mx
from mxnet import nd, gluon
ctx = mx.cpu()
# ctx = mx.gpu()


To start, let’s show how you can save and load a list of NDArrays for future use. Note that while it’s possible to use a general Python serialization package like pickle, it’s not optimized for use with NDArrays and will be unnecessarily slow. We prefer to use nd.save and nd.load.
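For contrast, here is a sketch of the generic pickle route on plain Python data. (With NDArrays you would additionally pay for a conversion to and from NumPy; the file name below is just an illustrative choice.)

```python
import os
import pickle

os.makedirs('checkpoints', exist_ok=True)

# Generic Python serialization: works on almost any object,
# but has no NDArray-specific optimizations.
data = {"X": [[1.0] * 100 for _ in range(100)],
        "Y": [[0.0] * 100 for _ in range(100)]}

with open("checkpoints/test_pickle.pkl", "wb") as f:
    pickle.dump(data, f)

with open("checkpoints/test_pickle.pkl", "rb") as f:
    restored = pickle.load(f)

print(restored["X"][0][0], restored["Y"][0][0])  # 1.0 0.0
```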

In [2]:

X = nd.ones((100, 100))
Y = nd.zeros((100, 100))
import os
os.makedirs('checkpoints', exist_ok=True)
filename = "checkpoints/test1.params"
nd.save(filename, [X, Y])


It’s just as easy to load a saved NDArray.

In [3]:

A, B = nd.load(filename)
print(A)
print(B)


[[ 1.  1.  1. ...,  1.  1.  1.]
[ 1.  1.  1. ...,  1.  1.  1.]
[ 1.  1.  1. ...,  1.  1.  1.]
...,
[ 1.  1.  1. ...,  1.  1.  1.]
[ 1.  1.  1. ...,  1.  1.  1.]
[ 1.  1.  1. ...,  1.  1.  1.]]
<NDArray 100x100 @cpu(0)>

[[ 0.  0.  0. ...,  0.  0.  0.]
[ 0.  0.  0. ...,  0.  0.  0.]
[ 0.  0.  0. ...,  0.  0.  0.]
...,
[ 0.  0.  0. ...,  0.  0.  0.]
[ 0.  0.  0. ...,  0.  0.  0.]
[ 0.  0.  0. ...,  0.  0.  0.]]
<NDArray 100x100 @cpu(0)>


We can also save a dictionary where the keys are strings and the values are NDArrays.

In [4]:

mydict = {"X": X, "Y": Y}
filename = "checkpoints/test2.params"
nd.save(filename, mydict)

In [5]:

C = nd.load(filename)
print(C)

{'X':
[[ 1.  1.  1. ...,  1.  1.  1.]
[ 1.  1.  1. ...,  1.  1.  1.]
[ 1.  1.  1. ...,  1.  1.  1.]
...,
[ 1.  1.  1. ...,  1.  1.  1.]
[ 1.  1.  1. ...,  1.  1.  1.]
[ 1.  1.  1. ...,  1.  1.  1.]]
<NDArray 100x100 @cpu(0)>, 'Y':
[[ 0.  0.  0. ...,  0.  0.  0.]
[ 0.  0.  0. ...,  0.  0.  0.]
[ 0.  0.  0. ...,  0.  0.  0.]
...,
[ 0.  0.  0. ...,  0.  0.  0.]
[ 0.  0.  0. ...,  0.  0.  0.]
[ 0.  0.  0. ...,  0.  0.  0.]]
<NDArray 100x100 @cpu(0)>}


## Saving and loading the parameters of gluon models

Recall from our first look at the plumbing behind gluon blocks that gluon wraps the NDArrays corresponding to model parameters in Parameter objects. We’ll often want to store and load an entire model’s parameters without having to extract or load each NDArray individually from the Parameters via ParameterDicts in each block.

Fortunately, gluon blocks make our lives very easy by providing .save_params() and .load_params() methods. To see them in action, let’s spin up a simple MLP.

In [6]:

num_hidden = 256
num_outputs = 1
net = gluon.nn.Sequential()
with net.name_scope():
    net.add(gluon.nn.Dense(num_hidden, activation="relu"))
    net.add(gluon.nn.Dense(num_hidden, activation="relu"))
    net.add(gluon.nn.Dense(num_outputs))

Now, let’s initialize the parameters by attaching an initializer and actually passing in a datapoint to induce shape inference.

In [7]:

net.collect_params().initialize(mx.init.Normal(sigma=1.), ctx=ctx)
net(nd.ones((1, 100), ctx=ctx))

Out[7]:


[[ 362.53265381]]
<NDArray 1x1 @cpu(0)>


So this randomly initialized model maps a 100-dimensional vector of all ones to the number 362.53 (that’s the number on my machine; your mileage may vary). Let’s save the parameters, instantiate a new network, load them in, and make sure that we get the same result.

In [8]:

filename = "checkpoints/testnet.params"
net.save_params(filename)
net2 = gluon.nn.Sequential()
with net2.name_scope():
    net2.add(gluon.nn.Dense(num_hidden, activation="relu"))
    net2.add(gluon.nn.Dense(num_hidden, activation="relu"))
    net2.add(gluon.nn.Dense(num_outputs))
net2.load_params(filename, ctx=ctx)
net2(nd.ones((1, 100), ctx=ctx))

Out[8]: