Problems

Taking an MXNet model from training to production deployment poses several challenges for users. The most important problems raised by users are:

  1. Input/Output data transformations are not part of the MXNet model.
  2. Input/Output signatures are not part of the model: the saved model is missing information about the input/output descriptions, such as names and shapes, making it unusable out of the box.
  3. File naming - the epoch number is part of the model file name. This information may not be necessary for production deployment.

Proposed Solution

  1. In Gluon, update the Export API to accept an input/output signature and pre-processing/post-processing transformations (Hybrid Sequential Blocks) from the user (a sketch of such a transformation pipeline is shown after this list).
  2. If the user provides transformations, a fused graph (Transformations + Batchify_Connector_Node + Network + NoOp_Connector_Node + Transformations) is prepared. Connector no-op nodes are MXNet-internal identifier operators that help identify the transformation and network graphs separately. These identifier nodes are useful when, at load time, the user prefers not to load the transformations.
  3. The input/output signature is added to the symbol file.
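
The pre-processing pipeline mentioned above could, for example, be built as a Hybrid Sequential Block of hybridizable transforms. This is a minimal sketch under that assumption; the specific transforms used are illustrative and not part of the proposal.

Code Block
languagepy
# Minimal sketch of a user-defined pre-processing pipeline; the specific
# transforms used here are illustrative assumptions.
from mxnet import gluon
from mxnet.gluon.data.vision import transforms

input_transforms = gluon.nn.HybridSequential()
input_transforms.add(transforms.ToTensor())    # HWC uint8 image -> CHW float32 in [0, 1]
input_transforms.add(transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                          std=(0.229, 0.224, 0.225)))
input_transforms.hybridize()                   # so the pipeline can be fused into the exported graph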

Example symbol file after the proposed update

Code Block
{
    "nodes": [..........],
    "arg_nodes": [0, 1, 2, 4, 5],
    "node_row_ptr": [0, 1, 2, 3, 4, 5, 6, 7, 8],
    "heads": [[7, 0, 0]],
    "attrs": {"mxnet_version": ["int", 10301],
              "inputs": {"data": [1, 3, 224, 224]},
              "outputs": {"softmax_label": [1, 10]}
             }
}
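
For illustration, a minimal sketch of how the embedded signature could be read back from the symbol file; the file name and attribute keys are taken from the example above.

Code Block
languagepy
import json

# Read the proposed input/output signature back from the symbol file.
# File name and attribute keys follow the example above.
with open("my_model-symbol.json") as f:
    symbol_json = json.load(f)

input_shapes = symbol_json["attrs"]["inputs"]    # e.g. {"data": [1, 3, 224, 224]}
output_shapes = symbol_json["attrs"]["outputs"]  # e.g. {"softmax_label": [1, 10]}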

Use cases

  1. Single input, single output
  2. Single input, multiple outputs
  3. Multiple inputs with the same or different transformations, single output
  4. Multiple inputs with the same or different transformations, multiple outputs (a possible multi-input signature is sketched below)
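
For the multiple-input use cases (3 and 4), the signature could simply list one description per input. A hedged sketch, reusing the constants from the Export API example below; the input names and shapes are illustrative assumptions.

Code Block
languagepy
# Hypothetical signature for two inputs and one output (use case 3);
# input names and shapes are illustrative assumptions.
signature = {constants.INPUT_DESC:  [("data0", (1, 3, 224, 224)),
                                     ("data1", (1, 3, 224, 224))],
             constants.OUTPUT_DESC: [("softmax_label", (1, 10))]}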

Export API - Gluon

  1. Since this is an update to the most commonly used Gluon API (export), we will first introduce the change as a new export API in gluon.contrib.

Before


Code Block
languagepy
net.export(path="./my_model", epoch=0)

After


Code Block
languagepy
## net => Hybrid Sequential Network
## input_transforms => Hybrid Sequential Transformation Pipeline

gluon.contrib.utils.export(net, path="./my_model",
                           epoch=0,
                           signature={constants.INPUT_DESC: [("data", (1,3,224,224))],
                                      constants.OUTPUT_DESC: [("softmax_label", (1,10))]},
                           input_transforms=input_transforms,
                           output_transforms=None)


Import API - Gluon

Before


Code Block
languagepy
SymbolBlock.imports(symbol_file="my_model-symbol.json",
                    input_names=["data"],
                    param_file="my_model-0000.params",
                    ctx='cpu')


After


Code Block
languagepy
gluon.contrib.utils.import(symbol_file="my_model-symbol.json",
                           param_file="my_model-0000.params",
                           load_transforms=True,
                           ctx='cpu')
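
A hedged usage sketch of the imported model: with load_transforms=True the returned block is assumed to contain the fused graph, so raw, untransformed input can be fed directly. The variable names and input shape below are illustrative.

Code Block
languagepy
import mxnet as mx

# 'net' is assumed to be the block returned by the proposed import call above,
# loaded with load_transforms=True so the exported pre-processing runs inside
# the graph. The raw image below is an illustrative placeholder.
raw_image = mx.nd.random.uniform(0, 255, shape=(224, 224, 3))
out = net(raw_image)   # pre-processing + network in a single forward pass
print(out.shape)       # expected to match the exported output signature, e.g. (1, 10)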


Import API - Module

Before


Code Block
languagepy
sym, arg_params, aux_params = mx.model.load_checkpoint('my_model', 0)
mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))],
         label_shapes=mod._label_shapes)
mod.set_params(arg_params, aux_params, allow_missing=True)

mod.forward(...)


After
Creating a module for inference only is supported.


Code Block
languagepy
mod = mx.contrib.Module.import(
                symbol_file="my_model-symbol.json",
                param_file="my_model-0000.params",
                load_transforms=True,
                ctx='cpu',
                batch_size=1)

mod.forward(...)
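
A hedged sketch of running inference with the imported module: forward takes a standard DataBatch; with load_transforms=True the module is assumed to accept raw input, and the input shape below is illustrative.

Code Block
languagepy
import mxnet as mx

# Illustrative raw input batch; with load_transforms=True the fused graph is
# assumed to apply the exported pre-processing before the network.
raw_batch = mx.io.DataBatch([mx.nd.random.uniform(0, 255, shape=(1, 224, 224, 3))])
mod.forward(raw_batch, is_train=False)
outputs = mod.get_outputs()   # e.g. a (1, 10) NDArray per the output signature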


Import API - Java predictor

Before


Code Block
languagejava
Shape inputShape = new Shape(new int[] {1, 3, 224, 224});
DataDesc inputDescriptor = new DataDesc("data", inputShape, DType.Float32(), "NCHW");
List<DataDesc> inputDescList = new ArrayList<DataDesc>();
inputDescList.add(inputDescriptor);

List<Context> context = new ArrayList<>();
context.add(Context.cpu());

String modelPathPrefix = "path-to-model";
Predictor predictor = new Predictor(modelPathPrefix, inputDescList, context);

List<NDArray> result = predictor.predictWithNDArray(inputNDArray);


After


Code Block
languagejava
List<Context> context = new ArrayList<>();
context.add(Context.cpu());

String modelPathPrefix = "my_model";
Predictor predictor = new Predictor(modelPathPrefix, context, load_transforms=True);

List<NDArray> result = predictor.predictWithNDArray(inputNDArray);


Backward compatibility

  1. All API changes are backward compatible.
  2. Old MXNet models should still load without breakage on newer MXNet versions.
  3. New MXNet models will not work on older MXNet versions.
