Problems

Taking an MXNet model from training to production deployment poses several challenges. The problems users raise most often are:

Input/output data transformations are not part of the MXNet model.
Input/output signatures are not part of the model: the saved model lacks input/output descriptions such as names and shapes, which makes it unusable out of the box.
File naming: the epoch number is part of the model file name. This information may not be necessary for production deployment.
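
The file-naming problem can be illustrated with a small sketch. It follows the naming pattern visible in the examples on this page (my_model-symbol.json, my_model-0000.params); the helper name checkpoint_file_names is hypothetical and is not part of MXNet.

```python
def checkpoint_file_names(prefix, epoch):
    """Return (symbol_file, param_file) following the '<prefix>-symbol.json'
    and '<prefix>-<4-digit epoch>.params' pattern used on this page."""
    symbol_file = "%s-symbol.json" % prefix
    param_file = "%s-%04d.params" % (prefix, epoch)
    return symbol_file, param_file


# The epoch is baked into the parameter file name, so a deployment has to
# know which epoch was exported before it can locate the parameters.
print(checkpoint_file_names("my_model", 0))
```

A serving system that only knows the model prefix cannot find the parameter file without also being told the epoch.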

Proposed Solution

In Gluon, update the Export API to accept an input/output signature and pre-processing/post-processing transformations (Hybrid Sequential Blocks) from the user.
If the user provides transformations, a fused graph (Transformations + Batchify_Connector_Node + Network + No_Op_Connector_Node + Transformations) is prepared. Batchify_Connector_Node and No_Op_Connector_Node are MXNet-internal identifier operators that mark the boundaries between the transformation graphs and the network graph. These identifier nodes are useful when the user prefers not to load the transformations while loading the model.
The input/output signature is added to the symbol file.
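
The role of the connector nodes can be sketched in plain Python. This is a toy model, not MXNet code: the fused graph is simplified to a linear list of operator names (a real graph is a DAG), the function split_fused_graph is hypothetical, and every operator name other than the two connector nodes is made up for illustration.

```python
BATCHIFY = "Batchify_Connector_Node"
NO_OP = "No_Op_Connector_Node"


def split_fused_graph(ops):
    """Split a linear list of operator names into
    (input_transforms, network, output_transforms) at the connector nodes."""
    start = ops.index(BATCHIFY)  # raises ValueError if the connector is missing
    end = ops.index(NO_OP)
    if end < start:
        raise ValueError("connector nodes are out of order")
    return ops[:start], ops[start + 1:end], ops[end + 1:]


# Toy fused graph: pre-processing, connector, network, connector, post-processing.
fused = ["resize", "to_tensor", BATCHIFY, "conv", "dense", "softmax", NO_OP, "argmax"]
pre, network, post = split_fused_graph(fused)

# With load_transforms=False only the network part is kept.
load_transforms = False
ops_to_run = fused if load_transforms else network
print(ops_to_run)
```

Because the connector nodes delimit the network, a loader can drop the transformations without knowing anything else about them.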

Example symbol file after the proposed update. "inputs" and "outputs" are the new parameters proposed in this work.

{
    "nodes": [..........],
    "arg_nodes": [0, 1, 2, 4, 5],
    "node_row_ptr": [0, 1, 2, 3, 4, 5, 6, 7, 8],
    "heads": [[7, 0, 0]],
    "attrs": {"mxnet_version": ["int", 10301],
              "inputs": {"data": [1, 3, 224, 224]},
              "outputs": {"softmax_label": [1, 10]}
              }
}
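
A minimal sketch of how a loader might read the proposed signature back from the symbol file, using only the Python standard library. It assumes the "inputs" and "outputs" entries sit under "attrs" as in the example; the function read_signature is hypothetical, and the "nodes" list is left empty here as a placeholder because the example elides it.

```python
import json

# Sample symbol file content; "nodes" is an empty placeholder.
SYMBOL_JSON = """
{
    "nodes": [],
    "arg_nodes": [0, 1, 2, 4, 5],
    "node_row_ptr": [0, 1, 2, 3, 4, 5, 6, 7, 8],
    "heads": [[7, 0, 0]],
    "attrs": {"mxnet_version": ["int", 10301],
              "inputs": {"data": [1, 3, 224, 224]},
              "outputs": {"softmax_label": [1, 10]}}
}
"""


def read_signature(symbol_json_text):
    """Return (inputs, outputs) as dicts mapping name -> shape tuple."""
    attrs = json.loads(symbol_json_text)["attrs"]
    inputs = {name: tuple(shape) for name, shape in attrs["inputs"].items()}
    outputs = {name: tuple(shape) for name, shape in attrs["outputs"].items()}
    return inputs, outputs


inputs, outputs = read_signature(SYMBOL_JSON)
print(inputs, outputs)
```

With the signature stored this way, the input names and shapes no longer need to be supplied by the caller at load time.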

Use cases

Single input, single output
Single input, multiple outputs
Multiple input with same or different transformation, single output
Multiple input with same or different transformation, multiple output

Export API - Gluon

Since this updates Export, the most commonly used API in Gluon, we will first introduce the change as a new Export API in gluon.contrib.

Before

net.export(path="./my_model", epoch=0)

After

# net => Hybrid Sequential Network
# input_transforms => Hybrid Sequential Transformation Pipeline
gluon.contrib.utils.export(net, path="./my_model",
                           epoch=0,
                           signature={constants.INPUT_DESC: [("data", (1, 3, 224, 224))],
                                      constants.OUTPUT_DESC: [("softmax_label", (1, 10))]},
                           input_transforms={"data": input_transforms},
                           output_transforms=None)
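
Since transformations are keyed by input or output name, an export implementation would probably want to check them against the signature. The sketch below shows one way to do that and also covers the multiple-input use case. It is an assumption, not the proposed implementation: check_transforms is a hypothetical helper, and the string keys are stand-ins for constants.INPUT_DESC and constants.OUTPUT_DESC, whose actual values are not given on this page.

```python
INPUT_DESC = "input_desc"    # hypothetical stand-in for constants.INPUT_DESC
OUTPUT_DESC = "output_desc"  # hypothetical stand-in for constants.OUTPUT_DESC


def check_transforms(signature, input_transforms=None, output_transforms=None):
    """Raise ValueError if a transform is keyed by a name the signature does not declare."""
    declared_inputs = {name for name, _shape in signature[INPUT_DESC]}
    declared_outputs = {name for name, _shape in signature[OUTPUT_DESC]}
    unknown = set(input_transforms or {}) - declared_inputs
    unknown |= set(output_transforms or {}) - declared_outputs
    if unknown:
        raise ValueError("transforms given for undeclared names: %s" % sorted(unknown))


# Two inputs, only one of which has a transformation attached.
signature = {
    INPUT_DESC: [("image", (1, 3, 224, 224)), ("mask", (1, 1, 224, 224))],
    OUTPUT_DESC: [("softmax_label", (1, 10))],
}
check_transforms(signature, input_transforms={"image": object()})  # passes silently
```

A transform registered under a name that is missing from the signature, for example {"imge": ...}, is rejected at export time instead of surfacing as a failure during inference.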

Import API - Gluon

Before

SymbolBlock.imports(symbol_file="my_model-symbol.json",
                    input_names=["data"],
                    param_file="my_model-0000.params",
                    ctx='cpu')

After

gluon.contrib.utils.import(symbol_file="my_model-symbol.json",
                           param_file="my_model-0000.params",
                           load_transforms = True,
                           ctx = 'cpu')

Import API - Module

Before

sym, arg_params, aux_params = mx.model.load_checkpoint('my_model', 0)
mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))], 
         label_shapes=mod._label_shapes)
mod.set_params(arg_params, aux_params, allow_missing=True)

mod.forward(...)

After
Only creating a module for inference is supported.

mod = mx.contrib.Module.import(
                symbol_file = "my_model-symbol.json",
                param_file = "my_model-0000.params",
                load_transforms = True,
                ctx = 'cpu',
                batch_size = 1)
mod.forward(...)
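
The batch_size argument suggests that the data shapes passed to bind in the "Before" snippet could be derived from the stored signature. A minimal sketch of that derivation follows, assuming the leading dimension of every stored input shape is the batch dimension; the helper data_shapes_for_batch is hypothetical.

```python
def data_shapes_for_batch(inputs, batch_size):
    """Build Module-style data_shapes from a name -> shape mapping,
    replacing the leading (batch) dimension with batch_size."""
    return [(name, (batch_size,) + tuple(shape[1:])) for name, shape in inputs.items()]


shapes = data_shapes_for_batch({"data": [1, 3, 224, 224]}, batch_size=8)
print(shapes)
```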

Import API - Java predictor

Before

Shape inputShape = new Shape(new int[] {1,3,224,224});
DataDesc inputDescriptor = new DataDesc("data", inputShape, DType.Float32(), "NCHW"); 
List<DataDesc> inputDescList = new ArrayList<DataDesc>();
inputDescList.add(inputDescriptor);
List<Context> context = new ArrayList<>();
context.add(Context.cpu()); 
String modelPathPrefix = "path-to-model";
Predictor predictor = new Predictor(modelPathPrefix, inputDescList, context);

List<NDArray> result = predictor.predictWithNDArray(inputNDArray);

After

List<Context> context = new ArrayList<>();
context.add(Context.cpu()); 
String modelPathPrefix = "my_model";
Predictor predictor = new Predictor(modelPathPrefix, context, /* load_transforms = */ true);

List<NDArray> result = predictor.predictWithNDArray(inputNDArray);

Backward compatibility

All API changes are backward compatible.
Models saved with old MXNet versions should still load without breakage in the new MXNet version.
Models saved with the new MXNet version will not work on old MXNet versions.
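
One way a loader could stay compatible with old symbol files is to fall back to caller-supplied input names when no signature is stored. The sketch below illustrates that idea with the standard library only. It assumes the signature lives under "attrs"; resolve_input_names is a hypothetical helper.

```python
import json


def resolve_input_names(symbol_json_text, user_input_names=None):
    """Prefer the stored signature; fall back to names passed by the caller."""
    attrs = json.loads(symbol_json_text).get("attrs", {})
    stored = attrs.get("inputs")
    if stored is not None:
        return list(stored)
    if user_input_names is None:
        raise ValueError("symbol file has no stored signature; pass input names explicitly")
    return list(user_input_names)


# Old file: no signature, so the caller supplies the names as before.
old_file = '{"attrs": {"mxnet_version": ["int", 10301]}}'
# New file: the signature is stored, so no names are needed.
new_file = '{"attrs": {"mxnet_version": ["int", 10301], "inputs": {"data": [1, 3, 224, 224]}}}'

print(resolve_input_names(old_file, ["data"]))
print(resolve_input_names(new_file))
```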

Alternate Solution

Keep the transformation graph and the network graph independent of each other and fuse them at run time. In the proposed approach, the transformations and the neural network are fused and exported as a single graph, with identifier operators marking the link between them. The alternative is to store the transformation and network graphs separately, either in the same symbol file or in multiple symbol files, and to fuse these independent graphs at run time.
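
Run-time fusion can be pictured as plain function composition. The sketch below is a toy illustration with ordinary Python callables standing in for the separately stored graphs; fuse_at_runtime and the toy functions are hypothetical.

```python
def fuse_at_runtime(network, input_transform=None, output_transform=None):
    """Compose optional transforms around a network at load time."""
    def pipeline(x):
        if input_transform is not None:
            x = input_transform(x)
        y = network(x)
        if output_transform is not None:
            y = output_transform(y)
        return y
    return pipeline


# Toy stand-ins: subtract a mean of 10, "network" sums the values,
# post-processing wraps the result in a dict.
subtract_mean = lambda pixels: [p - 10 for p in pixels]
wrap_score = lambda y: {"score": y}

predict = fuse_at_runtime(sum, input_transform=subtract_mean, output_transform=wrap_score)
print(predict([10, 20, 30]))
```

In this variant, skipping the transformations amounts to not passing them to the composition step, so no connector nodes are needed inside the graph.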
