
Problem

Taking an MXNet model from training to deployment in production poses several challenges for users. The most important problems raised by users are:

  1. Input/output data transformations are not part of the MXNet model.
  2. The input/output signature is not part of the model: the saved model is missing descriptions of its inputs and outputs, such as names and shapes, making it unusable out of the box.
  3. File naming - the epoch number is part of the model file name, although this information is usually unnecessary for production deployment.

Proposed Solution

  1. In Gluon, update the export API to accept an input/output signature and pre-processing/post-processing transformations (a HybridSequential block) from the user.
  2. If the user provides transformations, a fused graph (Transformations + Batchify_Connector_Node + Network + No_Op_Connector_Node + Transformations) is prepared. Batchify_Connector_Node and No_Op_Connector_Node are internal MXNet identifier operators that mark the boundary between the transformations and the network graph. These identifier nodes are useful if, when loading the model, the user prefers not to load the transformations.
  3. The input/output signature is added to the symbol file.
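The connector-node idea in step 2 can be sketched as a split over a serialized node list. This is purely illustrative: the op names come from this proposal, and the flat (op, name) list format is a simplification of the real symbol graph.

```python
# Illustrative sketch only: splitting a fused graph's node list at the
# proposed connector ops so a loader can drop the transformations.
CONNECTORS = {"Batchify_Connector_Node", "No_Op_Connector_Node"}

def split_fused_graph(nodes):
    """Split a flat list of (op, name) nodes into segments at connector ops."""
    segments, current = [], []
    for op, name in nodes:
        if op in CONNECTORS:
            segments.append(current)
            current = []
        else:
            current.append((op, name))
    segments.append(current)
    return segments

fused = [
    ("Resize", "resize0"), ("Normalize", "norm0"),        # input transforms
    ("Batchify_Connector_Node", "batchify0"),
    ("Convolution", "conv0"), ("FullyConnected", "fc0"),  # network
    ("No_Op_Connector_Node", "noop0"),
]
transforms, network, output_transforms = split_fused_graph(fused)
```

A loader honoring `load_transforms=False` would keep only the middle segment (the network) and discard the rest.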

Example symbol file after the proposed update. "inputs" and "outputs" are the new fields proposed in this work.

{
    "nodes": [..........],
    "arg_nodes": [0, 1, 2, 4, 5],
    "node_row_ptr": [0, 1, 2, 3, 4, 5, 6, 7, 8],
    "heads": [[7, 0, 0]],
    "attrs": {"mxnet_version": ["int", 10301],
              "inputs": {"data": [1, 3, 224, 224]},
              "outputs": {"softmax_label": [1, 10]}
              }
}
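Given that format, a loader could recover the signature from the symbol file's "attrs" section with plain JSON parsing. The sketch below follows the example file above; `read_signature` is a hypothetical helper, not an actual MXNet API.

```python
import json

# Hypothetical sketch: reading the proposed "inputs"/"outputs" signature
# out of a symbol file's "attrs" section. Not an actual MXNet API.
symbol_json = """
{
    "arg_nodes": [0, 1, 2, 4, 5],
    "heads": [[7, 0, 0]],
    "attrs": {"mxnet_version": ["int", 10301],
              "inputs": {"data": [1, 3, 224, 224]},
              "outputs": {"softmax_label": [1, 10]}}
}
"""

def read_signature(symbol_text):
    """Return ({input_name: shape}, {output_name: shape}) from a symbol file."""
    attrs = json.loads(symbol_text).get("attrs", {})
    return attrs.get("inputs", {}), attrs.get("outputs", {})

inputs, outputs = read_signature(symbol_json)
# inputs  -> {"data": [1, 3, 224, 224]}
# outputs -> {"softmax_label": [1, 10]}
```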

Possible use cases

Inputs and outputs are dictionaries where the key is the name of the node and the value is its shape. Though single-input, single-output models are the most common, a model can have any of the following variations:

  1. Single input, single output
  2. Single input, multiple outputs
  3. Multiple inputs with the same or different transformations, single output
  4. Multiple inputs with the same or different transformations, multiple outputs
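For use case 3, the signature dictionaries might look like the hypothetical example below (the node names "image" and "text" and the `validate_signature` helper are illustrative assumptions, not part of the proposal):

```python
# Hypothetical signature for use case 3: two inputs with different
# transformations, one output. Keys are node names, values are shapes,
# matching the "inputs"/"outputs" fields proposed above.
inputs = {
    "image": [1, 3, 224, 224],  # goes through image transforms
    "text":  [1, 128],          # goes through a different transform
}
outputs = {"softmax_label": [1, 10]}

def validate_signature(sig):
    """Every entry must map a node name to a non-empty list of positive dims."""
    return all(
        isinstance(shape, list) and shape and all(d > 0 for d in shape)
        for shape in sig.values()
    )

assert validate_signature(inputs) and validate_signature(outputs)
```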

User Experience

Before

Step 1 - Train and export the model from Gluon

# Trained network
net = mx.gluon.model_zoo.vision.resnet18_v1(pretrained=True, ctx=mx.cpu())

# Data transformations applicable during inference
inference_input_transforms = gluon.nn.HybridSequential()
inference_input_transforms.add(transforms.Resize((224, 224)))
inference_input_transforms.add(transforms.ToTensor())
inference_input_transforms.add(transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)))

# Export the model. Cannot export data transformation and input/output signature
net.export(path="./my_model", epoch=0)

Step 2 - Import the model for inference with Module OR Gluon OR JAVA

Import the model in Gluon

# Load the model. Model does not contain transformation and input/output signature 
net = SymbolBlock.imports(symbol_file="my_model-symbol.json",
                          input_names=["data"],
                          param_file="my_model-0000.params",
                          ctx=mx.cpu())

Import the model in Module API

sym, arg_params, aux_params = mx.model.load_checkpoint('my_model', 0)
mod = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))], 
         label_shapes=mod._label_shapes)
mod.set_params(arg_params, aux_params, allow_missing=True)

mod.forward(...)

Import the model in Java Predictor API

Shape inputShape = new Shape(new int[] {1,3,224,224});
DataDesc inputDescriptor = new DataDesc("data", inputShape, DType.Float32(), "NCHW"); 
List<DataDesc> inputDescList = new ArrayList<DataDesc>();
inputDescList.add(inputDescriptor);
List<Context> context = new ArrayList<>();
context.add(Context.cpu()); 
String modelPathPrefix = "path-to-model";
Predictor predictor = new Predictor(modelPathPrefix, inputDescList, context);

List<NDArray> result = predictor.predictWithNDArray(inputNDArray);

After

Step 1 - Train and export the model from Gluon

# Trained network
net = mx.gluon.model_zoo.vision.resnet18_v1(pretrained=True, ctx=mx.cpu())

# Data transformations applicable during inference
inference_input_transforms = gluon.nn.HybridSequential()
inference_input_transforms.add(transforms.Resize((224, 224)))
inference_input_transforms.add(transforms.ToTensor())
inference_input_transforms.add(transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)))

# Export the model
gluon.contrib.utils.export(net, path="./my_model", 
                           epoch=0, 
                           signature={constants.INPUT_DESC:[("data", (1,3,224,224))],
                                      constants.OUTPUT_DESC:[("softmax_label", (1,10))]},
                           input_transforms={"data":inference_input_transforms},
                           output_transforms=None)

Step 2 - Import the model for inference with Module OR Gluon OR JAVA

Import the Model in Gluon

net = gluon.contrib.utils.imports(symbol_file="my_model-symbol.json",
                                  param_file="my_model-0000.params",
                                  load_transforms=True,
                                  ctx=mx.cpu())

Import the Model in Module API

(Supported for creating an inference-only module)

mod = mx.contrib.Module.imports(
                symbol_file="my_model-symbol.json",
                param_file="my_model-0000.params",
                load_transforms=True,
                ctx=mx.cpu(),
                batch_size=1)
mod.forward(...) 

Import the Model in Java Predictor API

List<Context> context = new ArrayList<>();
context.add(Context.cpu()); 
String modelPathPrefix = "my_model";
Predictor predictor = new Predictor(modelPathPrefix, context, true);  // loadTransforms = true

List<NDArray> result = predictor.predictWithNDArray(inputNDArray);

Backward compatibility

  1. All API changes are backward compatible.
  2. Old MXNet models will still load without breakage on new MXNet versions.
  3. New MXNet models will not work on old MXNet versions.
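Point 2 holds because the new "inputs"/"outputs" fields are optional: a loader that tolerates their absence handles old symbol files unchanged. A minimal sketch, assuming the attrs layout shown earlier (`load_signature` is a hypothetical helper, not MXNet code):

```python
import json

# Sketch of backward-compatible loading: old symbol files simply lack the
# new "inputs"/"outputs" attrs, so the loader falls back to None.
old_symbol = '{"attrs": {"mxnet_version": ["int", 10301]}}'
new_symbol = ('{"attrs": {"mxnet_version": ["int", 10301], '
              '"inputs": {"data": [1, 3, 224, 224]}}}')

def load_signature(symbol_text):
    attrs = json.loads(symbol_text).get("attrs", {})
    # Missing keys mean an old-format model; return None rather than fail.
    return attrs.get("inputs"), attrs.get("outputs")

assert load_signature(old_symbol) == (None, None)  # old model still loads
assert load_signature(new_symbol)[0] == {"data": [1, 3, 224, 224]}
```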

Alternate Solution

  1. Keep the transformation graph and the network graph separate and independent of each other, and fuse them at run time. The proposed approach fuses the transformations and the neural network and exports them as a single graph, introducing a no-op identifier operator to mark the link between the two. An alternative is to keep the transformation and network graphs separate (in the same symbol file or in multiple symbol files) and fuse these independent graphs at run time.

References

  1. Gluon-CV export helper - https://gluon-cv.mxnet.io/api/utils.html?highlight=export#gluoncv.utils.export_block
  2. MXNet image prediction tutorial (load_checkpoint) - https://mxnet.incubator.apache.org/versions/master/tutorials/python/predict_image.html?highlight=load_checkpoint
  3. Introducing Java APIs for deep learning inference with Apache MXNet - https://medium.com/apache-mxnet/introducing-java-apis-for-deep-learning-inference-with-apache-mxnet-8406a698fa5a