...

Code Block

language	py

def convert_model(sym, arg_params, aux_params, target_dtype="float16", target_dtype_ops=None,
                  fp32_ops=None, widest_dtype_ops=None,
                  conditional_fp32_ops=None, excluded_sym_names=None):
    """API for converting a model from FP32 model to a mixed precision model.
    MXNet tries to convert the FP32 model to mixed precision model by adding
    cast layers using amp_cast and amp_multicast operators. The decision on
    which cast layer to add is based on hardcoded lists for Automatic Mixed Precision
    in MXNet. These lists can be overridden by the user by providing their own lists
    using: target_dtype_ops, fp32_ops, widest_dtype_ops, conditional_fp32_ops

    Parameters
    ----------
    sym : str or Symbol
        Defines the structure of a neural network for FP32 types.
    arg_params : dict
        Dictionary of name to `NDArray`.
    aux_params : dict
        Dictionary of name to `NDArray`.
    target_dtype : str
        Currently only supports float16. The target dtype indicates to add cast layers
        when possible so that lower precision computation can be leveraged.
    target_dtype_ops : list of strs
        Override the list of operator names casted to target_dtype.
        If None, uses the framework's default list to be casted to target dtype.
    fp32_ops : list of strs
        Override the lists of operator names casted to FP32.
        If None, uses the framework's default list to be casted to FP32.
    widest_dtype_ops : list of strs
        A list of op names provided by user which should run in widest precision among its inputs.
        If None, uses the framework's default list of widest_precision_ops.
    conditional_fp32_ops : list of (string, string, list of string)
        Override the list of operators to be casted to FP32.
        The format of the list is
        (name of the function, name of the parameter,
        list of values of the parameter that make the operator to be casted to
        fp32)
    excluded_sym_names : list of strs
        A list of strings that represent the names of symbols that users want to exclude
        from being converted to mixed precision.
    """

target_dtype should decide which lists need to be overridden.
For example, bfloat16 support may be added in the future, in which case lists for operators running in bfloat16 will also be added to AMP.
In that case, target_dtype will allow users to choose the right dtype for the mixed precision model.
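The list selection described above can be sketched in plain Python. This is a minimal illustration, not MXNet's implementation: `DEFAULT_LISTS`, `resolve_lists` and `cast_decision` are hypothetical names, the op names in the lists are placeholders, and the order in which the lists are checked is an assumption.

```python
# Hypothetical default lists, keyed by target dtype. The op names are
# placeholders for illustration, not MXNet's actual AMP lists.
DEFAULT_LISTS = {
    "float16": {
        "target_dtype_ops": ["Convolution", "FullyConnected"],
        "fp32_ops": ["softmax", "exp"],
        "widest_dtype_ops": ["elemwise_add", "Concat"],
    },
}


def resolve_lists(target_dtype="float16", target_dtype_ops=None,
                  fp32_ops=None, widest_dtype_ops=None):
    """Pick the default lists for target_dtype, then apply user overrides."""
    if target_dtype not in DEFAULT_LISTS:
        raise ValueError("unsupported target_dtype: %s" % target_dtype)
    # Copy the defaults so that overrides never mutate the shared lists.
    lists = {name: list(ops) for name, ops in DEFAULT_LISTS[target_dtype].items()}
    overrides = {
        "target_dtype_ops": target_dtype_ops,
        "fp32_ops": fp32_ops,
        "widest_dtype_ops": widest_dtype_ops,
    }
    for name, ops in overrides.items():
        if ops is not None:  # None means "keep the framework default"
            lists[name] = list(ops)
    return lists


def cast_decision(op_name, lists, target_dtype="float16"):
    """Return which cast layer (if any) would be added in front of op_name."""
    if op_name in lists["target_dtype_ops"]:
        return "amp_cast(%s)" % target_dtype
    if op_name in lists["fp32_ops"]:
        return "amp_cast(float32)"
    if op_name in lists["widest_dtype_ops"]:
        return "amp_multicast"
    return "no cast"


# Override only the target dtype list; the other lists keep their defaults.
lists = resolve_lists(target_dtype_ops=["FullyConnected"])
print(cast_decision("FullyConnected", lists))
print(cast_decision("Convolution", lists))
print(cast_decision("elemwise_add", lists))
```

Keying the defaults by target_dtype means that adding a new dtype such as bfloat16 would only require a new entry in the table, while the override logic stays the same.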

...

Code Block

language	py

def convert_block(block, target_dtype="float16", target_dtype_ops=None,
                  fp32_ops=None, widest_dtype_ops=None, conditional_fp32_ops=None,
                  excluded_sym_names=None, input_names=['data']):
    """Given a hybrid block/symbol block representing a neural network of data type FP32 and target_dtype,
    return a block with mixed precision support

    Parameters
    ----------
    block : HybridBlock or SymbolBlock object
        FP32 HybridBlock or SymbolBlock object
    target_dtype : str or numpy
        currently only supports float16. The target dtype indicates to add cast layers
        when possible so that lower precision computation can be leveraged.
    target_dtype_ops : list of strs
        Override the list of operator names casted to target_dtype.
        If None, uses the framework's default list to be casted to target dtype.
    fp32_ops : list of strs
        Override the lists of operator names casted to FP32.
        If None, uses the framework's default list to be casted to FP32.
    widest_dtype_ops : list of strs
        Override the list of operator names which should run in widest precision among its
        input arguments.
        If None, uses the framework's default list of widest_precision_ops.
    conditional_fp32_ops : list of (string, string, list of string)
        Override the list of functions casted to FP32.
        The format of the list is
        (name of the function, name of the parameter,
         list of values of the parameter that make the operator to be casted to
        fp32)
    excluded_sym_names : list of strs
        A list of strings that represent the names of symbols that users want to exclude
        from being converted to mixed precision.
    input_names : list of strs
        A list of strings representing the names of input variables
    """
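The tuple format of conditional_fp32_ops may be easier to follow with a small example. The sketch below is an illustration only: `runs_in_fp32` is a hypothetical helper, and the `("Activation", "act_type", ["softrelu"])` entry is an example value chosen here, not a statement about the framework's default list.

```python
def runs_in_fp32(op_name, op_attrs, conditional_fp32_ops):
    """Return True if an operator call matches a conditional FP32 entry.

    Each entry is (name of the function, name of the parameter,
    list of parameter values that force the operator to FP32).
    """
    for func_name, param_name, fp32_values in conditional_fp32_ops:
        if op_name == func_name and op_attrs.get(param_name) in fp32_values:
            return True
    return False


# Example entry (illustrative): Activation is forced to FP32 only when
# its act_type parameter is "softrelu".
conditional_fp32_ops = [("Activation", "act_type", ["softrelu"])]

print(runs_in_fp32("Activation", {"act_type": "softrelu"}, conditional_fp32_ops))
print(runs_in_fp32("Activation", {"act_type": "relu"}, conditional_fp32_ops))
```

The same operator can therefore end up in FP32 or not, depending on the value of a single parameter.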

...

Setup

EC2 Instance: p3.8xlarge

CUDNN: 7.4.2

CUDA: 10.0

Commit Hash: b3b952f9d5490ee2707209ab866e6c3f094e2046 (PoC changes were made on top of this commit, built from source)

...

imagenet1k-resnet-152: JSON File, Params File

Results

Model	Batch Size	Original Model (Samples/sec)	Mixed Precision Model (Samples/sec)	Original Model with Implicit Type Conversion (MXNET_CUDA_TENSOR_OP_MATH_ALLOW_CONVERSION=1) (Samples/sec)
imagenet1k-resnet-152	1	85	72	72
	2	140	140	142
	4	240	270	228
	8	320	470	261
	16	405	680	315
resnet50_v1	1	215	165	205
	2	370	330	365
	4	560	600	545
	8	760	980	635
	16	935	1400	790
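To read the table as speedups, the mixed precision throughput can be divided by the original throughput. The short script below does this for the smallest and largest batch sizes of each model, using only numbers copied from the table above; the variable names are arbitrary.

```python
# (model, batch size) -> (original, mixed precision) throughput in samples/sec,
# copied from the results table above.
throughput = {
    ("imagenet1k-resnet-152", 1): (85, 72),
    ("imagenet1k-resnet-152", 16): (405, 680),
    ("resnet50_v1", 1): (215, 165),
    ("resnet50_v1", 16): (935, 1400),
}

# Speedup of the mixed precision model relative to the original model.
speedups = {key: mixed / original for key, (original, mixed) in throughput.items()}

for (model, batch_size), ratio in sorted(speedups.items()):
    print("%s, batch size %d: %.2fx" % (model, batch_size, ratio))
```

At batch size 1 the ratio is below 1 for both models, while at batch size 16 it is roughly 1.68x for imagenet1k-resnet-152 and 1.50x for resnet50_v1.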

FAQ

Will the arg_params and aux_params be cast to FP16?

Depends on the whitelists provided. The default whitelists have been selected to avoid casting the params for commonly used convnets. If the whitelist is such that type inference decides that a certain param needs to be float16, then it will be cast. Inputs of ops that run in FP16 will be cast. Other params may or may not be cast, depending on the type inference logic.
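The rule that inputs of FP16 ops are cast can be pictured with a toy graph. This is a deliberately simplified sketch: `params_to_cast` is a hypothetical helper, the graph and names are made up, and it models only that single rule. The real type inference logic may cast further params, which this sketch does not attempt to reproduce.

```python
def params_to_cast(graph, fp16_ops, param_names):
    """Collect params that feed directly into an op running in FP16.

    graph is a list of (op name, list of input names) pairs.
    """
    cast = set()
    for op_name, inputs in graph:
        if op_name in fp16_ops:
            cast.update(name for name in inputs if name in param_names)
    return cast


# Toy two-layer graph (made up for illustration).
graph = [
    ("FullyConnected", ["data", "fc_weight", "fc_bias"]),
    ("softmax", ["fc_out"]),
]
param_names = {"fc_weight", "fc_bias"}

# With FullyConnected in the FP16 list, its params are candidates for casting.
print(sorted(params_to_cast(graph, {"FullyConnected"}, param_names)))
# With an empty FP16 list, no params are cast by this rule.
print(sorted(params_to_cast(graph, set(), param_names)))
```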

How is this different from casting inputs to FP16 and casting params to FP16 in Gluon?

...
