New features

...

MXNet Extensions: custom operators, partitioning, and graph passes

Adds support for extending MXNet with custom operators, partitioning strategies, and graph passes. These are implemented in a library that is compiled separately from the MXNet codebase and loaded dynamically at runtime into any prebuilt installation of MXNet.

fix for number of inputs/outputs for backward custom ops (#17069)
Enhancements for custom subgraph op (#17194)
Disable flaky test_custom_op_fork (#17481)
fix custom op makefile (#17516)
Update CustomOp doc with changes for GPU support (#17486)
[WIP] MXNet Extensions enhancements (#17885) (#18128)
Dynamic subgraph property (#17034)
Dynamic subgraph property doc (#17585)
[1.7] Backport MXNet Extension PRs (#17623, #17569, #17762) #18063 (#18069)
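
The runtime loading described above can be sketched as follows. This is a minimal illustration, not the project's reference usage: the helper name `load_extension` and the library file name are hypothetical, and it assumes the extension has already been compiled into a shared library. The only MXNet call used is `mx.library.load`, which expects an absolute path.

```python
import os


def load_extension(lib_path):
    """Load a separately compiled MXNet extension library at runtime.

    `lib_path` is a hypothetical path to a shared library (.so or .dll)
    built outside the MXNet source tree.
    """
    # mx.library.load expects an absolute path, so normalize first.
    path = os.path.abspath(lib_path)
    if not os.path.isfile(path):
        raise FileNotFoundError("extension library not found: " + path)

    # Imported lazily so the path check above works even where MXNet
    # is not installed.
    import mxnet as mx

    # Makes whatever the library exports (custom operators, partitioners,
    # graph passes) available in the running MXNet process.
    mx.library.load(path)
    return path
```

A call would look like `load_extension("libmyext.so")`, where `libmyext.so` stands in for whatever file your own build produces. The existence check runs before MXNet is imported so that a wrong path fails early with a plain Python exception.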

OpPerf utility enabled in the binary distribution

[OpPerf] Add Neural network loss ops (#17482)
[OpPerf] Fixes the issue when you pass NDArray to run_perf_test (#17508)
[OpPerf] Fix markdown for native profile and add profile param in function desc (#17494)
[OpPerf] Add Indexing ops (#16253)
[OpPerf] Implement remaining random sampling ops (#17502)
[OpPerf] Implement remaining GEMM ops (#17501)
[OpPerf] Implement all linalg ops (#17528)
[OpPerf] Fixed native output ordering, added warmup & runs command line args (#17571)
[OpPerf] Add norm, cast ops, remaining optimizer ops (#17542)
[Large Tensor] Fixed Embedding op (#17599)
[OpPerf] Fixed Python profiler bug (#17642)

MKL-DNN

MKL-DNN as the default CPU backend in binary distribution

Branding change to DNNL

Upgrade MKL-DNN dependency to v1.1 (#16823)

...

[New Op] Add deformable conv v2 (#16341)
Add MXNet Ops for fast multihead attention (#16408)
Support boolean elemwise/broadcast binary add, multiply and true_divide (#16728)
add gammaln, erf, erfinv (#16811)
add aligned roi introduced in Detectron2 (#16619)
Implement atleast_1d/2d/3d (#17099)
Interleaved MHA for CPU path (#17138)
Lamb optimizer update (#16715)
Quantized Embedding (#16691)
Add gelu fuse ops (#18082) (#18092)

Feature improvements

NumPy-compatible interface (experimental)

[NumPy] NumPy support for linalg.inv (#16730)
add numpy op nan_to_num (#16717)
[Numpy] Add sampling method for bernoulli (#16638)
Fix numpy-compatible mean output type for integer inputs (#16792)
[Numpy] Fix collect_params().zero_grad() in gluon numpy interface (#16716)
[Numpy][Operator] 'where' Implementation in MXNet (#16829)
[Numpy] Random.normal() with backward (#16330)
Add OP diag [numpy] (#16786)
Mixed precision binary op backward (use in) for numpy (#16791)
add numpy op diagflat [numpy] (#16813)
add op bitwise_or [numpy] (#16801)
[Numpy] Implementation npx.{sample}_n (#16876)
[Numpy] Add NumPy support for np.linalg.det and np.linalg.slogdet (#16800)
Op Unravel_index PR [Numpy] (#16862)
[Numpy] Fix imperative basic indexing in numpy (#16902)
[Numpy] Basic indexing in symbolic interface of DeepNumpy (#16621)
[Numpy] add op full_like, c++ impl, fix zeros_like, ones_like type inference (#16804)
[Numpy] Implement numpy operator 'average' (#16720)
[Bugfix] [Numpy] Add `kAddTo` and kNullOp to Transpose (#16979)
set rtol = 1e-2 and atol = 1e-4 when dtype == np.float32 in test_numpy_op.py:test_np_linalg_solve (#17025)
Op_Diagonal [Numpy] (#16989)
numpy bincount (#16965)
[numpy] add op bitwise_not (#16947)
[Numpy] Modify np.random.shuffle to enable inplace by default (#17133)
[numpy] fix argsort typo (#17150)
[numpy] add op round (#17175)
[numpy]Add op delete (#17023)
[numpy] add op flipud, fliplr (#17192)
[CI] Re-enable testing with numpy 1.18 (#17200)
[Numpy] Add broadcast_to scalar case (#17233)
[Numpy] Random.gamma() implemented (#16152)
[Numpy] add row_stack (=vstack) (#17171)
[Numpy] Add infra for performing constraint check (#17272)
porting numpy-compatible hstack to master and add dstack for interoperability (#17030)
adding asnumpy() to output of gather(implicitly called) to fix gather test in large vector and tensor tests (#17290)
[numpy] add op random.exponential (#17280)
[NumPy] Add NumPy support for norm (#17014)
[numpy]add op random.lognormal (#17415)
Add numpy random weibull operator (#17505)
[numpy] Add np.random.pareto and np.random.power (#17517)
[Numpy] Add sort op (#17393)
[numpy]implement exponential backward (#17401)
[Numpy] Where operator scalar version (#17249)
[numpy] add op matmul (#16990)
[numpy]add op random.logistic, random.gumbel (#17302)
[numpy][Do Not Review]add op insert (#16865)
[numpy] add op random.rayleigh (#17541)
[numpy] add fallback ops (#17609)
[numpy] add op pad (#17328)
[numpy] add op fabs, sometrue, round_ (#17619)
Add arange_like to npx (#16883)
try to move shape_array to npx (#16897)
support np.argsort (#16949)
np.broadcast_to extension (#17358)
support bitwise_and (#16861)
fix np.argmax/argmin output data type (#17476)
add op random.beta (#17390)
add op isnan isinf (#17535)
array_split pr (#17032)
Mixed data type binary ops (#16699)
randn implemented (#17141)
refactor and reduce float types for some functions, also add bitwise_xor (#16827)
any/all (#17087)
amax (#17176)
fix format (#17100)
add op empty_like, add nan_to_num to dispatch (#17169)
handle array_like fill_value for np.full; add unit test coverage (#17245)
add np.amin (#17538)
add npx.gather_nd (#17477)
add np.random.chisquare (#17524)
add polyval (#17416)
add isposinf isneginf isfinite (#17563)
Support broadcast assign for `npi_boolean_mask_assign_tensor` (#17131)
Implement Weibull backward (#17590)
support np.dsplit, fix some error msgs and corner cases for hsplit and vsplit, add interoperability tests for h/v/dsplit (#17478)
add np.product (#17489)
Implement np.random.pareto backward (#17607)
add np.ediff1d (#17624)
more support for boolean indexing and assign (#18352)
Fix einsum gradient (#18482)
[v1.7.x] Backport PRs of numpy features (#18653)
[v1.7.x] backport mixed type binary ops to v1.7.x (#18649)
revise activations (#18700)

Large tensor support

[Large Tensor] Add support to Random Sample & Pdf ops (#17445)
[Large Tensor] Add LT support for NN optimizers and 1 activation function (#17444)
[Large Tensor] Fixed SoftmaxActivation op (#17634)
[Large Tensor] Fixed col2im op (#17622)
[Large Tensor] Fixed Spatial Transformer op (#17617)
[Large Tensor] Fix ravel_multi_index op (#17644)
Sparse int64 Large tensor support (#16898)
Re-Enabling Large Tensor Nightly on GPU (#16164)
enabling build stage gpu_int64 to enable large tensor nightly runs (#17546)

MKL-DNN enhancement

MKLDNN FC : Add error info when mkldnn fc bias dimension is wrong (#16692)
[MKLDNN] support mkldnn gelu (#16710)
[MKLDNN] Fix int8 convolution/fc bias overflow (#16734)
[MKLDNN] use dim_t instead of int in slice/transpose operators (#16737)
Mkldnn fullyConnect bwd bug fix (#16890)
Revert Mkldnn fullyConnect bwd bug fix (#16890) (#16907)
[MKLDNN] Use MKLDNNRun (#16772)
[MKLDNN] mkldnn RNN operator enhancement (#17075)
[MKLDNN] enable MaxPooling with full pooling convention (#16860)
update mkldnn to v1.1.2 (#17165)
improve mkldnn doc (#17198)
[MKLDNN] Fix _copyto (#17173)
[MKLDNN] Support channel wise quantization for FullyConnected (#17187)
fixed seed for mkldnn test (#17386)
add mkldnn softmax backward (#17170)
cmake: copy dnnl headers to include/mkldnn (#17647)
[mkldnn]Mkldnn bn opt backport from master to 1.7x (#18009)
[v1.x] Update 3rdparty/mkldnn remote URL and pin to v1.3 (#17972) (#18033)
[v1.x] backport #17900 [MKLDNN] support using any format in pooling backward (#18067)
Static link MKL-DNN library (#16731)
Add large tensor nightly tests for MKL-DNN operators (#16184)
[MKL-DNN] Enable and Optimization for s8 eltwise_add (#16931)
[MKL-DNN] Enhance Quantization Method (#17161)
Static Build and CD for mxnet-cu102/mxnet-cu102mkl (#17074)
MKL-DNN RNN backward path enhancement (#17183)
cmake: check USE_OPENMP and pass proper MKL-DNN build flags (#17356)
update mkl to 2020.0 (#17355)
Enable MKL-DNN by default in pip packages (#16899)
Enable MKL-DNN FullyConnected backward (#17318)
Softmax primitive cache and in-place computation (#17152)
boolean_mask_assign with start_axis (#16886)
use identity_with_cast (#16913)
change error tolerance for bf16 bn (#18110)
[v1.x] Backport #17689 and #17884 to v1.x branch (#18064)
refactor codes and add an option to skip/check weight's version to reduce overhead (#17707) (#18039)
[v1.x] Backport #17702 and #17872 to v1.x branch (#18038)

...

[BUG FIX] Always preserve batch dimension in batches returned from dataloader (#16233)
Fix SliceChannel Type inference (#16748)
Open files with encoding=utf-8 in _generate_op_module_signature and get_module_file to fix encoding errors on Chinese Windows systems (#16738)
Fix rtrue_divide grad (#16769)
fix inv test flakiness using random matrices generated by SVD (#16782)
[MXNET-1426] Fix the wrong result of sum, mean, argmin, argmax when inputs contain inf or nan (#16234)
Fix (#16781)
fix expand_dims fall back when input's ndim is 0 (#16837)
[fix] missing input log higher order. (#15331)
Fix IndentationError in setup.py (#16857)
Fix a few np issues (#16849)
Fix InferAttr/InferShapeAttr not calling inference for all nodes in a graph (#16836)
fix for enable model parallelism for non-fp32 data (#16683)
Fix NDArrayIter iteration bug when last_batch_handle='pad' (#16166)
Fix crashing on Windows in ObjectPool ~ctor (#16941)
Fix NDArrayIter can't pad when size is large (#17001)
fix axis=-1 bug (#17016)
Fix CUDNN detection for CMake build (#17019)
Fix omp assert issue (#17039)
mshadow: fix vector access (#17021)
[BUGFIX] Fix race condition in kvstore.pushpull (#17007)
[BUGFIX] Fix trainer param order (#17068)
[BugFix] fix filter channel calculation in ModulatedDeformableConvV2 (#17070)
Fix reshape interoperability test (#17155)
fix norm sparse fallback (#17149)
fix py27 quantization (#17153)
fix int8 add ut (#17166)
Fix and clean up Ubuntu build from source instructions (#17229)
fix lstm layer with projection save params (#17266)
Fix rendering of ubuntu_setup.md codeblocks (#17294)
Fix #17267, add expected and got datatype for concat error msgs (#17271)
[BUGFIX] fix model zoo parallel download (#17372)
fix use int8, uint8, int32, int64 (#17188)
[Fix] Add ctx to the original ndarray and revise the usage of context to ctx (#16819)
Fix ndarray indexing bug (#16895)
fix requantize flaky test (#16709)
Initial checkin (#16856)
Fix flakey test_ndarray.py:test_reduce (#17312)
fix flaky test: boolean index and fix bugs (#17222)
Fix IOT Devices section of Get Started page (#17326)
add logic for no batch size while getting data arrays from executors (#17772) (#18122)
Fix reverse shape inference in LayerNorm (#17683)
fix full and full_like when input is boolean (#17668)
Fix MBCC inference (#17660)
Additional fix for vector access. (#17230)
Cherrypick Fix nightly large_vector test caused by incorrect with_seed path (#18178) (#18220)
[1.7] Pass args fix3 (#18237)
fixing batch_norm and layer_norm for large tensors (#17805) (#18261)
[1.7.x] Backport of LSTM and GRU fix (#17898) and RNN op (#17632) (#18316)
[v1.7.x] backport #18500 - [Bug Fixed] Fix batch norm when grad_req is `add` (#18517)
Fix the monitor_callback invalid issue during calibration with variable input shapes (#18632) (#18703)

Front end API

Fix the problem in printing feature in c++ API examples : feature_extract (#15686)
updating MXNet version to 1.6.0 in base.h for C APIs (#16905)
[API] unified API for custom kvstores (#17010)
fix parameter names in the estimator api (#17051)
adding docs for 64bit C APIs of large tensor (#17309)
Add API docs to INT64 APIs (#16617)

...

support mixed-precision true_divide (#16711)
Try to fix CI (#16908)
mixed precision for power (#16859)
Fix desired precision for test_ndarray.py:test_reduce (#16992)
[reproducibility] multi_sum_sq review, AtomicAdd removal (#17002)
fix precision problem in linalg_solve, linalg_tensorinv, linalg_cholesky op test (#16981)
grouping large array tests based on type and updating nightly CI function (#17305)
[LICENSE] fix cpp predict license (#17377)
[CI] Fix static build pipeline (#17474)
skipping tests that cannot fit in nightly CI machine corrected imports (#17450)
Update Windows CI scripts to use syntax compatible with Win 2019 server powershell. (#17526)
Fix Non-ASCII character in docstring (#17600)
[CI] Follow redirects when downloading apache-maven-3.3.9-bin.tar.gz (#17608)
[CI] Upgrade sphinx and autodocsumm (#17594)
Reduce load on CI due to excessive log flood (#17629)
Enable users to specify BLAS (#17648)
[CI] Add AMI id to instance info on builds (#17649)
[v1.7.x] Backport staggered CI builds (#17999 & #18119) (#18142)
[v1.7.x] Backport #17177 to 1.7.x (Fix incorrect calculation results when the C locale is set to a locale that uses commas as the decimal separator) (#18147)
Fix formatting and typos in CD README.md (#16703)
[CD] dynamic libmxnet pipeline fix + small fixes (#16966)
[CD] enable s3 publish for nightly builds in cd (#17112)
[CD] fix CD pipeline (#17259)
[CD] update publish path (#17453)
fix CD and remove leftover from #15990 (#17551)
Fix nightly build (#16773)
Update pypi_publish.py to disable nightly build upload to PyPI (#17082)
[v1.7.x] update jetson dockerfile to support CUDA 10.0 (#18339)
Remove manually created symbolic link to ninja-build (#18437) (#18456)
Increase staggered build timeout to 180 min (#18568) (#18585)

License

Don't relicense FindCUDAToolkit.cmake (#17334)
fix license and copyright issues (#17364)
Update ps-lite LICENSE (#17351)
remove unused file with license issue (#17371)
Update LICENSE for fonts (#17365)
license np_einsum file under bsd (#17367)
Update Apache License for mshadow (#18109) (#18134)
Julia: remove downloading of the non-ASF binary build (#18489) (#18502)
Add missing license header for md files (#18541)
[v1.7.x]License checker enhancement (#18478)

Miscellaneous changes

Link fixes4 (#16764)
Refactoring names for mxnet version of nnvm to avoid conflicting with the original tvm/nnvm. (#15303)
minor typo fix (#17008)
Add micro averaging strategy to pearsonr metric (#16878)
introduce gradient update handler to the base estimator (#16900)
fix latency calculation and print issue (#17217)
add inference benchmark script (#16978)
change the wording and log level to be more in line with the general use (#16626)
Updated logos. (#16719)
Pinning rvm version to satisfy Jekyll build (#18016)
Workaround gnu_tls handshake error on Ubuntu 14.04 Nvidia Docker (#18044)
