
Apache MXNet 1.5.0 is a minor release that includes backward-compatible changes as well as fixes for critical bugs and performance regressions introduced with the 1.4.0 release.
As MXNet follows semver, this release includes backwards-compatible fixes and new features only. 
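As a rough illustration of the semver rule mentioned above, the sketch below classifies the step between two version strings as a major, minor, or patch bump. The helper names `parse_version` and `classify_bump` are hypothetical and are not part of MXNet or its release tooling. The sketch assumes plain `MAJOR.MINOR.PATCH` strings without pre-release or build suffixes.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a plain 'MAJOR.MINOR.PATCH' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def classify_bump(old: str, new: str) -> str:
    """Return 'major', 'minor', or 'patch' for the step from old to new."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        raise ValueError(f"{new} is not newer than {old}")
    if new_v[0] != old_v[0]:
        return "major"  # may contain backward-incompatible changes
    if new_v[1] != old_v[1]:
        return "minor"  # backward-compatible features and fixes
    return "patch"      # backward-compatible fixes only


print(classify_bump("1.4.0", "1.4.1"))  # patch
print(classify_bump("1.4.1", "1.5.0"))  # minor
```

Under this reading, moving from 1.4.1 to 1.5.0 is a minor bump, so only backward-compatible features and fixes are expected in it.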

Discussion on the roadmap: https://github.com/apache/incubator-mxnet/issues/14619

Release Manager: Junru Shao, Lai Wei
Release shepherd: Sheng Zha

Scope

Scope is still open - please add!
Reference: MXNet 1.5.0 Roadmap

Issue | PRs | Contributor(s) | Notes
Scala/Java API memory leaks | https://github.com/apache/incubator-mxnet/pull/14586 | Andrew Ayres, Qing Lan |
AMP | https://github.com/apache/incubator-mxnet/pull/14173 | @ptrendx | To move forward: https://github.com/apache/incubator-mxnet/pull/14173#pullrequestreview-235846341
MKLDNN Quantization | https://github.com/apache/incubator-mxnet/pull/14819 | @ZhennanQin | Merged
Tutorial update that fixes nightly CI build | https://github.com/apache/incubator-mxnet/pull/14556 | Sergey Sokolov |
FP32 optimization | #14914, #14818, #14893, #14877, #14783 | @ciyongch, @TaoLv, @juliusshufa, @yinghu5 | Last PR left under review: #14893
Bug fix for infer shape partial | #14869 | @roywei | Under review
Dependency upgrade for mxnet | #14950, #14887, #14588, #14988 | @stu1130 | Update to latest CUDA 10.1 and cuDNN 7.5.1 for mxnet and mxnet CI
Conversion from FP32 to Mixed Precision Models | https://github.com/apache/incubator-mxnet/issues/14584 | @anirudh2290 | Depends on AMP PR. Moved to 1.6 scope.
MKLDNN RNN Inference Integration (fp32 LSTM and vRNN with tanh and relu) | #14713 | @lihaofd, Tao Lv, @pengzhao-intel | Improves performance of certain operators used in RNN models
https://github.com/dmlc/gluon-nlp/issues/706 | NA | haibin | Resolved on the gluon-nlp side by https://github.com/dmlc/gluon-nlp/pull/710
https://github.com/apache/incubator-mxnet/issues/15028 | https://github.com/apache/incubator-mxnet/pull/15039 | | AMP tutorial test failed, blocking the nightly test
Tidy up storage allocation and deallocation | https://github.com/apache/incubator-mxnet/pull/14480 | Yuxi Hu | Fixes memory leaks: #13951, #14358
https://github.com/apache/incubator-mxnet/issues/15029 | https://github.com/apache/incubator-mxnet/pull/15041 | @zheng-da, @apeforest | Merged
0 size tensor patch for quantization | https://github.com/apache/incubator-mxnet/pull/15031 | @ciyongch | Merged
Add MXEnginePushAsync and MXEnginePushSync C APIs | https://github.com/apache/incubator-mxnet/pull/14615 | Yuxi Hu | Fixes GCC incompatibility issue for running MXNet with Horovod
https://github.com/apache/incubator-mxnet/issues/15034 | https://github.com/apache/incubator-mxnet/pull/15056 | @DickJC123, @lihaofd | Brought by previous change on RNN (Change RNN OP to stateful (#14476))
Add pin_device_id option to Gluon DataLoader | https://github.com/apache/incubator-mxnet/pull/14136 | Yuxi Hu | Fixes out of memory error when using Gluon DataLoader with CPUPinned memory during distributed training on multiple GPUs with MXNet + Horovod
https://github.com/apache/incubator-mxnet/issues/14954 | https://github.com/apache/incubator-mxnet/pull/14998 | @reminisce |
Issues during 1.4.1 release vote: https://github.com/apache/incubator-mxnet/issues/14936, https://lists.apache.org/thread.html/0cb2131f2506661a884f89d8419aba08298cbc50aaeeda06e41e530f@%3Cdev.mxnet.apache.org%3E | https://github.com/apache/incubator-mxnet/pull/15127 | |
Set idx2name for Optimizer object | https://github.com/apache/incubator-mxnet/pull/14703 | Yuxi Hu | Fixes issue that may affect model convergence when using Module API
Transpose operator performance regression | https://github.com/apache/incubator-mxnet/pull/15128 | @szha, @roywei |
CI build failures: https://github.com/apache/incubator-mxnet/issues/15084, https://github.com/apache/incubator-mxnet/issues/15028, https://github.com/apache/incubator-mxnet/issues/14981, https://github.com/apache/incubator-mxnet/issues/15152 | https://github.com/apache/incubator-mxnet/pull/15099, https://github.com/apache/incubator-mxnet/pull/15141, https://github.com/apache/incubator-mxnet/pull/15156 | | The large tensor nightly test is disabled due to CI OOM, but tests are passing locally. Not a blocker for the release anymore. Other test failures have been fixed.
 | https://github.com/apache/incubator-mxnet/pull/14570 | Lin Yuan | Fixes perf degradation: #14496
Import failure for GPU-enabled MXNet packages on machines without GPU | https://github.com/apache/incubator-mxnet/pull/13764 | Przemyslaw Tredak |
Autograd Function bug | https://github.com/apache/incubator-mxnet/issues/15183 | Da Zheng |

Release Timeline

The following timeline assumes that everything goes well (some buffer time has been added).

Step | Tasks | Goal | Actual | Comments
Code Freeze and release start | Track ongoing PRs | Fri 04/26/2019 | | Once 1.4.1 is released
Cut the release branch | Check license headers | 05/17/2019 (previously Mon 04/29/2019) | 06/03/2019 | Depending on stability
 | Make code changes with necessary version updates | | | master version was already on 1.4.0
 | Cut the release branch | | |
 | Update the version on master | | |
Test the release and tag the release | Nightly test, Jenkins CI | 5/27/2019 | 06/04/2019 | http://jenkins.mxnet-ci.amazon-ml.com/job/incubator-mxnet/job/v1.4.x/
 | RAT check | | |
 | Tag RC0 | | |
Package artifacts and validate | Create release artifacts | | |
 | Validate release package | | |
 | Test release package | | |
 | Scala release process | | |
Begin apache voting | Start vote on dev@ | 06/07/2019 (previously Mon 04/29/2019 - Thu 05/02/2019) | 06/08/2019 | rc0 and rc1 failed, rc2 passed on 07/10/2019: https://lists.apache.org/thread.html/641cf0fddce623ff352ba9c7655938c0d16337bae4a8d290956ea130@%3Cdev.mxnet.apache.org%3E
 | Start vote on general@ | 07/10/2019 (previously Fri 05/03/2019 - Tue 05/07/2019) | | Vote on general passed on 07/18/2019: https://lists.apache.org/thread.html/5365cdab7dee08d220e32decc76fd54aa05e29bc891c416828cb64d2@%3Cgeneral.incubator.apache.org%3E
Finalizing and posting the release | Create the final release tag on GitHub | 6/14/2019 | |
 | Rename, resign and upload the src tar to final dir | | |
 | Update the website using tag | | |
 | Release the official pip package | | |
 | Release the official docker images | | |
 | After 24 hrs, validate the packages are uploaded | | |
 | Draft the official announce email and review | | |
 | Send out the email on announce@ | | 07/29/2019 | https://lists.apache.org/thread.html/647e7f18217514ea06344f0f713798c9aac0adcdd16addbb044aa6cd@%3Cdev.mxnet.apache.org%3E
 | Update the Apache blog | | |
 | Update the AWS blog | | |
 | Send internal announcement | | |

CI status:

Release Note

This patch release includes the following bug fixes:

  • [v1.5.0] Java bug-fix cherry pick (#14834)
  • Use DEFAULT macro in C APIs (#14767) (#14789)
  • Set idx2name for Optimizer object (#14703) (#14772)
  • Add pin_device_id option to Gluon DataLoader (#14136) (#14771)
  • Tidy up storage allocation and deallocation (#14480) (#14768)
  • Add MXEnginePushAsync and MXEnginePushSync C APIs (#14615) (#14770)
  • Less cudaGet/SetDevice calls in Gluon execution (#13764)
  • Fix nightly build of 1.4.x (#14556)
  • Memory fixes. Resolves #10867, and resolves #14080 (#14372) (#14586)
  • Fixes for data links (#14526)
  • v1.4.x: Backport of Windows CI Fixes (#14420)

Passing