...

The initial experiments were conducted with a linear regression model on the YearPredictionMSD dataset, which contains more than 40,000 samples. In these experiments, SVRG converged faster than SGD. A more detailed analysis of the experiment results can be found in the Benchmark section.

Key Characteristics of SVRG:

  • Explicit variance reduction of the stochastic gradient estimate
  • Ability to use a relatively large learning rate compared to SGD, which leads to faster convergence
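
To make the two characteristics above concrete, here is a minimal sketch of the standard SVRG update on a small synthetic least-squares problem. It is written in plain Python and is not the proposed MXNet implementation. The function names (`grad_i`, `full_grad`, `svrg`, `make_data`), the synthetic data, and the hyperparameter values are all assumptions chosen for illustration.

```python
import random


def grad_i(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w for one sample."""
    err = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [err * xj for xj in x]


def full_grad(w, X, Y):
    """Average of the per-sample gradients over the whole dataset."""
    total = [0.0] * len(w)
    for x, y in zip(X, Y):
        total = [t + g for t, g in zip(total, grad_i(w, x, y))]
    return [t / len(X) for t in total]


def svrg(X, Y, lr=0.1, epochs=20, inner_steps=None, seed=0):
    """Illustrative SVRG loop; hyperparameter defaults are assumptions."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = inner_steps or 2 * n  # number of stochastic steps per snapshot
    w = [0.0] * d
    for _ in range(epochs):
        # Take a snapshot of the weights and compute the full gradient once.
        w_snap = list(w)
        mu = full_grad(w_snap, X, Y)
        for _ in range(m):
            i = rng.randrange(n)
            g = grad_i(w, X[i], Y[i])
            g_snap = grad_i(w_snap, X[i], Y[i])
            # Variance-reduced gradient: g_i(w) - g_i(w_snap) + mu.
            w = [wj - lr * (gj - gs + muj)
                 for wj, gj, gs, muj in zip(w, g, g_snap, mu)]
    return w


def make_data(n=200, seed=1):
    """Noise-free synthetic regression data (hypothetical, for the demo only)."""
    rng = random.Random(seed)
    true_w = [2.0, -3.0]
    X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    Y = [sum(wj * xj for wj, xj in zip(true_w, x)) for x in X]
    return X, Y, true_w


if __name__ == "__main__":
    X, Y, true_w = make_data()
    print(svrg(X, Y), "target:", true_w)
```

The correction term `g_i(w) - g_i(w_snap) + mu` has the same expectation as the plain stochastic gradient, but its variance shrinks as `w` and `w_snap` approach the optimum. That is what allows a constant, relatively large learning rate instead of a decaying one.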

Expected Deliverables

The goal is to implement an MXNet Python Module that provides the SVRG optimization technique.

Tenets

  • Minimize the surface footprint by implementing a complete SVRGModule
  • From a user's perspective, using SVRGModule should be similar to using the MXNet Python Module API, except that the underlying optimization technique is SVRG. Differences between the external APIs of SVRGModule and the Module API should be kept to a minimum.
  • SVRGModule should seamlessly support both dense and sparse data, and run on CPU and GPU instances, both on a single machine and in a distributed setting.

...