...

The initial set of experiments was conducted with a linear regression model on the YearPredictionMSD dataset, which contains more than 40,000 samples. With SVRG optimization, the model converged faster than with SGD. A more detailed analysis of the experiment results can be found in the Benchmark section.

Key Characteristics of SVRG:

Explicit variance reduction
Ability to use a relatively large learning rate compared to SGD, which leads to faster convergence
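
The two characteristics above can be illustrated with a minimal sketch of the SVRG update on a one-parameter least-squares problem. It is written in plain Python and is not the MXNet implementation; the function names, the toy data, and the parameters lr, epochs, and update_freq are assumptions made for this example only. Every update_freq epochs the sketch takes a snapshot of the weight and computes the full-batch gradient at that snapshot; each stochastic step then corrects the per-sample gradient with the snapshot gradient, which is the explicit variance reduction.

```python
import random


def grad_i(w, x, y):
    """Gradient of the per-sample loss 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x


def full_grad(w, xs, ys):
    """Average gradient over the whole dataset."""
    return sum(grad_i(w, x, y) for x, y in zip(xs, ys)) / len(xs)


def svrg(xs, ys, lr=0.1, epochs=30, update_freq=2, seed=0):
    """Toy SVRG for a single weight (illustrative names, not an MXNet API)."""
    rng = random.Random(seed)
    n = len(xs)
    w = 0.0
    w_snap, mu = w, full_grad(w, xs, ys)
    for epoch in range(epochs):
        # Refresh the snapshot and its full gradient every update_freq epochs.
        if epoch % update_freq == 0:
            w_snap = w
            mu = full_grad(w_snap, xs, ys)
        for _ in range(n):
            i = rng.randrange(n)
            # Variance-reduced gradient: per-sample gradient at w, minus the
            # per-sample gradient at the snapshot, plus the full gradient there.
            g = grad_i(w, xs[i], ys[i]) - grad_i(w_snap, xs[i], ys[i]) + mu
            w -= lr * g
    return w


# Small synthetic dataset: y is roughly 3 * x plus noise (made-up data).
xs = [0.1 * k for k in range(1, 11)]
noise = random.Random(1)
ys = [3.0 * x + noise.gauss(0.0, 0.1) for x in xs]

w_svrg = svrg(xs, ys)
# Closed-form least-squares solution for comparison.
w_closed = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
print(w_svrg, w_closed)
```

Because the correction term shrinks as the weight approaches the optimum, the sketch keeps a constant learning rate and still settles close to the closed-form solution, whereas plain SGD on noisy data typically needs a decaying learning rate to do the same.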

Expected Deliverables

The goal is to provide an MXNet Python Module that implements the SVRG optimization technique.

Tenets

Minimize the surface footprint by implementing a complete SVRGModule.
From a user's perspective, using SVRGModule should be similar to using the MXNet Python Module API, except that the underlying optimization technique is SVRG. Differences between the external APIs of SVRGModule and the Module API should be kept to a minimum.
SVRGModule should seamlessly support both dense and sparse data, and run on CPU and GPU instances, on a single machine and in a distributed setting.

...
