This wiki page is intended to help you select the epochs and number_of_iterations parameters when training multiple deep learning models in parallel with Apache MADlib and Greenplum Database. Because Greenplum is a distributed database, the concept of a pass over the data differs from that in single-node systems.

...

number of passes over the data  = epochs * number_of_iterations

The Keras fit parameter epochs is the number of passes over the data in each Greenplum segment within a single iteration, so Keras epochs are effectively sub-epochs in MADlib/Greenplum. The number_of_iterations parameter of the MADlib fit function controls the outer loop.
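
As a rough illustration of the relationship above, the following Python sketch computes the total number of passes for a given setting and lists the (epochs, number_of_iterations) pairs that reach a target number of passes. The helper names total_passes and candidate_settings are hypothetical and are not part of MADlib or Keras; the sketch only restates the arithmetic and does not call the database.

```python
def total_passes(epochs: int, number_of_iterations: int) -> int:
    """Total passes over the data = epochs * number_of_iterations.

    epochs: passes per Greenplum segment within one iteration (inner loop).
    number_of_iterations: outer loop of the MADlib fit function.
    """
    if epochs < 1 or number_of_iterations < 1:
        raise ValueError("both parameters must be positive integers")
    return epochs * number_of_iterations


def candidate_settings(target_passes: int) -> list[tuple[int, int]]:
    """Return every (epochs, number_of_iterations) pair whose product
    equals target_passes (hypothetical helper, for illustration only)."""
    return [
        (epochs, target_passes // epochs)
        for epochs in range(1, target_passes + 1)
        if target_passes % epochs == 0
    ]


if __name__ == "__main__":
    # 3 sub-epochs per iteration, repeated for 10 iterations
    print(total_passes(epochs=3, number_of_iterations=10))
    # All ways to split 12 total passes between the two parameters
    print(candidate_settings(12))
```

Every pair returned by candidate_settings yields the same total number of passes; the pairs differ only in how those passes are divided between the inner loop (epochs) and the outer loop (number_of_iterations).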

...