...

  • Have an internalized graph library that can be maintained as part of MXNet. This makes contributions easier and lets the graph evolve without potential design conflicts with a library that is now part of another project (TVM) with different goals and use cases than MXNet. In this way we can evolve the library and fit it to MXNet's use cases.
  • Use a class passed as a template argument to the graph for node and edge attributes, so that graph attributes are stored in a maintainable, type-safe, compile-time-checked way.
  • The nodes of the graph also have a templated node data and attributes class, which decouples the implementation of generic graph algorithms (traversal, etc.) from their usage. This will allow more data fields, structured data, and classes to be added to the graph nodes and edges in a proper way.
  • The graph is type-safe, extensible, reusable, and debugging-friendly.
  • Developers can easily extend the graph, implementing logic to dump and process it in a generic way without having to deal with too many implementation details.
  • Generic facilities can be added, such as serialization, optimization passes, and operator-fusion passes, which will add tremendous value to MXNet as a flexible framework in the next steps of deep learning development.
  • AGInfo is simplified, and such a vital part can live in the node attributes class (NodeData in the example below).
  • This change would help support higher-order gradient calculations, which need passes over the graph that recursively generate additional backward nodes.
  • Better framework for advanced graph processing (see other proposals affecting graph processing in this wiki node).
  • Eliminate the need for an additional indexed graph data structure, saving memory, CPU cost, and complexity. Eliminate the need for equivalent data structures such as Node, NodePtr, and NodeEntry, coalescing them into a single one representing either a variable or an op.


Examples of design changes:

...

class MXNetNodeData {
  std::vector<NDArray*> inputs;
  std::vector<NDArray*> outputs;
  Autograd autograd;
  Operator* op;
  std::string name;
  [...]
  bool is_operator();
  bool is_variable();
  bool has_autograd();
};

class EdgeData {};

Graph<MXNetNodeData, EdgeData, size_t, uint16_t> g;

g.DFSVisit([](const MXNetNodeData& x) { ... });



For example, a piece of code that would be greatly improved by this is Imperative::Backward. Today this code breaks the encapsulation of the classes it uses, holding indices into internal graph data structures and working with node ids, raw pointers to arrays, and variables directly. It is extremely complex to follow and reason about. It could instead be broken into separate passes that augment or mutate the graph when doing shape inference, adding backward nodes, and so on.


While for correctness one would ideally prefer immutable data structures, which make the code much easier to reason about and verify, this is not always possible given performance considerations. Still, it is a good principle to keep in mind when laying out the design wherever possible.


In the same fashion, common functions such as DFSVisit would be part of the graph interface and would take the NodeData class as the argument to the visiting function. This is similar to how it works now, so no major change is required.

...