...

The primary goal is to improve the performance of inefficient OPs by parallelizing them at the OP level. The targets are OPs that are both inefficient and independent of each other. In this proposal, only the case where OPs come to/from one OP is covered, and such OPs are parallelized only if they can benefit from high-level parallelism. Other hierarchical patterns will be considered in the future.
Another goal is that the modification should be transparent to users. The change in this proposal will guarantee that existing scripts and models do not need to change. Activating one environment variable will make it work on CPU, GPU, etc.
All we need to do is add a pass for the current backend.

This approach can work for all backends by sharing the same subgraph path, but in practice some adjustments to the interfaces and implementations are still needed. Thus, as a first step, the CPU and MKLDNN backends are enabled.
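
The pattern described above (independent OPs that come from one OP) can be illustrated with a small sketch. This is not the proposal's implementation and does not use any MXNet API: the graph representation, the function names `depends_on` and `find_parallel_groups`, and the example node names are all hypothetical, chosen only to show how a pass might pick out candidates for OP-level parallelism.

```python
from collections import defaultdict

# Hypothetical toy graph: node name -> (op type, list of input node names).
example_graph = {
    "data": ("null", []),
    "w0": ("null", []),
    "w1": ("null", []),
    "w2": ("null", []),
    "emb0": ("Embedding", ["data", "w0"]),
    "emb1": ("Embedding", ["data", "w1"]),
    "emb2": ("Embedding", ["data", "w2"]),
    "concat": ("Concat", ["emb0", "emb1", "emb2"]),
}


def depends_on(graph, node, target):
    """Return True if `target` is reachable by walking the inputs of `node`."""
    stack = list(graph[node][1])
    seen = set()
    while stack:
        cur = stack.pop()
        if cur == target:
            return True
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(graph[cur][1])
    return False


def find_parallel_groups(graph, min_group_size=2):
    """Find groups of same-type OPs that consume one common producer node.

    A group is kept only if its members are mutually independent, that is,
    no member is an ancestor of another member.
    """
    candidates = defaultdict(list)
    for name, (op, inputs) in graph.items():
        for src in set(inputs):
            candidates[(src, op)].append(name)

    groups = {}
    for key, names in candidates.items():
        if len(names) < min_group_size:
            continue
        independent = all(
            not depends_on(graph, a, b)
            for a in names
            for b in names
            if a != b
        )
        if independent:
            groups[key] = sorted(names)
    return groups


if __name__ == "__main__":
    # The three Embedding OPs all read from "data" and do not depend on
    # each other, so they form one candidate group.
    print(find_parallel_groups(example_graph))
```

In this sketch a real pass would then replace each group with a single subgraph node that runs its members in parallel. Whether that is worthwhile depends on the OPs actually benefiting from high-level parallelism, which the sketch does not try to measure.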

Proposed Approach

Figure 1. Example for parallel embedding


...