...

Figure 2. Flowchart for subgraph replace.

Fig. 2 shows the flowchart for our parallel op pass.

...

We implement it on top of the subgraph API. SgParallelOpSelector, which inherits from SubgraphSelector, is used to find the parallel structure, and SgParallelOpProperty, which inherits from SubgraphProperty, connects its input and output entries.

The key block in Fig. 2 is Filter, which checks whether the parallel structure that was found meets certain conditions. For example, we must make sure each OP is thread safe, or it may fail when executed simultaneously by multiple threads. MKL-DNN OPs will be thread safe after version 1.0, but for now we need to maintain a whitelist of thread-safe OPs. Other conditions include whether the number of paralleled nodes is >= a threshold, and whether paralleling the OPs would cause a performance drop based on the parameters we obtained.
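The Filter conditions above can be illustrated with a minimal sketch. It is not the MXNet implementation: the function name, the whitelist entries, and the threshold value are all assumptions made for illustration, and the sketch assumes that non-whitelisted OPs are dropped before the node count is compared with the threshold.

```python
# Illustrative sketch only; every name and value here is hypothetical.
THREAD_SAFE_OPS = frozenset({"Embedding"})  # assumed whitelist content
MIN_PARALLEL_NODES = 2                      # assumed threshold


def filter_candidates(candidate_ops,
                      whitelist=THREAD_SAFE_OPS,
                      threshold=MIN_PARALLEL_NODES):
    """Return the OPs that may run in parallel, or [] if the structure is rejected."""
    # Condition 1: keep only OPs known to be thread safe.
    kept = [op for op in candidate_ops if op in whitelist]
    # Condition 2: parallelize only if enough nodes remain.
    if len(kept) < threshold:
        return []
    return kept


if __name__ == "__main__":
    print(len(filter_candidates(["Embedding"] * 26)))  # all 26 nodes pass
    print(filter_candidates(["Embedding"]))            # too few nodes: rejected
```

The performance-based condition is left out because the text does not say which parameters it depends on.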

...

In a future release, users may be able to add OPs to or remove them from the whitelist through an environment variable.

We implement parallel_op on top of the subgraph API. The main body of the parallel op forward function is accelerated with OMP multithreading, as shown in Figure 3, so the forward function of each original OP must be thread safe. As mentioned in step 4, an OP whitelist is used to check whether an OP is thread safe; in the future, entries may be added to or removed from the whitelist by setting environment variables. During inference, several ops run in parallel. In our wide & deep model, 26 embedding forward functions are called simultaneously. This OP-level parallelism improves performance considerably.
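The shape of the parallel forward can be sketched as follows. This is a structural illustration only: a Python thread pool stands in for the OMP parallel loop, and `embedding_forward` and `parallel_forward` are hypothetical names, not MXNet functions. Each sub-op only reads its own table, which is what makes concurrent calls safe here.

```python
from concurrent.futures import ThreadPoolExecutor


def embedding_forward(table, indices):
    """Toy thread-safe forward: a read-only row lookup in one embedding table."""
    return [table[i] for i in indices]


def parallel_forward(tables, index_batches, max_workers=None):
    """Run one forward call per sub-op concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(embedding_forward, tables, index_batches))


if __name__ == "__main__":
    # 26 small tables, mirroring the 26 embedding ops mentioned in the text.
    tables = [[[t, r] for r in range(4)] for t in range(26)]
    outputs = parallel_forward(tables, [[0, 3]] * 26)
    print(len(outputs))
```

Pure-Python threads will not reproduce the speedup of an OMP loop over native kernels; the sketch only shows how independent forward calls are dispatched together and gathered in order.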

Figure 3. Main body of parallel OP forward.

...