
...

Code Block

language	py

import mxnet as mx
from mxnet import profiler

# Configurations
warmup = 25
runs = 50
run_backward = True

# Operator to benchmark
F = mx.nd.add

# Prepare data for the operator
lhs = mx.nd.ones(shape=(1024, 1024))
rhs = mx.nd.ones(shape=(1024, 1024))
lhs.attach_grad()
rhs.attach_grad()
mx.nd.waitall()

# Warmup
print("Warming up....")
for _ in range(warmup):
    with mx.autograd.record():
        res = mx.nd.add(lhs, rhs)
    res.backward()
    mx.nd.waitall()
print("Done warming up....")

# Run Performance Runs
print("Running performance runs....")
profiler.set_config(profile_all=True, aggregate_stats=True)
# Start Profiler
profiler.set_state('run')
for _ in range(runs):
    with mx.autograd.record():
        res = mx.nd.add(lhs, rhs)
    res.backward()
    mx.nd.waitall()

# Stop Profiler 
profiler.set_state('stop')

# Fetch Results from Profiler
# We will add a new API in Profiler - profiler.get_summary(reset=True)
# profiler.get_summary() => Returns a JSON string representing the output as shown below.
# If reset is True, it also resets all the counters in the current profiler.

print("Done Running performance runs....")
print(profiler.dumps(reset=True))

How to capture Time?

We will use the MXNet profiler.

Pros

No need to write one class per operator to set up a performance test. Whenever a new operator is created, the developer only needs to add a `run_performance_test(..)` line with a list of inputs to run performance tests. A generic utility handles the execution.
Less code, easier to maintain.
More control for users: default inputs, random inputs, or specific user-defined inputs.
Deterministic, and therefore better suited for performance benchmarks, reproducibility, and CI integration.
More accurate benchmark results for both time and memory, because we use the MXNet profiler.
With the Python interface:
1. Easy to maintain and develop.
2. Reflects the performance as seen by users (the majority of users use the Python interface).
3. Fastest way to get performance tests in place. We do not have any such tests in place as of today.
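
The "one line per operator" idea from the list above can be sketched roughly as follows. This is a minimal illustration, not the proposed implementation: the signature of `run_performance_test`, the result fields, and the plain-Python `add` stand-in are assumptions, and `time.perf_counter` is used in place of the MXNet profiler only so that the sketch runs without MXNet installed.

```python
import time
from statistics import mean


def run_performance_test(op, inputs, warmup=25, runs=50):
    """Time `op` once per keyword-argument set in `inputs`.

    Hypothetical sketch: wall-clock timing stands in for the MXNet profiler.
    """
    results = []
    for kwargs in inputs:
        # Warm-up iterations are executed but not timed.
        for _ in range(warmup):
            op(**kwargs)
        # Timed iterations.
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            op(**kwargs)
            timings.append(time.perf_counter() - start)
        results.append({"inputs": kwargs, "runs": runs, "avg_time_s": mean(timings)})
    return results


def add(lhs, rhs):
    # Plain-Python stand-in for an operator such as mx.nd.add.
    return [a + b for a, b in zip(lhs, rhs)]


# Benchmarking a new operator then takes a single call with a list of inputs.
report = run_performance_test(
    add,
    inputs=[{"lhs": [1.0] * 1024, "rhs": [1.0] * 1024}],
    warmup=2,
    runs=5,
)
```

With MXNet available, the timed loop would instead be wrapped in `profiler.set_state('run')` and `profiler.set_state('stop')`, as in the script earlier on this page.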

...

Different operators have different input names. For example, as seen above, the add operator requires tensors named lhs and rhs, whereas the Conv2D operator requires a tensor named data. The base performance executor utility will need to understand this and create tensors appropriately, i.e., if there is one single executor, generalizing it across operators may make the logic complex to manage.
Not easily extensible:
1. Hard to integrate with property-based testing libraries such as Hypothesis to randomly generate test cases with different tensor shapes.
It is ideal to capture performance close to the kernel. Calls from the Python operator APIs may hide performance regressions when the operator computation is small.
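
One possible way to keep the per-operator input names manageable is a small table that maps each operator to its named inputs. The sketch below is only an illustration of that idea: `DEFAULT_INPUTS`, `prepare_inputs`, the stand-in tensor factory, and the Conv2D shape are all hypothetical and not part of the proposal.

```python
# Hypothetical per-operator input specifications: argument name -> tensor shape.
DEFAULT_INPUTS = {
    "add": {"lhs": (1024, 1024), "rhs": (1024, 1024)},
    "Conv2D": {"data": (32, 3, 256, 256)},  # shape chosen only for illustration
}


def prepare_inputs(op_name, make_tensor, overrides=None):
    """Build the named inputs for `op_name`, optionally overriding shapes."""
    spec = dict(DEFAULT_INPUTS[op_name])
    spec.update(overrides or {})
    return {name: make_tensor(shape) for name, shape in spec.items()}


def describe(shape):
    # Stand-in tensor factory so the sketch runs without MXNet.
    # With MXNet this could be: lambda shape: mx.nd.ones(shape=shape)
    return {"shape": shape}


add_inputs = prepare_inputs("add", describe)
conv_inputs = prepare_inputs("Conv2D", describe, overrides={"data": (1, 3, 64, 64)})
```

The executor can then call the operator with `**inputs` without knowing the argument names in advance, at the cost of maintaining the table as operators are added.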

Addition of new Module

We propose to add this utility as a new module (opperf) under incubator-mxnet/benchmark, i.e., "incubator-mxnet/benchmark/opperf". Note that this does not introduce any user-facing APIs; it is a utility under the incubator-mxnet/benchmark folder for general use by the community.

Addition of new API

We propose to add a new API to the MXNet Profiler that makes it easy to fetch the operator profile for programmatic processing.

1) mxnet.profiler.get_summary(reset=False)

Current Behavior:

Users can either use `mxnet.profiler.dump()` to write the profiler output to a JSON file, or use the `mxnet.profiler.dumps(reset=False)` API to print the summary on the console.

Suggested Addition:

To enable easy programmatic use of the MXNet profiler output, we propose to introduce a new API that returns the summary as a JSON string. This lets users run the profiler, get the summary output, and perform analysis programmatically.

Code Block

language	py

# Proposed addition to mxnet.profiler
def get_summary(reset=False):
    """Gets the current profiler summary as a JSON string.

    If reset is True, resets all the aggregate statistics collected up to
    this point, i.e., it clears all the profiler counters.

    Parameters
    ----------
    reset : boolean
        If True, resets all profiler statistics collected up to this point.
    """

The output of this API can be viewed as a JSON representation of the output of the `mxnet.profiler.dumps(reset=False)` API, as shown below.

[Image: JSON representation of the profiler summary]
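
To illustrate the kind of programmatic analysis a JSON summary would allow, here is a minimal sketch. The JSON layout, the operator names, and the numbers are hypothetical and made up for illustration; the actual structure returned by the proposed API may differ.

```python
import json

# Hypothetical summary string; the real layout of the proposed API's output may differ.
summary_json = """
{
  "Time": {
    "operator": {
      "add": {"Total Count": 50, "Total Time": 12.5, "Avg Time": 0.25},
      "_backward_add": {"Total Count": 50, "Total Time": 20.0, "Avg Time": 0.4}
    }
  }
}
"""


def operator_avg_times(summary):
    """Return {operator name: average time} from a JSON summary string."""
    data = json.loads(summary)
    operators = data.get("Time", {}).get("operator", {})
    return {name: stats["Avg Time"] for name, stats in operators.items()}


avg_times = operator_avg_times(summary_json)
# Pick the operator with the largest average time.
slowest = max(avg_times, key=avg_times.get)
```

Because the summary is plain JSON, the same approach works for comparing runs in CI or exporting results to other tools.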

API / User Experience

...
