- Reports both CPU and GPU time.
- Produces plots of the execution time, speedup, or custom metrics.
- Saves the results in csv files.
- Allows automatic performance comparison with numpy or numpy-API-compatible libraries.
- TODO: CPU routines profiling.
- TODO: Profile kernels using nvprof.
- TODO: Performance regression detection.
Similarly to pytest, benchmarks are stored in `bench_*.py` files.
Inside a bench file, class definitions with `Benchmark` in their name are
treated as benchmarks.
There are 3 special methods in a benchmark:

- `setup`: generates the actual inputs to the routines and performs cupy memory allocations.
- `teardown`: cleans up the state and frees the allocated memory.
- `args_key`: returns a string naming the current benchmark case (parameter combination).
`setup` and `teardown` are called before and after every method in the class whose name starts with
`time_`. These `time_` methods are in charge of doing the actual benchmarking, and they are
executed multiple times to get statistically significant results. Therefore they should avoid
modifying object state.
Benchmarks are parametrized using the `params` class attribute. This attribute is a dictionary
whose entries each map a parameter name to a list of its possible values. The cross-product of all the
entries is computed before running the benchmark, and every benchmark is run for all the possible
combinations of the parameters. The parameters are set before the `setup` function is called, and the
current value is accessed with `self.parameter_name`.
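Putting this together, a benchmark file could look like the sketch below. It inherits from `CupyBenchmarkBasic`, one of the base classes described next; the routine, class name, and parameter values are made up for the example, and the exact base-class API is assumed from the description above.

```python
# bench_sum.py -- illustrative sketch only; names and base-class details are
# assumptions based on the description in this README.
import cupy

from cupy_prof.benchmark import CupyBenchmarkBasic


class SumBenchmark(CupyBenchmarkBasic):
    # Cross-product of these entries: every time_* method is run once per
    # (size, dtype) combination.
    params = {'size': [1000, 1000000],
              'dtype': ['float32', 'float64']}

    def setup(self):
        # Parameter values are already set as attributes when setup runs.
        self.a = cupy.arange(self.size, dtype=self.dtype)

    def teardown(self):
        # Free the memory allocated in setup.
        del self.a

    def args_key(self):
        # Label for this parameter combination in the csv files and plots.
        return '{}-{}'.format(self.size, self.dtype)

    def time_sum(self):
        # Executed many times to get statistically significant timings,
        # so it must not modify object state.
        cupy.sum(self.a)
```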
Benchmarks should inherit from one of these base classes:

- `cupy_prof.benchmark.CupyBenchmarkBasic`: benchmark class for testing cupy only; it plots a logarithmic time graph of both the CPU and GPU time.
- `NumpyCompareBenchmark`: performs a comparison between numpy and cupy for the specified routine. The numpy or cupy namespace is accessed in the benchmark with `self.xp`, as in chainer tests. It additionally calculates the speedup and plots it too.
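A comparison benchmark could look like the sketch below; the import path of `NumpyCompareBenchmark`, the routine, and the parameter values are assumptions made for the sake of the example.

```python
# Illustrative sketch; the import path of NumpyCompareBenchmark is assumed.
from cupy_prof.benchmark import NumpyCompareBenchmark


class DotCompareBenchmark(NumpyCompareBenchmark):
    params = {'n': [256, 1024]}

    def setup(self):
        # self.xp is either numpy or cupy, depending on the backend
        # currently being measured, as in chainer tests.
        self.a = self.xp.ones((self.n, self.n), dtype='float32')
        self.b = self.xp.ones((self.n, self.n), dtype='float32')

    def teardown(self):
        del self.a, self.b

    def args_key(self):
        return str(self.n)

    def time_dot(self):
        self.xp.dot(self.a, self.b)
```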
Benchmarks are plotted according to the `_plots` class attribute of the benchmark class.
```python
_plots = [{'facet': {'col': 'name', 'hue': 'backend'},
           'plot': 'line',
           'x': 'key',
           'y': 'time',
           'yscale': 'log'},
          {'facet': {'col': 'name', 'hue': 'xp'},
           'plot': 'bar',
           'x': 'key',
           'y': 'speedup'}]
```

`_plots` is a list of all the graphs that will be generated for the class.
The `facet` entry is in charge of splitting the results of the different `time_` methods into
separate sub-graphs. The first example says that, for this facet, the column is the
benchmark name (the method name in the class definition) and the hue, i.e. the legend, is the
backend, which is either numpy, cupy-gpu, or cupy-cpu. For each graph, the x axis
is the key value obtained by calling the benchmark's `args_key` method, and the y axis is the execution time.
These values can be altered in `setup` to generate different kinds of graphs.
Benchmarks are run by passing a file or a directory to the profiler:

```
$ python prof.py benchmarks
```

If a directory is specified, as above, all the files whose names start with `bench_` are collected, as in pytest.
It is possible to compare different commits or branches of a repository:

```
$ python prof.py --repo /home/ecastill/em-cupy --commits master v7 --plot -- benchmarks/bench_ufunc_cupy.py
```

The script will automatically check out each commit and compile cupy, but it will install it in a virtual environment;
virtualenv is required for this functionality.
`--repo` can specify a common repository for all the commits, or a list of repositories, one per commit, so that build time
can be reduced.
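For instance, a per-commit repository invocation might look like the hypothetical command below; the paths are placeholders and whether `--repo` accepts several paths in exactly this form is an assumption, so check the script's help output for the exact syntax.

```
$ python prof.py --repo /path/to/cupy-master /path/to/cupy-v7 --commits master v7 --plot -- benchmarks/bench_ufunc_cupy.py
```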