When comparing two results, it would be helpful to output the probability that one result is faster than the other. If `times1` and `times2` are the sets of measured times, the probability that the first benchmark is faster than the second can be estimated as:

```python
sum(x < y for x in times1 for y in times2) / len(times1) / len(times2)
```

As an optimization, you can sort one of the sets and use binary search instead of comparing every pair.
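
The sort-and-binary-search optimization mentioned above could look roughly like this. It is only a sketch: the function name `prob_faster` is made up for illustration, and it assumes both inputs are non-empty sequences of numbers. Ties are counted as "not faster", which matches the strict `x < y` comparison in the formula.

```python
from bisect import bisect_right


def prob_faster(times1, times2):
    """Estimate P(x < y) for x drawn from times1 and y drawn from times2.

    Hypothetical helper, not part of any existing tool.
    """
    if not times1 or not times2:
        raise ValueError("both samples must be non-empty")

    sorted2 = sorted(times2)
    n2 = len(sorted2)

    # bisect_right returns how many elements of sorted2 are <= x,
    # so n2 minus that count is the number of y strictly greater than x.
    wins = sum(n2 - bisect_right(sorted2, x) for x in times1)

    return wins / (len(times1) * n2)
```

For example, with `times1 = [1, 2, 3]` and `times2 = [2, 3, 4]`, 6 of the 9 pairs satisfy `x < y`, so the estimate is 6/9, about 0.67. Sorting costs O(m log m) and each lookup O(log m), so the total is O((n + m) log m) rather than the O(n·m) of the pairwise sum.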