Description
tl;dr When we measure small things, we have to remove every overhead that affects the result. Wrapping the measured code in a method call and a block is a significant overhead for most of the benchmarks in this repository.
Suppose you want to compare the performance of "2 + 2" vs "2 * 2". If you do it with the approach used in this repository, you will get these results:
require "benchmark/ips"

def slow
  2 * 2
end

def fast
  2 + 2
end

Benchmark.ips do |x|
  x.report("2 * 2") { slow }
  x.report("2 + 2") { fast }
  x.compare!
end
(simplified output)
Comparison:
2 + 2: 8304760.0 i/s
2 * 2: 7535516.6 i/s - 1.10x slower
But there is one problem. Calling a method plus the surrounding block ({ slow }) has a bigger overhead than calling Fixnum#+ or Fixnum#* itself. The easiest way to observe this is to repeat the benchmarked operation several times in one call, like this:
require "benchmark/ips"

def slow
  2*2; 2*2; 2*2; 2*2; 2*2; 2*2; 2*2; 2*2; 2*2; 2*2;
end

def fast
  2+2; 2+2; 2+2; 2+2; 2+2; 2+2; 2+2; 2+2; 2+2; 2+2;
end

Benchmark.ips do |x|
  x.report("2 * 2") { slow }
  x.report("2 + 2") { fast }
  x.compare!
end
Comparison:
2 + 2: 4680545.3 i/s
2 * 2: 3468681.3 i/s - 1.35x slower
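One way to get a feel for the wrapper cost is to time the same method-plus-block wrapper around an empty method body. This is only a rough sketch using the standard library's Benchmark.realtime. The names noop, add, and seconds_for are made up for illustration, and the timings will vary by machine and Ruby version.

```ruby
require "benchmark"

# Hypothetical helpers for illustration only.
def noop; end # empty body: only the call overhead remains

def add
  2 + 2
end

# Time `iterations` block calls, mirroring the { slow } / { fast } wrappers.
# The extra `yield` adds one more block layer, equally for both measurements.
def seconds_for(iterations)
  Benchmark.realtime { iterations.times { yield } }
end

iterations = 1_000_000
baseline = seconds_for(iterations) { noop }
with_add = seconds_for(iterations) { add }

puts format("empty method: %.4f s", baseline)
puts format("2 + 2 method: %.4f s", with_add)
```

If the two timings come out close to each other, most of what a wrapped benchmark measures is the wrapper rather than the operation.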
See how the results changed? The gap grew from 1.10x to 1.35x just by repeating the operation ten times per call. Writing all our benchmarks this way would be quite impractical. Fortunately, the benchmark-ips gem has an answer for that. The Benchmark::IPS::Job#report method accepts a string that is compiled before the benchmark runs. Passing the right string lets us measure the operation properly:
require "benchmark/ips"

Benchmark.ips do |x|
  x.report("2 * 2", "2 * 2;" * 1_000)
  x.report("2 + 2", "2 + 2;" * 1_000)
  x.compare!
end
Comparison:
2 + 2: 91567.5 i/s
2 * 2: 57994.4 i/s - 1.58x slower
This is explained well here: https://docs.omniref.com/ruby/2.2.1/symbols/Benchmark/bm#annotation=4095926&line=182
How does this affect the fast-ruby benchmarks? All benchmarks that call small things are flawed. One of them is the Array#length vs Array#size vs Array#count benchmark. Here is the original code and the result obtained on my computer:
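To illustrate why the string form helps, here is a minimal sketch of the general idea: the source string is evaluated once into a method whose body is a plain while loop, so no block or extra method is dispatched per operation. This is an approximation for illustration, not the gem's actual source. compile_loop and call_times are hypothetical names.

```ruby
# Hypothetical helper, not part of benchmark-ips: build an object whose
# call_times(total) method runs `source` inline inside a while loop.
def compile_loop(source)
  runner = Object.new
  # `def` inside instance_eval defines a singleton method on runner.
  runner.instance_eval <<~RUBY
    def call_times(total)
      i = 0
      while i < total
        #{source}
        i += 1
      end
    end
  RUBY
  runner
end

# Usage: the repeated operation is pasted directly into the loop body.
$calls = 0
counter = compile_loop("$calls += 1")
counter.call_times(5)
puts $calls

arithmetic = compile_loop("2 + 2;" * 1_000)
arithmetic.call_times(10) # runs 10 * 1_000 additions with no block calls
```

Because the string is repeated 1_000 times, the loop bookkeeping is paid once per thousand operations instead of once per operation.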
require 'benchmark/ips'

ARRAY = [*1..100]

Benchmark.ips do |x|
  x.report("Array#length") { ARRAY.length }
  x.report("Array#size") { ARRAY.size }
  x.report("Array#count") { ARRAY.count }
  x.compare!
end
Comparison:
Array#size: 8679483.2 i/s
Array#length: 8664450.7 i/s - 1.00x slower
Array#count: 7237299.5 i/s - 1.20x slower
The same benchmark measured with the described approach gives different numbers:
require 'benchmark/ips'

ARRAY = [*1..100]

Benchmark.ips do |x|
  x.report("Array#length", "ARRAY.length;" * 1_000)
  x.report("Array#size", "ARRAY.size;" * 1_000)
  x.report("Array#count", "ARRAY.count;" * 1_000)
  x.compare!
end
Comparison:
Array#size: 113902.4 i/s
Array#length: 113655.9 i/s - 1.00x slower
Array#count: 28753.4 i/s - 3.96x slower
Difference for Array#count: 1.20x slower vs 3.96x slower.
My guess is that this affects most of the benchmarks.
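The reported i/s figures can be converted into approximate time per operation to make the overhead visible. The sketch below uses the numbers quoted above and assumes that, in the string form, loop overhead spread over 1_000 repetitions is negligible. ns_per_op is a hypothetical helper.

```ruby
# Hypothetical helper: nanoseconds per single operation, given the reported
# iterations per second and how many operations each iteration contains.
def ns_per_op(ips, ops_per_iteration = 1)
  1e9 / (ips * ops_per_iteration)
end

# i/s values copied from the two Array benchmarks above.
block_form  = { "Array#size" => 8_679_483.2, "Array#count" => 7_237_299.5 }
string_form = { "Array#size" => 113_902.4,   "Array#count" => 28_753.4 }

block_form.each_key do |name|
  wrapped = ns_per_op(block_form[name])          # one call per iteration
  bare    = ns_per_op(string_form[name], 1_000)  # 1_000 calls per iteration
  puts format("%-12s wrapped: %6.1f ns  bare: %5.1f ns", name, wrapped, bare)
end
```

By this arithmetic, a wrapped call costs roughly 115 ns for Array#size and 138 ns for Array#count, while the bare calls cost roughly 9 ns and 35 ns. The absolute gap between the two methods is similar in both forms (about 23 ns vs 26 ns), which suggests the block form adds a roughly constant cost that dilutes the ratio.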