It appears to me that benchmark.cpp is not really measuring what you and I expect.
It calls std::string::size() for every hash call. This can be relatively expensive (probably due to the short-string optimisation). [The mean length of words in /usr/share/dict/words is ~10, which is a short-string on 64-bit.]
If I run your unmodified benchmark.cpp on my (aging) i7 Mac I get something like:
|wyhash |122.21 |13.28 |16.87 |
If I modify benchmark.cpp to first build then use a vector of std::string_view to avoid calling std::string::size() I get:
|wyhash |168.38 |13.10 |16.89 |
This is about 37% more short hashes/µs!
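The idea behind the change can be sketched roughly as follows. This is a minimal illustration and not the actual patch: `toy_hash` is a hypothetical stand-in (FNV-1a) for the hash under test, and `make_views` and `hash_all` are names invented for this example. The views are built once, outside the timed region, so the hot loop reads a pointer and a length directly from each `std::string_view`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the hash being benchmarked (64-bit FNV-1a).
// It is NOT wyhash; it only gives the loop below something to call.
inline std::uint64_t toy_hash(const void* data, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(data);
    std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 1099511628211ull;                  // FNV prime
    }
    return h;
}

// Build the views once, before timing starts. Each view stores the
// pointer and length of one word, so the timed loop never has to ask
// a std::string for its size.
inline std::vector<std::string_view> make_views(
        const std::vector<std::string>& words) {
    std::vector<std::string_view> views;
    views.reserve(words.size());
    for (const std::string& w : words) {
        views.emplace_back(w.data(), w.size());
    }
    return views;
}

// The part that would sit inside the timed region: hash every word
// and accumulate the results so the compiler cannot discard the work.
inline std::uint64_t hash_all(const std::vector<std::string_view>& views) {
    std::uint64_t acc = 0;
    for (std::string_view v : views) {
        acc += toy_hash(v.data(), v.size());
    }
    return acc;
}
```

The views only borrow the strings' storage, so the vector of `std::string` must outlive them and must not be modified while they are in use.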
Both were compiled with Clang using:
c++ benchmark.cpp -o benchmark -O3 -Wall -std=c++17 -march=corei7 -mavx -fno-stack-protector
But -std=c++11 works equally well.
Below is the (short) patch I used to change this.
The section from lines 50-60 is less essential, but does improve the bulk results.
Hello,
Thanks for writing wyhash.
BTW, if I'm reading the code correctly, the GB in the GB/s figures of the bulk results seems to be 1000 * 1024 * 1024 bytes, which is neither a GB (10^9 bytes) nor a GiB (2^30 bytes), but a thousand MiB.
Please feel free to use this patch how and if you like.
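To make the size of the discrepancy concrete, here is a small arithmetic sketch. It assumes the benchmark really does divide by 1000 * 1024 * 1024, as read from the code above; the constant and function names are invented for this example.

```cpp
#include <cstdint>

// The divisor the benchmark appears to use (an assumption based on
// reading the code): one thousand MiB.
constexpr std::uint64_t kMixedUnit = 1000ull * 1024 * 1024;  // 1,048,576,000 bytes
constexpr std::uint64_t kGB  = 1000ull * 1000 * 1000;        // decimal gigabyte
constexpr std::uint64_t kGiB = 1024ull * 1024 * 1024;        // binary gibibyte

static_assert(kMixedUnit == 1048576000ull, "1000 MiB in bytes");
static_assert(kGiB == 1073741824ull, "2^30 bytes");

// Convert a rate reported in the mixed unit into true GB/s.
constexpr double mixed_to_gb(double reported) {
    return reported * static_cast<double>(kMixedUnit) / static_cast<double>(kGB);
}

// Convert a rate reported in the mixed unit into GiB/s.
constexpr double mixed_to_gib(double reported) {
    return reported * static_cast<double>(kMixedUnit) / static_cast<double>(kGiB);
}
```

Under that assumption, the true decimal GB/s figure is about 4.9% higher than the reported number, and the GiB/s figure is about 2.3% lower.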
PS I also found it useful to add a small snippet immediately above line 99 (#ifndef XXH3). It allows adding something like -DREPS=5 to the compile command to ease testing.
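A compile-time override of this kind is usually written as a guarded macro. The sketch below is a guess at the general shape and not the original snippet: the default of 1000 and the `repeat` helper are assumptions made only for illustration.

```cpp
#include <cstddef>

// Hypothetical default, used only when the compile command does not
// pass -DREPS=<n>. The value 1000 is an assumption and is not taken
// from benchmark.cpp.
#ifndef REPS
#define REPS 1000
#endif

// Illustrative helper: run `body` REPS times and return how many
// iterations were executed.
template <typename F>
std::size_t repeat(F&& body) {
    std::size_t done = 0;
    for (std::size_t r = 0; r < static_cast<std::size_t>(REPS); ++r) {
        body();
        ++done;
    }
    return done;
}
```

Because of the `#ifndef` guard, a plain build keeps the default, while adding -DREPS=5 to the compile command shortens each run without editing the source.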