diff options
author | Shahriar Rouf <nafi@google.com> | 2024-02-07 13:58:56 -0800 |
---|---|---|
committer | Copybara-Service <copybara-worker@google.com> | 2024-02-07 13:59:46 -0800 |
commit | 3e59efa2ad1d1777257bd3b1845d5acc4a931687 (patch) | |
tree | 641c393b5a52f29e8cdf712804e0102c2395ae25 /absl/hash/internal/low_level_hash.h | |
parent | f4c713f55e0e7d12ae03204c027364dd87719e26 (diff) | |
download | abseil-3e59efa2ad1d1777257bd3b1845d5acc4a931687.tar.gz abseil-3e59efa2ad1d1777257bd3b1845d5acc4a931687.tar.bz2 abseil-3e59efa2ad1d1777257bd3b1845d5acc4a931687.zip |
Optimize `absl::Hash` by making `LowLevelHash` faster.
Throughput of the 64 byte chunk loop inside `LowLevelHash` (or now in `LowLevelHashLenGt16`) gets limited by the loop carried dependency on `current_state`. By using 4 states instead of 2, we can reduce this duration by 1 cycle. On Skylake, it is reduced from 9 cycles to 8 cycles (12.5% faster asymptotically).
To see the reduction in a simplified version of `LowLevelHash` implementation on Skylake:
* Before: https://godbolt.org/z/Tcj9vsGax, llvm-mca (https://godbolt.org/z/3o78Msr63) shows 9 cycles / iteration.
* After: https://godbolt.org/z/q4GM4EjPr, llvm-mca (https://godbolt.org/z/W5d1KEMzq) shows 8 cycles / iteration.
* This CL is removing 1 xor (1 cycle) per iteration from the critical path.
A block for 32 byte chunk is also added.
Finally, just before returning, `Mix` is called 1 time instead of twice.
PiperOrigin-RevId: 605090653
Change-Id: Ib7517ebb8bef7484066cd14cf41a943953e93377
Diffstat (limited to 'absl/hash/internal/low_level_hash.h')
-rw-r--r-- | absl/hash/internal/low_level_hash.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/absl/hash/internal/low_level_hash.h b/absl/hash/internal/low_level_hash.h index 439968aa..d460e351 100644 --- a/absl/hash/internal/low_level_hash.h +++ b/absl/hash/internal/low_level_hash.h @@ -43,6 +43,10 @@ namespace hash_internal { uint64_t LowLevelHash(const void* data, size_t len, uint64_t seed, const uint64_t salt[5]); +// Same as above except the length must be greater than 16. +uint64_t LowLevelHashLenGt16(const void* data, size_t len, uint64_t seed, + const uint64_t salt[5]); + } // namespace hash_internal ABSL_NAMESPACE_END } // namespace absl |