| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
void* to T* is well defined
PiperOrigin-RevId: 648352837
Change-Id: I082cd0c007706ae8baa8f26cdc85d51b69bffd54
|
|
|
|
|
|
|
|
|
| |
`absl::erase_if`.
Since we have potential plans to use this function more widely including `absl::c_for_each`, we need to have good error detection.
PiperOrigin-RevId: 647236725
Change-Id: I5035bfb8cef24f80f1bbed83a42380e57d84e428
|
|
|
|
|
|
|
|
|
| |
IterateOverFullSlots.
We decided to not allow reentrance in absl::erase_if and absl::container_internal::c_for_each_fast.
PiperOrigin-RevId: 645273965
Change-Id: I75dfc73b93ba10f0e051bf0833723af887e1bb36
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function is aimed to achieve faster iteration through the entire hash table.
It is not ready to be used by the public and stays in `container_internal` namespace.
Differences with `absl::c_for_each`:
1. No guarantees on order of iteration. Although for the hash table it is partially not guaranteed already. But we do not even guarantee that it is the same order as in the for loop range. De facto, the order is the same at the moment.
2. No mutating reentrance is allowed. Most notably erasing from the hash_table is not allowed.
Based on microbenchmarks, there are following conclusions:
1. c_for_each_fast is clearly faster on big tables with 20-60% speedup.
2. Microbenchmarks show regression on a full small table without any empty slots.
We should avoid recommending that for small tables.
3. It seems reasonable to use `c_for_each_fast` in places, where `skip_empty_or_deleted` has significant GCU usage. `skip_empty_or_deleted` usage signals that there are "gaps" between elements, so `c_for_each_fast` should be an improvement.
PiperOrigin-RevId: 645142512
Change-Id: I279886b8c8b2545504c2bf7e037d27b2545e044d
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Predicates should generally have no side effects since we do not guarantee the order or quantity for the calls.
In this change we forbid one specific side effect: modification of the table we are iterating over.
As a positive effect we have performance improvements (less computations and less branches).
```
name old cpu/op new cpu/op delta
BM_EraseIf/num_elements:10/num_erased:0 3.02ns ± 2% 2.79ns ± 3% -7.44% (p=0.000 n=35+37)
BM_EraseIf/num_elements:1000/num_erased:0 2.41ns ± 5% 2.05ns ± 4% -14.88% (p=0.000 n=37+37)
BM_EraseIf/num_elements:10/num_erased:5 4.40ns ± 3% 4.22ns ± 3% -4.19% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:500 9.16ns ± 4% 9.13ns ± 3% ~ (p=0.307 n=37+37)
BM_EraseIf/num_elements:10/num_erased:10 5.77ns ± 3% 5.50ns ± 4% -4.62% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:1000 7.84ns ± 3% 7.77ns ± 3% -0.94% (p=0.006 n=37+35)
name old time/op new time/op delta
BM_EraseIf/num_elements:10/num_erased:0 3.02ns ± 2% 2.79ns ± 3% -7.48% (p=0.000 n=35+36)
BM_EraseIf/num_elements:1000/num_erased:0 2.41ns ± 5% 2.05ns ± 4% -14.89% (p=0.000 n=37+37)
BM_EraseIf/num_elements:10/num_erased:5 4.42ns ± 3% 4.23ns ± 3% -4.22% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:500 9.18ns ± 4% 9.15ns ± 3% ~ (p=0.347 n=37+37)
BM_EraseIf/num_elements:10/num_erased:10 5.79ns ± 3% 5.52ns ± 4% -4.61% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:1000 7.87ns ± 3% 7.79ns ± 3% -0.95% (p=0.007 n=37+35)
name old INSTRUCTIONS/op new INSTRUCTIONS/op delta
BM_EraseIf/num_elements:10/num_erased:0 14.9 ± 0% 12.9 ± 0% -13.46% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:0 12.7 ± 0% 10.3 ± 0% -18.76% (p=0.000 n=37+37)
BM_EraseIf/num_elements:10/num_erased:5 30.9 ± 0% 28.9 ± 0% -6.48% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:500 37.6 ± 0% 35.3 ± 0% -6.33% (p=0.000 n=37+37)
BM_EraseIf/num_elements:10/num_erased:10 46.9 ± 0% 44.9 ± 0% -4.27% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:1000 62.6 ± 0% 60.2 ± 0% -3.80% (p=0.000 n=37+36)
name old CYCLES/op new CYCLES/op delta
BM_EraseIf/num_elements:10/num_erased:0 4.91 ± 1% 4.11 ± 1% -16.35% (p=0.000 n=36+35)
BM_EraseIf/num_elements:1000/num_erased:0 7.74 ± 2% 6.54 ± 2% -15.54% (p=0.000 n=37+37)
BM_EraseIf/num_elements:10/num_erased:5 9.18 ± 3% 8.45 ± 3% -7.88% (p=0.000 n=37+35)
BM_EraseIf/num_elements:1000/num_erased:500 29.5 ± 1% 29.3 ± 1% -0.82% (p=0.000 n=36+37)
BM_EraseIf/num_elements:10/num_erased:10 13.5 ± 1% 12.6 ± 0% -7.06% (p=0.000 n=33+34)
BM_EraseIf/num_elements:1000/num_erased:1000 25.1 ± 0% 24.9 ± 0% -0.90% (p=0.000 n=37+35)
```
PiperOrigin-RevId: 642318040
Change-Id: I78a4a5a9a5881db0818225f9c7c153c562009f66
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`EraseIf` itself is not very hot function, but I want to use that as demonstration of the speed of iteration via `IterateOverFullSlots`.
Motivation:
1. We are still going to save some resources.
2. It is the first step to implement faster `absl::c_for_each` that may give larger benefits. Will require readability considerations.
`BM_EraseIf/num_elements:1000/num_erased:0` is just iteration and it gives 60% speed up.
On smaller tables removing all elements shows 25% speed up. Note that on small tables erasing is much faster due to cl/592272653.
```
name old cpu/op new cpu/op delta
BM_EraseIf/num_elements:10/num_erased:0 3.41ns ± 5% 3.03ns ± 3% -11.14% (p=0.000 n=37+35)
BM_EraseIf/num_elements:1000/num_erased:0 6.06ns ± 3% 2.42ns ± 3% -60.05% (p=0.000 n=34+37)
BM_EraseIf/num_elements:10/num_erased:5 5.90ns ± 3% 4.44ns ± 4% -24.88% (p=0.000 n=36+37)
BM_EraseIf/num_elements:1000/num_erased:500 11.0ns ± 2% 9.2ns ± 2% -16.60% (p=0.000 n=35+37)
BM_EraseIf/num_elements:10/num_erased:10 8.03ns ± 3% 5.77ns ± 2% -28.19% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:1000 9.00ns ± 3% 7.83ns ± 2% -12.98% (p=0.000 n=37+37)
name old time/op new time/op delta
BM_EraseIf/num_elements:10/num_erased:0 3.42ns ± 5% 3.04ns ± 3% -11.13% (p=0.000 n=37+36)
BM_EraseIf/num_elements:1000/num_erased:0 6.07ns ± 3% 2.42ns ± 3% -60.10% (p=0.000 n=34+37)
BM_EraseIf/num_elements:10/num_erased:5 5.93ns ± 3% 4.45ns ± 4% -24.89% (p=0.000 n=36+37)
BM_EraseIf/num_elements:1000/num_erased:500 11.1ns ± 2% 9.2ns ± 2% -16.61% (p=0.000 n=35+37)
BM_EraseIf/num_elements:10/num_erased:10 8.06ns ± 3% 5.79ns ± 2% -28.19% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:1000 9.03ns ± 3% 7.85ns ± 2% -12.98% (p=0.000 n=37+37)
name old INSTRUCTIONS/op new INSTRUCTIONS/op delta
BM_EraseIf/num_elements:10/num_erased:0 19.5 ± 1% 14.9 ± 0% -23.79% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:0 19.9 ± 0% 12.7 ± 0% -36.20% (p=0.000 n=37+37)
BM_EraseIf/num_elements:10/num_erased:5 36.9 ± 1% 30.9 ± 0% -16.47% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:500 44.8 ± 0% 37.6 ± 0% -16.06% (p=0.000 n=37+37)
BM_EraseIf/num_elements:10/num_erased:10 53.0 ± 1% 46.9 ± 0% -11.61% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:1000 69.8 ± 0% 62.6 ± 0% -10.32% (p=0.000 n=36+37)
name old CYCLES/op new CYCLES/op delta
BM_EraseIf/num_elements:10/num_erased:0 6.10 ± 7% 4.91 ± 1% -19.49% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:0 19.4 ± 1% 7.7 ± 2% -60.04% (p=0.000 n=37+37)
BM_EraseIf/num_elements:10/num_erased:5 13.9 ± 4% 9.2 ± 3% -33.80% (p=0.000 n=37+37)
BM_EraseIf/num_elements:1000/num_erased:500 35.5 ± 0% 29.5 ± 1% -16.74% (p=0.000 n=37+37)
BM_EraseIf/num_elements:10/num_erased:10 20.8 ± 5% 13.5 ± 0% -35.07% (p=0.000 n=37+30)
BM_EraseIf/num_elements:1000/num_erased:1000 28.9 ± 0% 25.1 ± 0% -13.06% (p=0.000 n=37+37)
```
PiperOrigin-RevId: 642016364
Change-Id: I8be6af5916bd45fd110bb0398c3ffe932a6a083f
|
|
|
|
|
|
|
|
|
| |
k2) -> hash(k1) == hash(k2)`.
Also add missing includes/dependencies in flat_hash_map_test.
PiperOrigin-RevId: 640959222
Change-Id: I8d99544af05e97310045e6149f6ef6f7c82e552d
|
|
|
|
|
| |
PiperOrigin-RevId: 640620462
Change-Id: I72c0a7f0f549404ad8310b8cebd7dd491341390b
|
|
|
|
|
| |
PiperOrigin-RevId: 640359514
Change-Id: Ic321d23cad283425c686536ae7be04a490a950d5
|
|
|
|
|
| |
PiperOrigin-RevId: 640208455
Change-Id: I4c2b7d3f1adad2078e8a5f805574f71c4d7743d4
|
|
|
|
|
|
|
|
|
| |
for "some" standard containers.
If you use this idiom with `std::vector` or `absl::btree_map` you can end up either skipping elements or dereferencing an invalid iterator.
PiperOrigin-RevId: 639892758
Change-Id: Ic5c213667b4b1e8c39813ee237aaffe320a8eb27
|
|
|
|
|
| |
PiperOrigin-RevId: 636218177
Change-Id: I9f58ccbb468fcc0c44ef12162415f7b721a745bf
|
|
|
|
|
|
|
|
|
| |
This significantly reduces binary size of big binaries and creates a single hot function instead of many cold. That is decreasing cash misses during code execution.
We also avoid size related computations for tables with no deleted slots, when resize is necessary.
PiperOrigin-RevId: 635527119
Change-Id: I763b135f1f6089051e62e348a07b33536af265ab
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This CL contains two optimizations that were measured together.
1) InsertMiss (i. e. successful insert) optimization:
The idea that in case there is no kDeleted, we already know 99% of the information. So we are finding the position to insert with 2 asm instructions (or 3 in case of ARM or portable) and passing that as a hint into `prepare_insert`.
`prepare_insert` is out of the line in order to minimize effect on InsertHit (the most important case). `prepare_insert` may use the hint in case we still have growth and no kDeleted is guaranteed.
In case of kDeleted, we still call `find_first_non_full` in order to potentially find kDeleted slot earlier. We may consider different ways to do it faster for kDeleted later.
2) `find_first_non_full` optimization:
After optimization #1 `find_first_non_full` is used:
1. during resize and copy
2. right after resize
3. during DropDeletedWithoutResize
3. in InsertMiss for tables with kDeleted
In cases 1-3 the table is quite sparse.
1. After resize it is 7/16 sparse
2. During resize it is 7/16 maximum, but during first inserts it is much sparser.
3. During copy it may be up to 7/8, but at the beginning it is way sparser.
4. During DropDeletedWithoutResize it is 25/32 sparse, but at the beginning it is way sparser.
The only case, where the table is not known to be sparse, with `find_first_non_full` usage is a table with deleted elements. But according to hashz, we have <3% such tables. Adding an extra branch wouldn't hurt much there.
PiperOrigin-RevId: 619681885
Change-Id: Id3e2852cc6d85f6c8f90982d7aeb14799696bf39
|
|
|
|
|
| |
PiperOrigin-RevId: 619649335
Change-Id: I8b3380816418a363fb6686db7966248cb530c491
|
|
|
|
|
|
|
| |
The motivation is to use presence of kDeleted slots for optimizing InsertMiss for tables without kDeleted slots.
PiperOrigin-RevId: 619411282
Change-Id: Icc30606374aba7ce60b41f93ee8d44894e1b8aa5
|
|
|
|
|
|
| |
Motivation: there are new uninitialized memory warnings when we enable small object optimization.
PiperOrigin-RevId: 619295212
Change-Id: If10762bab0e43c9619fc03f6d1eef5b8836bbf9a
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Often code needs to know that it's built with sanitizers.
There are two common use for such information:
1. Work around incompatibility with sanitizers
2. Use sanitizers for more aggressive bug detection
With the current `ABSL_HAVE_*_SANITIZER` we can't distinguish
this two cased, and we didn't need that before.
Now user can define `NDEBUG_SANITIZER` to ask code like this
to avoid unnecessary checks.
I am not 100% sure that `NDEBUG` is not enough.
However relying on `NDEBUG` today will relax many tests, which
runs in NDEBUG mode only. So new `NDEBUG_SANITIZER` is safer
approach.
PiperOrigin-RevId: 619268413
Change-Id: I58185cd6886593a3742b8424deccdec366c2a35a
|
|
|
|
|
|
|
|
| |
This can identify situations where flat_hash_* is suboptimal for large
elements.
PiperOrigin-RevId: 618937993
Change-Id: I2bde069bc3526b14ad1718ba6f50467002aeed16
|
|
|
|
|
|
|
|
|
| |
When sampling triggers, we skip SOO and allocate a backing array. We must do this because the HashtablezInfoHandle is part of the heap allocation (along with the control bytes and slots). By default, we sample 1 in ~1024 hashtables when sampling is enabled. This will impact performance because (a) we won't benefit from SOO so we would have worse data locality (more cache/TLB misses), and (b) the backing array capacity will be 3 instead of 1 so (b.1) we skip the rehash after the second insertion and (b.2) we potentially waste up to two slots worth of memory.
We also add an soo_capacity field to HashtablezInfo to allow for distinguishing which sampled tables may otherwise have been SOO - this will allow us to know approximately what fraction of tables are in SOO mode.
PiperOrigin-RevId: 617252334
Change-Id: Ib48b7a4870bd74ea3ba923ed8f350a3b75dbb7d3
|
|
|
|
|
|
|
| |
increases in shared libraries.
PiperOrigin-RevId: 615497725
Change-Id: Ic29db8923ea4ea7cd0b01b396896fa9fff8c74b0
|
|
|
|
|
|
|
|
| |
Motivation: mitigate linker input size increase from swisstable optimizations.
Note: the changes in raw_hash_set.h are fixing build errors that happened when adding the explicit instantiations. The change in unchecked_deref is because set iterators have const reference access whereas map iterators have mutable reference access and the function is never actually called for sets (it's used in raw_hash_map) so it wasn't needed before. I'm not sure why the soo_slot/soo_iterator problems didn't cause compile errors earlier.
PiperOrigin-RevId: 615174043
Change-Id: Iac5eb2332a76e9b70021156fbb2b8def47a5391d
|
|
|
|
|
| |
PiperOrigin-RevId: 615131303
Change-Id: I68fcbdd943594983c67f8e07810b05d5fa9a6f2e
|
|
|
|
|
| |
PiperOrigin-RevId: 614701769
Change-Id: I7c2143dd467e376eb4936ef894f3413bba681419
|
|
|
|
|
| |
PiperOrigin-RevId: 613590317
Change-Id: I69f095681102e5492916085ada0eed085a75765b
|
|
|
|
|
| |
PiperOrigin-RevId: 613305668
Change-Id: Ifc247f48ea476745eaaf0dd41dbdab8404a6cafb
|
|
|
|
|
|
|
|
|
| |
Details:
- We use the space for control/slots pointers as the inline buffer.
- We use a max inline capacity of 1 to make the implementation much simpler and to avoid having to randomize the iteration order for inline tables.
- For iteration of inline tables, we introduce the kSooControl buffer which just has 1 full control byte followed by 1 sentinel control byte so that incrementing yields an end() iterator. We don't access kSooControl during lookups - only iteration.
PiperOrigin-RevId: 613253492
Change-Id: Id98ff11842f8bef27ac7ed88138dc03b46ce4fa6
|
|
|
|
|
| |
PiperOrigin-RevId: 612560213
Change-Id: Id75dfd1222a0bed8ec72ce21e4a97b1d09fc9eaa
|
|
|
|
|
|
|
| |
Instead of casting an int to the enum type where the int does not have an associated enum value, cast the enum to its underlying type. This should be no functional change but make some linters happier.
PiperOrigin-RevId: 611172311
Change-Id: I9ae10f8fa2029014236f60a90ee2ab2273c66fa5
|
|
|
|
|
|
|
| |
unnecessary probing before resize that is beneficial for small tables the most.
PiperOrigin-RevId: 609547787
Change-Id: If6584919b4c93945ea078b1c1a9f57b355dce924
|
|
|
|
|
|
|
| |
return type of insert.
PiperOrigin-RevId: 609058024
Change-Id: I2f7cc2daf862e7e2d23acd6dd3fe85cb1945d5f0
|
|
|
|
|
|
|
|
| |
raw_hash_set::find(key).
Motivation: the implementation becomes more complicated with small object optimization.
PiperOrigin-RevId: 608742838
Change-Id: I55fc42321b1967f9c7bbee49817a2f2d4ee44b56
|
|
|
|
|
|
|
|
| |
It can be used instead of `MaskEmptyOrDeleted` in case of inserting to empty table.
`MaskNonFull` requires less operations, in particular it eliminates `_mm_set1_epi8` and `_mm_cmpgt_epi8` operations.
PiperOrigin-RevId: 607587394
Change-Id: Ia48f922d1ca6de38cc91e7ab0d608c45f8f2c446
|
|
|
|
|
| |
PiperOrigin-RevId: 605460827
Change-Id: I57007a7ad18829e7bfed27ba65871afbd227d012
|
|
|
|
|
|
|
| |
`IterateOverFullSlots` for iterating for all tables.
PiperOrigin-RevId: 605116090
Change-Id: Ia65c664421f7630225b00f1c45c636b4681121ce
|
|
|
|
|
|
|
|
|
|
| |
There are three optimizations here:
1. Early exit in case all slots were destroyed. especially useful for empty tables.
2. MatchFull is used in order to iterate over all full slots.
3. Portable group is used for `MatchFull` in the case of small table.
PiperOrigin-RevId: 603434899
Change-Id: I40bc90d17331d579cfbb1b4e0693f0913e5c38e4
|
|
|
|
|
| |
PiperOrigin-RevId: 603148301
Change-Id: Ie2e5702995c9e1ef4d5aaab23bc89a1eb5007a86
|
|
|
|
|
| |
PiperOrigin-RevId: 602974812
Change-Id: Ic35b41e321b9456a8ddd83470ee2eb07c51e3180
|
|
|
|
|
|
|
| |
flat_hash_{*}.
PiperOrigin-RevId: 602813933
Change-Id: I744fe438281755a141b2fd47e54ab9c6c0fad5a3
|
|
|
|
|
| |
PiperOrigin-RevId: 602485199
Change-Id: I5cc10eb8dcfe5988cf939080a224522e02ad8607
|
|
|
|
|
|
|
| |
dereference of iterator returned by `try_emplace`.
PiperOrigin-RevId: 597920257
Change-Id: I1b2e8f10a2f1efa763a6f0760294beafdb6fd9c0
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
with ABSL_HAVE_HWADDRESS_SANITIZER.
It will detect bugs similar to Asan.
Also updated related tests to pass with HWASAN.
They are still flaky because of nature of HWASAN algorithm,
but test can be update to avoid flakiness, which I will do
in followup patches.
PiperOrigin-RevId: 597613798
Change-Id: Ic8af36a268ca041c002eb561b946aa2d9b93996a
|
|
|
|
|
|
|
| |
removing the iterator invalidation check from the comparison that contains performs.
PiperOrigin-RevId: 595460301
Change-Id: I9a5d6c81385e38184f4848c58209adc5d32bb7be
|
|
|
|
|
| |
PiperOrigin-RevId: 592272653
Change-Id: I895c5786555227bdc88ab0a4cce8cf5ba65222a1
|
|
|
|
|
|
|
|
|
| |
It is not used at the moment. Usage is planned to be submitted separately.
Useful for faster iterating over all slots in the internal functions.
PiperOrigin-RevId: 590300049
Change-Id: I081f33113268761db868771d29796d94c24e4e7a
|
|
|
|
|
|
|
|
|
| |
Details:
- In case the table entirely fits into a single group size (`capacity <= Group::kWidth`), the order of elements is not important.
- For growing upto Group::kWidth we rotate control bytes and slots deterministically (using memcpy).
- We also avoid second find_first_non_full right after resize for small growing.
PiperOrigin-RevId: 588825966
Change-Id: I09bd7fd489e3868dcf56c36b436805d08dae7ab5
|
|
|
|
|
|
|
| |
disabled. This will make it easier to roll back and forwards in the future (if needed) without causing merge conflicts in unrelated code.
PiperOrigin-RevId: 582059046
Change-Id: I66dc6527e7a0b351367b7a391c2d653fe793143f
|
|
|
|
|
| |
PiperOrigin-RevId: 580726428
Change-Id: I12b0f22c2084aef90bcca67536220a6bb550b57e
|
|
|
|
|
|
|
|
|
| |
This allows for avoiding e.g. it.inner_.slot_ on const iterators.
Also, refactor HashtableDebugAccess::AllocatedByteSize to use regular iteration instead of looking directly at control/slot_array of the table in order to support small object optimization.
PiperOrigin-RevId: 580194253
Change-Id: I64cd69287834ee5c7a8daf867c532258806bfb7b
|
|
|
|
|
|
|
| |
that may have been invalidated by a container move.
PiperOrigin-RevId: 578649798
Change-Id: Icfee98d3a0399b545ec6ec59c5b52ae5e006218b
|