wip: Optimize BitBuffer methods across the board#7375
Open
joseph-isaacs wants to merge 4 commits intodevelopfrom
Open
wip: Optimize BitBuffer methods across the board#7375joseph-isaacs wants to merge 4 commits intodevelopfrom
joseph-isaacs wants to merge 4 commits intodevelopfrom
Conversation
Key optimizations: - `Not for &BitBuffer`: allocate directly instead of clone+try_into_mut which always failed (clone shares Arc, so 2 refs = failure) → 27-58% faster bitwise NOT on references - `iter_bits()`: process u64 chunks instead of byte-by-byte → 13% faster iteration - `PartialEq`: fast path using memcmp for byte-aligned buffers - `append_buffer()`: memcpy fast path when both sides are byte-aligned → 20-52% faster buffer appends - `append_false()`: remove unnecessary branch (new bytes are zero-init) → 65% faster single-bit appends - `from_indices()`: use set_bit_unchecked directly on the byte slice - `FromIterator` tail: batch remaining items in u64 words → 13% faster from_iter - `sliced()`: use bitwise_unary_op_copy to avoid clone+fail path, fix byte range bug in aligned path - `filter_bitbuffer_by_indices`: detect contiguous runs for bulk copy - `filter_bitbuffer_by_slices`: use slice()+append_buffer() instead of per-bit get+append - Add #[inline] to hot methods: set, set_to, append, true_count, etc. Signed-off-by: Claude <noreply@anthropic.com> https://claude.ai/code/session_0163XH8LLYAkU2qNQGbmYjhB
Merging this PR will improve performance by ×2.4
Performance Changes
Comparing Footnotes
|
BitBuffers almost always have shared backing storage (from slicing, array construction, etc.), so try_into_mut nearly always fails. The owned `Not for BitBuffer` now uses the same direct-copy path as the reference version, and the dead `bitwise_unary_op` function is removed. For true in-place mutation, use `BitBufferMut` directly. Signed-off-by: Claude <noreply@anthropic.com> https://claude.ai/code/session_0163XH8LLYAkU2qNQGbmYjhB
Keep the in-place mutation fast path for owned BitBuffer when the backing storage has exclusive access (Arc refcount == 1). When try_into_mut fails (shared storage), delegate to bitwise_unary_op_copy instead of duplicating the copy logic. Signed-off-by: Claude <noreply@anthropic.com> https://claude.ai/code/session_0163XH8LLYAkU2qNQGbmYjhB
The scan loop pattern `mask = mask.bitand(&conjunct_mask)` previously always allocated a new buffer because all binary ops took references. Now owned-left operands try try_into_mut for zero-allocation in-place mutation when the backing storage has exclusive access. Changes: - Add bitwise_binary_op_lhs_owned in ops.rs: tries in-place on owned left, falls back to allocating bitwise_binary_op - Wire BitAnd/BitOr/BitXor owned-left impls to use in-place path - Add BitBuffer::into_bitand_not for owned variant - Add BitAnd<&Mask>/BitOr<&Mask> for Mask: extracts owned BitBuffer for in-place binary ops in the scan loop - Update Mask::bitand_not to use owned BitBuffer path - Fix flat/zoned readers to capture density before consuming mask Signed-off-by: Claude <noreply@anthropic.com> https://claude.ai/code/session_0163XH8LLYAkU2qNQGbmYjhB
robert3005
reviewed
Apr 9, 2026
| /// Tries `try_into_mut` on the left operand. If the backing storage has exclusive access, | ||
| /// the operation is performed in-place (zero allocation). Otherwise, falls back to | ||
| /// [`bitwise_binary_op`] which allocates a new buffer. | ||
| pub(super) fn bitwise_binary_op_lhs_owned<F: FnMut(u64, u64) -> u64>( |
Contributor
There was a problem hiding this comment.
maybe this should just be a method on the bitbuffer? Not sure why I made these separate functions
robert3005
reviewed
Apr 9, 2026
| let dst_bit_offset = start_bit_pos % 8; | ||
| let src_bit_offset = buffer.offset(); | ||
|
|
||
| if dst_bit_offset == 0 && src_bit_offset == 0 { |
Contributor
There was a problem hiding this comment.
I don't think src offset matters here, you care that src end is byte aligned
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Key optimizations:
Not for &BitBuffer: allocate directly instead of clone+try_into_mutwhich always failed (clone shares Arc, so 2 refs = failure)
→ 27-58% faster bitwise NOT on references
iter_bits(): process u64 chunks instead of byte-by-byte→ 13% faster iteration
PartialEq: fast path using memcmp for byte-aligned buffersappend_buffer(): memcpy fast path when both sides are byte-aligned→ 20-52% faster buffer appends
append_false(): remove unnecessary branch (new bytes are zero-init)→ 65% faster single-bit appends
from_indices(): use set_bit_unchecked directly on the byte sliceFromIteratortail: batch remaining items in u64 words→ 13% faster from_iter
sliced(): use bitwise_unary_op_copy to avoid clone+fail path,fix byte range bug in aligned path
filter_bitbuffer_by_indices: detect contiguous runs for bulk copyfilter_bitbuffer_by_slices: use slice()+append_buffer() insteadof per-bit get+append
Signed-off-by: Claude noreply@anthropic.com
https://claude.ai/code/session_0163XH8LLYAkU2qNQGbmYjhB