Add optional query bit size hint to KnnSearchStrategy.Hnsw #15708

Open
arup-chauhan wants to merge 3 commits into apache:main from arup-chauhan:query-bit-hint-strategy

@arup-chauhan

This PR introduces an optional query bit-size hint for KnnSearchStrategy.Hnsw as a first incremental step toward #15614.

I intentionally limited the scope to search-strategy API plumbing and tests to keep risk low.
Default behavior remains unchanged: the new hint is metadata-only in this PR and does not alter scoring yet.

Changes

  • Added optional queryBitSizeHint to KnnSearchStrategy.Hnsw.
  • Kept backward compatibility by preserving the existing constructor and adding an overload:
    • Hnsw(int filteredSearchThreshold)
    • Hnsw(int filteredSearchThreshold, Integer queryBitSizeHint)
  • Added validation for the hint (> 0 when non-null).
  • Added getter: queryBitSizeHint().
  • Updated equals / hashCode for Hnsw to include the new hint.
  • Updated KnnSearchStrategy.Patience constructors to support hint passthrough.
  • Updated HnswQueueSaturationCollector#getSearchStrategy() to preserve and forward the hint when wrapping strategies.
  • Added new tests in TestKnnSearchStrategy to cover:
    • default/no-hint behavior
    • constructor with hint
    • hint validation
    • equals/hashCode behavior
    • hint preservation across seeded/patience wrapping paths
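
The constructor overload, validation, and equality changes listed above can be sketched roughly as follows. This is a minimal standalone illustration, not the code from this PR: the class name `HnswStrategySketch` is hypothetical, and any range check on `filteredSearchThreshold` is omitted.

```java
import java.util.Objects;

/** Hypothetical stand-in for a strategy class with an optional hint; not the Lucene source. */
final class HnswStrategySketch {
  private final int filteredSearchThreshold;
  private final Integer queryBitSizeHint; // null means "no hint supplied"

  /** Existing-style constructor: delegates with no hint, so old callers keep working. */
  HnswStrategySketch(int filteredSearchThreshold) {
    this(filteredSearchThreshold, null);
  }

  /** New overload: the hint is optional, but must be positive when present. */
  HnswStrategySketch(int filteredSearchThreshold, Integer queryBitSizeHint) {
    if (queryBitSizeHint != null && queryBitSizeHint <= 0) {
      throw new IllegalArgumentException(
          "queryBitSizeHint must be > 0 when non-null, got: " + queryBitSizeHint);
    }
    this.filteredSearchThreshold = filteredSearchThreshold;
    this.queryBitSizeHint = queryBitSizeHint;
  }

  int filteredSearchThreshold() {
    return filteredSearchThreshold;
  }

  /** Returns the hint, or null when none was supplied. */
  Integer queryBitSizeHint() {
    return queryBitSizeHint;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof HnswStrategySketch)) return false;
    HnswStrategySketch other = (HnswStrategySketch) o;
    // Objects.equals handles the null (no hint) case on either side.
    return filteredSearchThreshold == other.filteredSearchThreshold
        && Objects.equals(queryBitSizeHint, other.queryBitSizeHint);
  }

  @Override
  public int hashCode() {
    return Objects.hash(filteredSearchThreshold, queryBitSizeHint);
  }
}
```

Using a nullable `Integer` rather than a sentinel value keeps "no hint" distinct from any valid bit size, at the cost of a null check wherever the hint is read.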

Validation

  • ./gradlew -p lucene/core compileJava
  • ./gradlew -p lucene/core test --tests TestKnnSearchStrategy
  • ./gradlew -p lucene/core check

Only tweak I made: removed “(after formatting fix)” since it passes now.

Signed-off-by: Arup Chauhan <arupchauhan.connect@gmail.com>
@arup-chauhan
Author

Hey @benwtrent

Implemented the first incremental step by adding an optional queryBitSizeHint to KnnSearchStrategy.Hnsw.

I’ve preserved backward compatibility, forwarded the hint through Patience / seeded wrapping paths, and added focused strategy tests. This PR intentionally does not change scoring behavior.

Next, I plan to wire the hint into the vector reader/scorer plumbing, and then follow up with a small PR that uses the hint for query-bit-aware behavior in the scalar-quantized scoring path.

Looking forward to your feedback!

Signed-off-by: Arup Chauhan <arupchauhan.connect@gmail.com>
Member

@benwtrent benwtrent left a comment

I wonder what @mccullocht thinks? I know when we were discussing the new scalar formats, we stuck to a static number of query bits for doc vectors for simplicity.

I am on the fence about the complexity this potentially adds for users. Basically, my concern is that certain datasets work very well with just single-bit or 2-bit queries against single-bit quantized vectors. This gives 2x or 4x better vector ops throughput with almost zero change in recall.

I also think we want the ability to "refine" the query scores by increasing the bits on a subset of vectors, which hopefully users don't have to access directly.

Comment thread lucene/CHANGES.txt Outdated
Signed-off-by: Arup Chauhan <arupchauhan.connect@gmail.com>
@github-actions github-actions bot added this to the 10.5.0 milestone Feb 19, 2026
@mccullocht
Contributor

This API is probably fine for the described purpose but I'm skeptical about how useful this will be. Recall improvements diminish pretty quickly when increasing the query bit rate without increasing the doc bit rate. I'm optimistic that we could do more to improve recall and performance without exposing this kind of parameter.

To obey the proposed API we would need to be able to compare two vectors of different bit rates for any pair of bit rates up to, say, 8 bits/dim. Up to somewhere around 4-8 comparisons/dimension the transpose + popcount strategy that we employ for bit and dibit works, but once the number of comparisons grows larger than that it starts to become cheaper to perform a dot product, and how well that will work depends a lot on how the vectors are packed. The current 1-bit packing scheme in particular would be difficult to compare to other bit rate vectors because of how hard it would be to unpack into the same dimension order as something else. This problem also exists if you look at extending the doc vector with quantized residual as described in the LVQ paper.
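
As a rough illustration of the transpose + popcount strategy mentioned above, the sketch below computes a dot product between a 1-bit doc vector and a multi-bit query by splitting the query into bit planes. It is not Lucene's implementation: the class and method names are hypothetical, and it assumes both vectors are packed in the same simple dimension order (dimension `d` lives in bit `d % 64` of word `d / 64`), which is exactly the assumption that becomes hard to satisfy with other packing schemes.

```java
/** Hypothetical sketch of a bit-plane ("transpose + popcount") dot product. */
final class BitPlaneDot {

  /**
   * Transposes per-dimension values (each in [0, 2^bits)) into bit planes:
   * plane j holds bit j of every dimension, packed 64 dimensions per long.
   */
  static long[][] transpose(int[] values, int bits) {
    int words = (values.length + 63) / 64;
    long[][] planes = new long[bits][words];
    for (int d = 0; d < values.length; d++) {
      for (int j = 0; j < bits; j++) {
        if (((values[d] >>> j) & 1) != 0) {
          planes[j][d >>> 6] |= 1L << (d & 63);
        }
      }
    }
    return planes;
  }

  /**
   * Dot product of a 1-bit doc vector with a multi-bit query. Each plane costs one
   * AND + popcount pass, so the work grows with docBits * queryBits per dimension.
   */
  static long dot(long[] docBits, long[][] queryPlanes) {
    long sum = 0;
    for (int j = 0; j < queryPlanes.length; j++) {
      long count = 0;
      for (int w = 0; w < docBits.length; w++) {
        count += Long.bitCount(docBits[w] & queryPlanes[j][w]);
      }
      sum += count << j; // plane j carries weight 2^j
    }
    return sum;
  }

  /** Reference implementation over unpacked values, for checking the packed version. */
  static long naiveDot(int[] doc, int[] query) {
    long sum = 0;
    for (int d = 0; d < doc.length; d++) {
      sum += (long) doc[d] * query[d];
    }
    return sum;
  }
}
```

With a 1-bit doc and a 2-bit query this is two popcount passes. Pairing, say, 4-bit docs with 8-bit queries under the same scheme would need 32 passes, which is the point where a direct dot product starts to look cheaper.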

I have another idea that is inspired by placing statistical bounds on estimated distance as described in the RaBitQ paper -- the idea is that if a minSimilarity parameter was passed to score() the scorer might be able to eliminate certain candidates after examining only 1 bit of a 4 bit query vector. I'll file an issue for this once I have a better handle on the math.
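
To make the early-elimination idea concrete, here is a small sketch of a scorer that takes a minimum similarity and stops as soon as a candidate can no longer reach it. It is only an assumption-laden illustration: the class name and signature are hypothetical, and it uses a simple deterministic upper bound on the remaining bit planes rather than the statistical bounds from the RaBitQ paper. It assumes a 1-bit doc vector and a query already split into bit planes (plane `j` holds bit `j` of each dimension).

```java
import java.util.OptionalLong;

/** Hypothetical sketch: bit-plane scoring with early exit against a minimum similarity. */
final class EarlyExitScorerSketch {

  /**
   * Scores most-significant plane first. After each plane, the unseen lower planes can
   * add at most docOnes * (2^j - 1), so if even that cannot reach minSimilarity the
   * candidate is rejected without touching the remaining planes.
   */
  static OptionalLong score(long[] docBits, long[][] queryPlanes, long minSimilarity) {
    long docOnes = 0;
    for (long word : docBits) {
      docOnes += Long.bitCount(word);
    }
    long partial = 0;
    for (int j = queryPlanes.length - 1; j >= 0; j--) {
      long count = 0;
      for (int w = 0; w < docBits.length; w++) {
        count += Long.bitCount(docBits[w] & queryPlanes[j][w]);
      }
      partial += count << j;
      long remainingMax = docOnes * ((1L << j) - 1); // best case for planes below j
      if (partial + remainingMax < minSimilarity) {
        return OptionalLong.empty(); // cannot reach the threshold; stop early
      }
    }
    return OptionalLong.of(partial);
  }
}
```

A deterministic bound like this is loose, so it only prunes candidates that are far below the threshold. A tighter statistical bound would prune more, but it can occasionally reject a true match.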

@arup-chauhan
Author

arup-chauhan commented Feb 23, 2026

Hey @mccullocht @benwtrent,

Thanks, this is super helpful context.

I agree that recall gains from increasing query bits alone may taper quickly, and that cross-bit-rate comparisons can get expensive/packing-dependent.

In this PR, I only added metadata/API plumbing (no scoring behavior change yet), but your points are exactly the risks for follow-up use.

I’m happy to keep this scoped as incremental plumbing and treat any query-bit-aware scoring work as a follow-up, backed by evidence (benchmarks, recall, and complexity tradeoffs), potentially behind internal strategy decisions so users don’t need to tune low-level parameters directly.

The minSimilarity-based early-elimination idea sounds very promising. Looking forward to the issue.

@github-actions
Contributor

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Mar 10, 2026