perf(weave): env-gate calls_merged heavy-LIKE ngram path + add attributes_dump index by gtarpenning · Pull Request #6891 · wandb/weave

gtarpenning · 2026-05-19T21:13:30Z

Builds on #6695 (the heavy-LIKE candidate-CTE wiring) and on the parameter sweep spec at ~/Desktop/specs/attributes-dump-ngram-index.md.

Summary

- Adds the missing fourth heavy-dump index in migration 031: `idx_attributes_dump_ngram ifNull(attributes_dump, '') TYPE ngrambf_v1(5, 65536, 3, 0) GRANULARITY 8`. The other three columns (inputs/output/summary) are already covered by #6695 (perf(weave): wrap heavy-field LIKE filters in ifNull() and add ngram index).
- Gates the bloom-eligible code path behind WF_CALLS_MERGED_HEAVY_INDEXES. When the flag is unset, the LIKE optimization falls back to the pre-bloom shape (<col> LIKE ? plus outer OR-IS-NULL, no candidate CTE). The migration is therefore safe to apply ahead of the flag, and operators can flip it per cluster once the index has materialized on enough parts to be useful.
- Two gating points share one env helper:
  - optimization_builder._create_like_condition emits ifNull(<col>, '') only when the flag is on; otherwise it emits raw <table>.<col>.
  - calls_query_builder._build_filter_candidate_ids_cte_sql returns None when the flag is off, dropping the candidate CTE entirely.
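
The two gating points can be pictured with a minimal sketch. This is not the code in the PR: the helper name `heavy_indexes_enabled`, the truthy-value parsing, the `id` column, and the CTE text are all assumptions made for illustration. Only the flag name, the `ifNull(<col>, '')` versus raw-column shapes, and the "returns None when off" behavior come from the description above.

```python
import os
from typing import Optional

ENV_FLAG = "WF_CALLS_MERGED_HEAVY_INDEXES"


def heavy_indexes_enabled() -> bool:
    """Hypothetical shared helper; the accepted truthy values are an assumption."""
    return os.environ.get(ENV_FLAG, "").strip().lower() in {"1", "true", "yes"}


def create_like_condition(table: str, column: str, param: str) -> str:
    """Emit the LIKE predicate in the shape the active path expects."""
    if heavy_indexes_enabled():
        # Matches the index expression ifNull(<col>, '') so the skip index can apply.
        return f"ifNull({table}.{column}, '') LIKE {param}"
    # Pre-bloom shape: raw column. The outer OR-IS-NULL is added by the caller
    # and is not modeled in this sketch.
    return f"{table}.{column} LIKE {param}"


def build_filter_candidate_ids_cte_sql(strict_sql: Optional[str]) -> Optional[str]:
    """Return an illustrative candidate CTE, or None when the flag is off."""
    if not heavy_indexes_enabled() or strict_sql is None:
        return None
    # Illustrative CTE text only; the real CTE is built elsewhere.
    return f"candidate_ids AS (SELECT id FROM calls_merged WHERE {strict_sql})"
```

Routing both call sites through one helper means a single environment variable decides both the predicate shape and whether the CTE exists, so the two can't drift apart.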

Performance

Cold-from-disk numbers vs no-index baseline, measured on a realistic 2M-row LLM-shaped corpus (18.9 GiB compressed attributes_dump, 8 GiB cgroup so the column does NOT fit in OS page cache). Bench harness at core/.../scripts/skip_index_bench/ (genv3 + verify_winners_v3); REPORT in the same dir.

| Cell | Index size | NEEDLE | RARE | DENSE | cheap |
| --- | --- | --- | --- | --- | --- |
| baseline | - | 17.66s / 46.7 GB | 16.32s / 46.7 GB | 16.09s / 46.7 GB | 19ms |
| `ngrambf_v1(5, 65536, 3, 0) GRANULARITY 8` (shipped here) | 39 MiB | 2.91s / 7.5 GB (6.0x) | 7.92s / 23.7 GB (2.1x) | 14.42s / 45.8 GB (1.12x) | 30ms |
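
A toy model may help explain why the speedup shrinks from NEEDLE to DENSE. In `ngrambf_v1(5, 65536, 3, 0)` the parameters are the n-gram size, the bloom filter size in bytes, the number of hash functions, and the seed. With GRANULARITY 8, one filter covers eight granules. The sketch below is not ClickHouse's implementation: the SHA-256 hashing, the set-of-bits representation, and the "block" abstraction are simplifications assumed here.

```python
import hashlib

N, BLOOM_BYTES, NUM_HASHES, SEED = 5, 65536, 3, 0
BLOOM_BITS = BLOOM_BYTES * 8


def ngrams(text: str, n: int = N) -> set[str]:
    """All length-n substrings; empty when the text is shorter than n."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def bit_positions(gram: str) -> list[int]:
    """NUM_HASHES bit positions for one n-gram (hash choice is an assumption)."""
    positions = []
    for k in range(NUM_HASHES):
        digest = hashlib.sha256(f"{SEED}:{k}:{gram}".encode()).digest()
        positions.append(int.from_bytes(digest[:8], "big") % BLOOM_BITS)
    return positions


def build_bloom(values: list[str]) -> set[int]:
    """Bloom filter for one indexed block, modeled as the set of set bits."""
    bits: set[int] = set()
    for value in values:
        for gram in ngrams(value):
            bits.update(bit_positions(gram))
    return bits


def may_contain(bloom: set[int], needle: str) -> bool:
    """False only when some n-gram of the needle is definitely absent."""
    grams = ngrams(needle)
    if not grams:
        return True  # needle shorter than N: the filter cannot rule anything out
    return all(p in bloom for g in grams for p in bit_positions(g))


def blocks_to_read(blocks: list[list[str]], needle: str) -> list[int]:
    """Indices of blocks the filter cannot skip for this needle."""
    return [i for i, b in enumerate(blocks) if may_contain(build_bloom(b), needle)]
```

In this model a block that really contains the needle is never skipped. A needle present in nearly every block leaves almost nothing to skip, and a needle shorter than five characters forces every block to be read. If the shape names in the table refer to match density, as they appear to, this is in line with NEEDLE dropping to 7.5 GB read while DENSE still reads 45.8 GB.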

Projection to a representative prod shard (~2 TiB compressed attributes_dump): ~4 GiB index per column, ~17 GiB across all four heavy dumps. Default mark_cache_size 5 GiB; tune up if needed.
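
The projection can be reproduced with back-of-the-envelope arithmetic. The sketch assumes the index scales linearly with compressed column size and that all four heavy columns are about the same size as attributes_dump. Neither assumption is stated in the bench report.

```python
MIB_PER_GIB = 1024
GIB_PER_TIB = 1024

bench_index_mib = 39      # index size measured on the bench corpus
bench_column_gib = 18.9   # compressed attributes_dump on the bench corpus
prod_column_gib = 2 * GIB_PER_TIB
heavy_columns = 4


def project_index_gib(column_gib: float) -> float:
    """Scale the bench index-to-column ratio linearly to another column size."""
    ratio = (bench_index_mib / MIB_PER_GIB) / bench_column_gib
    return ratio * column_gib


per_column_gib = project_index_gib(prod_column_gib)
total_gib = per_column_gib * heavy_columns
print(f"per column: {per_column_gib:.1f} GiB, all four: {total_gib:.1f} GiB")
```

This gives roughly 4.1 GiB per column and 16.5 GiB across four columns, matching the rounded figures above.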

Testing

- tests/trace_server/query_builder/conftest.py defaults WF_CALLS_MERGED_HEAVY_INDEXES=true for query-builder tests so the SQL assertions pin the post-flag shape.
- New test_calls_query_with_like_optimization_env_off exercises the off-path: no candidate CTE, raw <col> LIKE ?, OR-IS-NULL retained.
- `nox --no-install -e "tests-3.12(shard='trace_server')" -- tests/trace_server/query_builder/` passes locally (442 passed).
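
The default-on, override-off pattern can be sketched with the standard library alone. The real suite uses a conftest.py default; the context manager `heavy_indexes_flag` and the parsing in `flag_is_on` below are hypothetical stand-ins, not code from the repository.

```python
import os
from contextlib import contextmanager
from unittest import mock

FLAG = "WF_CALLS_MERGED_HEAVY_INDEXES"


@contextmanager
def heavy_indexes_flag(value):
    """Set the flag to `value`, or unset it when None; restore os.environ on exit."""
    with mock.patch.dict(os.environ):
        if value is None:
            os.environ.pop(FLAG, None)
        else:
            os.environ[FLAG] = value
        yield


def flag_is_on() -> bool:
    # Assumed parsing rule, for illustration only.
    return os.environ.get(FLAG, "").lower() == "true"


# Suite-wide default: flag on, so SQL assertions see the post-flag shape.
with heavy_indexes_flag("true"):
    assert flag_is_on()
    # A single test opting into the off-path.
    with heavy_indexes_flag(None):
        assert not flag_is_on()
    # The default is restored once the override exits.
    assert flag_is_on()
```

Because `mock.patch.dict` restores the mapping on exit, an off-path test cannot leak its setting into later tests.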

Rollout

1. Land + run migration 031 (no behavior change, indexes just exist).
2. Wait for natural-merge backfill OR run MATERIALIZE INDEX idx_<col>_ngram per column on hot ranges.
3. Set WF_CALLS_MERGED_HEAVY_INDEXES=true per cluster. EXPLAIN INDEXES on a known-bad customer query to confirm Granules: X/Y with X << Y on the candidate-CTE inner.
4. Watch p99 on heavy-LIKE fingerprints (expect ~6x drop on NEEDLE shape) and on cheap-key fingerprints (expect +4ms warm p50, well under the 50ms kill criterion).
5. If anything regresses, flip the flag off: query SQL reverts immediately, and no rollback of the migration is needed.
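
The materialize and verify steps could be scripted along these lines. The helper names are hypothetical, and the `_dump` column names for inputs/output/summary are inferred from the `idx_<col>_ngram` pattern and `idx_attributes_dump_ngram`. The optional partition argument assumes the table is partitioned, which the description does not confirm.

```python
import re
from typing import Optional

TABLE = "calls_merged"
# Column names assumed from the inputs/output/summary shorthand.
HEAVY_COLUMNS = ["attributes_dump", "inputs_dump", "output_dump", "summary_dump"]


def index_name(column: str) -> str:
    return f"idx_{column}_ngram"


def materialize_statements(partition: Optional[str] = None) -> list[str]:
    """One ALTER TABLE ... MATERIALIZE INDEX statement per heavy column."""
    statements = []
    for column in HEAVY_COLUMNS:
        sql = f"ALTER TABLE {TABLE} MATERIALIZE INDEX {index_name(column)}"
        if partition is not None:
            sql += f" IN PARTITION {partition}"
        statements.append(sql)
    return statements


def explain_indexes(select_sql: str) -> str:
    """Wrap a SELECT so ClickHouse reports index usage per step."""
    return f"EXPLAIN indexes = 1 {select_sql}"


def granule_ratios(explain_text: str) -> list[tuple[int, int]]:
    """Extract every (selected, total) pair from 'Granules: X/Y' lines."""
    return [(int(x), int(y)) for x, y in re.findall(r"Granules:\s*(\d+)/(\d+)", explain_text)]
```

A pair such as (12, 4096) on the skip-index step would indicate the filter is pruning. A pair where both numbers are equal suggests the index is not yet materialized on those parts or the predicate does not match the index expression.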

Spec: Desktop/specs/attributes-dump-ngram-index.md §1.3 (Phase-1 realistic-corpus bench), §4.5 (Unified candidate-CTE composition with #6764).

…ED_HEAVY_INDEXES

Adds attributes_dump to migration 031 (so all four heavy dumps have the ngrambf_v1(5, 65536, 3, 0) GRANULARITY 8 skip index) and gates the bloom-eligible candidate-CTE wiring on a new env flag so the new path can roll out per-cluster without changing query SQL until the operator opts in.

The migration is idempotent and safe to apply ahead of the flag; only the query builder behavior is flag-controlled.

With the flag off the LIKE optimization emits `<col> LIKE ?` (matching pre-bloom behavior) and `_build_filter_candidate_ids_cte_sql` returns None. With the flag on the LIKE emits `ifNull(<col>, '') LIKE ?` so the index expression matches and the candidate CTE pre-narrows via the bloom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-05-19T21:23:56Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

…py; gate strict-form computation

Two follow-ups from review of #6891:

1. The autouse fixture defaulting WF_CALLS_MERGED_HEAVY_INDEXES=true moves from tests/trace_server/query_builder/conftest.py to tests/trace_server/conftest.py. The narrower scope missed test_filter_candidate_cte_consistency_with_unmerged_parts in test_calls_complete.py, which would have silently fallen into the env-off (no-CTE) path. With the fixture hoisted, every trace_server test exercises the bloom-eligible candidate-CTE path uniformly.
2. process_query_to_optimization_sql now gates the strict-form computation on the same env flag as _build_filter_candidate_ids_cte_sql. When the flag is off the CTE builder returns None before consuming strict_result_sql, so computing it was wasted work. The two gates sit in different files; keeping them on the same flag means future readers reach the right conclusion without tracing the data flow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

wandbot-3000 · 2026-05-19T21:35:29Z

Preview this PR with FeatureBee: https://beta.wandb.ai/?betaVersion=95e5ec239a20863bc5261274b41a56b9aa4e9e0d

chore: collapse multi-line inline comment in optimization_builder

272a73d

Per review feedback. Inline comments stay <= 2 lines; rationale belongs in the PR description. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gtarpenning wants to merge 3 commits into gtarpenning/inputs-ngram-ifnull from gtarpenning/calls-merged-heavy-indexes-env-gated
