
perf: optimize rebase overhead 1614ms → 883ms (1.8x faster)#854

Merged
svarlamov merged 18 commits into main from perf/rebase-reconstruction-opt
Mar 30, 2026

Conversation

@svarlamov (Member)

@svarlamov svarlamov commented Mar 29, 2026

Summary

Optimizes the rebase authorship rewrite with two major changes:

  1. Hunk-based attribution reconstruction — replaces expensive content-based matching (cat-file per commit) with replay of diff-tree hunks to shift line attributions forward through commit history
  2. Combined diff-tree subprocess — merges the two separate git diff-tree calls (new commits + original commits) into a single --stdin invocation, eliminating ~500ms of process startup overhead

Key optimizations

  • Hunk-based attribution reconstruction (attribution_reconstruction: 906ms → 6ms, 150x):
    Instead of reading file contents at every original commit via cat-file and doing content-based matching, replay diff-tree hunks forward to shift accumulated attributions. Each commit's note only covers lines changed in that commit, so we overlay note ranges onto the running attribution state.

  • Combined diff-tree call (diff_tree: 594ms + 587ms → single call):
    Both new-commit and original-commit diff-tree calls use identical flags (--stdin --raw -p -U0). Concatenate both commit lists into one stdin pipe, partition results afterward.

  • Per-commit metrics recomputation: Fixed bug where prompt_line_metrics were frozen before the commit loop, causing all rebased commits to share identical accepted_lines/overridden_lines values. Now recomputed per commit.

  • Delete-then-recreate bug fixes: Clear seen_files, files_with_synced_state, current_file_contents, and current_attributions when a file is deleted, preventing stale data from corrupting attributions on re-creation.
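As a rough illustration of the hunk-based reconstruction, the sketch below shifts a file's accumulated line attributions through one commit's `-U0` hunks in O(attrs + hunks). The names (`Hunk`, `shift_attributions`) are hypothetical, not the actual `rebase_authorship.rs` API:

```rust
// Hypothetical sketch of hunk-based attribution transfer: instead of
// re-diffing file contents, shift each attributed line number through
// the commit's `-U0` hunk headers.

#[derive(Clone, Copy)]
pub struct Hunk {
    pub old_start: u32, // 1-based start of the removed range
    pub old_len: u32,   // lines removed (0 for a pure insertion)
    pub new_len: u32,   // lines added
}

/// Shift (line, author) attributions through one commit's hunks,
/// assuming both slices are sorted by line number. Attributions on
/// lines rewritten by this commit are dropped; lines after a hunk
/// move by the running (added - removed) delta. O(attrs + hunks).
pub fn shift_attributions(attrs: &[(u32, String)], hunks: &[Hunk]) -> Vec<(u32, String)> {
    // For a pure insertion the hunk header points at the line *before*
    // the insertion, so the affected range effectively starts one later.
    let eff_start = |hk: &Hunk| if hk.old_len == 0 { hk.old_start + 1 } else { hk.old_start };
    let mut out = Vec::with_capacity(attrs.len());
    let mut delta: i64 = 0;
    let mut h = 0;
    for (line, author) in attrs {
        // Consume hunks that end strictly before this line.
        while h < hunks.len() && eff_start(&hunks[h]) + hunks[h].old_len <= *line {
            delta += hunks[h].new_len as i64 - hunks[h].old_len as i64;
            h += 1;
        }
        // Drop attributions for lines rewritten by this commit.
        if h < hunks.len() {
            let s = eff_start(&hunks[h]);
            if *line >= s && *line < s + hunks[h].old_len {
                continue;
            }
        }
        out.push(((*line as i64 + delta) as u32, author.clone()));
    }
    out
}

fn main() {
    // Replace line 2 with three new lines: attribution on line 5 shifts by +2.
    let attrs = vec![(1, "ai".to_string()), (5, "ai".to_string())];
    let hunks = [Hunk { old_start: 2, old_len: 1, new_len: 3 }];
    println!("{:?}", shift_attributions(&attrs, &hunks));
}
```

Each commit's note then only needs to overlay its own changed ranges onto this shifted state, which is why the phase drops from per-commit content matching to simple arithmetic.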

Benchmark results (realistic monorepo benchmark: 2000 files, 8 AI-tracked, 30 feature commits rebased onto 500 main commits)

| Metric | Standard git rebase | Graphite-style (commit-tree + update-ref) |
| --- | --- | --- |
| Total wall clock | 0.933s | 1.445s |
| git-ai internal | 165ms | 235ms |
| Per feature commit | 31.1ms | 48.1ms |
| Notes rewritten | 531→561 (30 new) | 531→561 (30 new) |

Phase timing comparison

| Phase | Standard | Graphite | Delta |
| --- | --- | --- | --- |
| load_rebase_note_cache | 10ms | 68ms | +58ms (history walk) |
| diff_tree_combined | 34ms | 24ms | -10ms |
| attribution_reconstruction | 4ms | 4ms | |
| commit_processing_loop | 51ms | 51ms | |
| notes_add_batch | 51ms | 77ms | +26ms |
| TOTAL | 165ms | 235ms | +70ms (+42%) |

The Graphite path's overhead is entirely in load_rebase_note_cache (history walking to reconstruct commit mappings — standard rebase gets these for free from git's rewritten-list file). The core processing loop is identical.

Test results

  • 3064+ integration tests: all pass
  • New test_graphite_style_multi_commit_single_update_ref — verifies N notes remapped from single update-ref
  • New test_rebase_file_delete_recreate_after_hunk_modification — exercises create→hunk-modify→delete→recreate sequence
  • Graphite-style benchmark matching actual gt sync behavior (commit-tree replay + atomic update-ref)

🤖 Generated with Claude Code

@svarlamov changed the base branch from hunk-opt to main on March 29, 2026 03:50
devin-ai-integration[bot]

This comment was marked as resolved.


@svarlamov force-pushed the perf/rebase-reconstruction-opt branch from 2b2ff56 to fc6d3f8 on March 29, 2026 04:29
@svarlamov changed the title from "perf: optimize attribution_reconstruction 821ms → 6ms (137x)" to "perf: optimize attribution_reconstruction 821ms → 192ms (4.3x)" on Mar 29, 2026

@svarlamov changed the title from "perf: optimize attribution_reconstruction 821ms → 192ms (4.3x)" to "perf: optimize rebase overhead 1614ms → 883ms (1.8x faster)" on Mar 29, 2026

@devin-ai-integration[bot] left a comment


Devin Review found 1 new potential issue.

View 11 additional findings in Devin Review.


Comment on lines +1450 to +1454

```rust
let tclone = std::time::Instant::now();
current_attributions.insert(file_path.clone(), (Vec::new(), line_attrs));
current_file_contents.insert(file_path.clone(), new_content.clone());
if !use_hunk_based && let Some(content) = new_content {
    current_file_contents.insert(file_path.clone(), content.clone());
}
```

@devin-ai-integration[bot] commented Mar 29, 2026


🔴 Stale current_file_contents after hunk-based transfer causes mismatched attribution on delete-then-recreate

When the hunk-based fast path is used, current_attributions is updated (line numbers shifted) but current_file_contents is NOT (src/authorship/rebase_authorship.rs:1452 only updates it on the non-hunk path). If the file is subsequently deleted and recreated, files_with_synced_state is cleared (src/authorship/rebase_authorship.rs:1391), forcing the slow content-diff path on recreation. That slow path (src/authorship/rebase_authorship.rs:1429-1435) passes current_file_contents.get(file_path), which still holds content from before the hunk transfers, together with current_attributions.get(file_path), whose line numbers were already adjusted by those transfers, into compute_line_attrs_for_changed_file (the diff-based line-attribution transfer). That function builds old_line_author by indexing into the stale content with the shifted line numbers, producing corrupted attribution mappings.

Trigger sequence (4+ commits in a rebase)
  1. File A first appears in commit N → content-diff (slow path) → current_file_contents["A"] = v1, current_attributions["A"] = attrs for v1
  2. File A modified in commit N+1 → hunk-based transfer → current_attributions["A"] adjusted to v2 line numbers, but current_file_contents["A"] still = v1
  3. File A deleted in commit N+2 → files_with_synced_state.remove("A"), but attrs and contents kept
  4. File A recreated in commit N+3 → slow path uses stale v1 content with v2-shifted attrs → incorrect diff produces wrong attribution


@svarlamov (Member, Author)


Fixed in commit b2a5d4c. The deletion handler now clears both current_file_contents and current_attributions for the deleted file, ensuring recreation uses a clean content-diff instead of stale data. Added regression test test_rebase_file_delete_recreate_after_hunk_modification that exercises the exact 4-commit trigger sequence (create → hunk-modify → delete → recreate).
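A minimal sketch of that deletion handler, with struct and field names simplified from the real module (this is illustrative, not the code from commit b2a5d4c):

```rust
// Sketch of the delete-then-recreate fix described above: on deletion,
// drop all cached per-file state so a later re-creation takes the
// clean content-diff path instead of reusing stale data.
use std::collections::{HashMap, HashSet};

#[derive(Default)]
pub struct RebaseState {
    pub seen_files: HashSet<String>,
    pub files_with_synced_state: HashSet<String>,
    pub current_file_contents: HashMap<String, String>,
    pub current_attributions: HashMap<String, Vec<(u32, String)>>,
}

impl RebaseState {
    pub fn on_file_deleted(&mut self, path: &str) {
        // Clearing every map keyed by this path guarantees a recreation
        // cannot pair stale contents with hunk-shifted attributions.
        self.seen_files.remove(path);
        self.files_with_synced_state.remove(path);
        self.current_file_contents.remove(path);
        self.current_attributions.remove(path);
    }
}

fn main() {
    let mut st = RebaseState::default();
    st.seen_files.insert("a.rs".into());
    st.current_file_contents.insert("a.rs".into(), "v1".into());
    st.on_file_deleted("a.rs");
    assert!(st.current_file_contents.get("a.rs").is_none());
}
```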


svarlamov and others added 16 commits March 29, 2026 22:44
Replace per-file content diffing (imara_diff Myers) with hunk-based
attribution transfer using git diff-tree -p -U0 hunk headers. For
commits after first appearance, adjusts attribution line numbers via
O(attrs+hunks) arithmetic instead of O(file_length) Myers diff.

Key changes:
- Combined diff-tree call extracts both metadata and hunks in one subprocess
- Only read blob contents for first-seen files (99% reduction in blob I/O)
- Hunk-based transfer for all subsequent commits (no diff algorithm needed)

Benchmark (100 commits x 30 files x 200 lines):
- Authorship overhead: 1450ms -> 413ms (3.5x faster)
- Commit processing loop: 1247ms -> 28ms (44x faster)
- Blob reads: 3000 -> 30 (99% reduction)
- 3056 integration tests pass, 0 regressions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ed var)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ching

The attribution_reconstruction phase was the dominant bottleneck (51% of
total authorship overhead) in realistic rebases. Root cause: reading
file contents for ALL pathspecs × ALL commits via cat-file, even when
most commits' notes only reference a few files.

Key changes:
- Selective cat-file batch: only read files referenced in each commit's
  note attestations, not all pathspecs for every commit (~12x fewer
  reads in the realistic benchmark)
- HEAD note uses already-loaded HEAD file contents (no extra git call)
- Always perform full content-based scan across ALL commits' notes
  (no HEAD-only shortcut — each note only covers its commit's changes)
- Avoid unnecessary subtract+add metrics cycle for deleted/skipped files
- Use literal "human" comparison instead of allocating via to_str()

Realistic benchmark (150 commits × 50 files, 10-2000+ lines):
- attribution_reconstruction: 821ms → 192ms (4.3x faster)
- Total overhead: 1617ms → 1008ms (1.6x faster)

Adds 3 regression tests proving non-HEAD attribution survives rebase:
- test_rebase_preserves_attribution_from_non_head_commits
- test_rebase_preserves_multi_commit_attribution_same_file
- test_rebase_non_head_attribution_survives_slow_path

All 79 rebase + 21 cherry-pick tests pass with zero regressions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…arser

Addresses Devin review findings:

1. Rename/copy entries (R/C status) in --raw format without -z produce
   "old_path\tnew_path" — use the destination path for attribution tracking.

2. First-appearance blob optimization now only marks files as "seen" when
   they have a non-None blob OID. Deletions no longer prevent later
   re-creations from having their blob content read.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The metadata template is frozen before the commit loop, so the
per-commit subtract→add→apply_to_prompts cycle was computing
metrics that never appeared in the output. Remove the dead
computation to save ~120ms in the commit processing loop.

Also fix timing instrumentation: loop:serialize and loop:metrics
were using as_millis() per-iteration (always 0), now use
as_micros() for accurate sub-millisecond accumulation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
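The as_millis() truncation described in that commit is easy to reproduce in a standalone sketch (this is not the project's actual instrumentation):

```rust
// Sketch of the timing-instrumentation fix: calling as_millis() on each
// iteration's Duration floors sub-millisecond work to 0, so the
// accumulated total under-reports fast loop bodies. Summing the
// Durations themselves (or as_micros()) keeps the real cost.
use std::time::{Duration, Instant};

fn main() {
    let mut accurate = Duration::ZERO;
    let mut truncated_ms: u128 = 0;
    for _ in 0..10_000 {
        let t = Instant::now();
        std::hint::black_box((0..200u64).sum::<u64>()); // sub-millisecond body
        let elapsed = t.elapsed();
        accurate += elapsed;                 // sums whole Durations
        truncated_ms += elapsed.as_millis(); // floors to 0 almost every time
    }
    println!(
        "accurate: {}µs, per-iteration as_millis() total: {}ms",
        accurate.as_micros(),
        truncated_ms
    );
    // The per-iteration-truncated total can never exceed the accurate one.
    assert!(accurate.as_millis() >= truncated_ms);
}
```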
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The shutil.rmtree call on the benchmark repo directory intermittently
fails with "Directory not empty: 'objects'" on CI runners. This is a
cleanup step after all benchmarks have already completed successfully,
so using ignore_errors=True prevents spurious CI failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previous run had Windows jobs stuck in pending for hours due to
GitHub Actions runner availability issues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs found by Devin review:

1. first_appearance_blobs: seen_files was not cleared on file deletion,
   so when a file was deleted and recreated with different content, the
   new blob OID was never read (seen_files.insert returned false).

2. files_with_synced_state: was not cleared on file deletion, so
   recreation incorrectly used the hunk-based fast path with stale
   pre-deletion attributions instead of content-diff.

Adds regression test for delete-then-recreate with different content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of spawning separate git diff-tree processes for new commits and
original commits, concatenate both commit lists into a single --stdin
call and partition the results afterward. This eliminates ~500ms of
subprocess overhead (git process startup + repo loading) on large rebases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
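The partition step after that single --stdin call might look like the sketch below. The helper names are hypothetical, and it assumes each commit's section in the combined output begins with the 40-hex commit id on a line of its own:

```rust
// Hypothetical sketch of partitioning combined `git diff-tree --stdin`
// output: both commit lists go down one pipe, and the merged output is
// routed back to the "new" or "original" bucket it was requested for.
// Assumes each commit's section starts with its 40-hex id on its own line.
use std::collections::HashSet;

fn is_commit_header(line: &str) -> bool {
    line.len() == 40 && line.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Split the combined output into (new_sections, original_sections).
pub fn partition_sections(
    output: &str,
    new_commits: &HashSet<String>,
) -> (Vec<String>, Vec<String>) {
    let (mut new_out, mut orig_out) = (Vec::new(), Vec::new());
    let mut current: Option<(bool, String)> = None;
    let flush = |cur: Option<(bool, String)>, n: &mut Vec<String>, o: &mut Vec<String>| {
        if let Some((is_new, section)) = cur {
            if is_new { n.push(section) } else { o.push(section) }
        }
    };
    for line in output.lines() {
        if is_commit_header(line) {
            // A new commit section begins: emit the previous one.
            flush(current.take(), &mut new_out, &mut orig_out);
            current = Some((new_commits.contains(line), line.to_string()));
        } else if let Some((_, section)) = current.as_mut() {
            section.push('\n');
            section.push_str(line);
        }
    }
    flush(current, &mut new_out, &mut orig_out);
    (new_out, orig_out)
}

fn main() {
    let a = "a".repeat(40);
    let b = "b".repeat(40);
    let output = format!("{a}\nM\tsrc/lib.rs\n{b}\nM\tsrc/other.rs");
    let new: HashSet<String> = [a.clone()].into();
    let (new_secs, orig_secs) = partition_sections(&output, &new);
    assert_eq!((new_secs.len(), orig_secs.len()), (1, 1));
}
```

Because both invocations already used identical flags, the only new work is this routing pass, while one full `git` process startup and repository load are eliminated.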
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds benchmark_monorepo_rebase modeling a realistic scenario: 30 feature
commits rebased onto 500 main commits in a 2000-file monorepo, all 100%
AI-authored (worst case). 25% of main commits touch the same AI files.

Supports MONO_BENCH_CACHE_DIR env var to cache the test repo setup
across runs (~2min setup saved to disk, restored in <1s).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@svarlamov force-pushed the perf/rebase-reconstruction-opt branch from 0f72020 to fb6091f on March 29, 2026 22:45
svarlamov and others added 2 commits March 29, 2026 23:42
The metadata template was frozen before the commit processing loop,
so every rebased commit got the same prompt metrics (accepted_lines,
overridden_lines) from the initial state. Now metrics are recomputed
after each commit's attribution changes (hunk transfer or content diff).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…create

- Fix stale current_file_contents after hunk-based transfer: clear cached
  file contents and attributions on deletion so recreation uses clean
  content-diff instead of stale data (addresses Devin review feedback)
- Add Graphite-style rebase benchmark matching actual gt sync behavior:
  all commits replayed via commit-tree, then ONE atomic update-ref
- Add test_graphite_style_multi_commit_single_update_ref verifying N notes
  are remapped from a single update-ref (commit_tree_update_ref.rs)
- Add test_rebase_file_delete_recreate_after_hunk_modification exercising
  the create→hunk-modify→delete→recreate sequence (rebase.rs)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@svarlamov merged commit 1f24c91 into main on Mar 30, 2026
21 checks passed
@svarlamov deleted the perf/rebase-reconstruction-opt branch on March 30, 2026 03:08