
perf: optimize rebase overhead 1614ms → 883ms (1.8x faster)#854

Merged
svarlamov merged 18 commits into main from perf/rebase-reconstruction-opt
Mar 30, 2026

Conversation

@svarlamov (Member)

@svarlamov svarlamov commented Mar 29, 2026

Summary

Optimizes the rebase authorship rewrite with two major changes:

  1. Hunk-based attribution reconstruction — replaces expensive content-based matching (cat-file per commit) with replay of diff-tree hunks to shift line attributions forward through commit history
  2. Combined diff-tree subprocess — merges the two separate git diff-tree calls (new commits + original commits) into a single --stdin invocation, eliminating ~500ms of process startup overhead

Key optimizations

  • Hunk-based attribution reconstruction (attribution_reconstruction: 906ms → 6ms, 150x):
    Instead of reading file contents at every original commit via cat-file and doing content-based matching, replay diff-tree hunks forward to shift accumulated attributions. Each commit's note only covers lines changed in that commit, so we overlay note ranges onto the running attribution state.

  • Combined diff-tree call (diff_tree: 594ms + 587ms → single call):
    Both new-commit and original-commit diff-tree calls use identical flags (--stdin --raw -p -U0). Concatenate both commit lists into one stdin pipe, partition results afterward.

  • Per-commit metrics recomputation: Fixed bug where prompt_line_metrics were frozen before the commit loop, causing all rebased commits to share identical accepted_lines/overridden_lines values. Now recomputed per commit.

  • Delete-then-recreate bug fixes: Clear seen_files, files_with_synced_state, current_file_contents, and current_attributions when a file is deleted, preventing stale data from corrupting attributions on re-creation.
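As a rough illustration of the hunk-based reconstruction, the sketch below shifts a file's accumulated line attributions through one commit's `-U0` hunks in O(attrs + hunks). The names (`Hunk`, `shift_attributions`) are hypothetical, not the actual `rebase_authorship.rs` API:

```rust
// Hypothetical sketch of hunk-based attribution transfer: instead of
// re-diffing file contents, shift each attributed line number through
// the commit's `-U0` hunk headers.

#[derive(Clone, Copy)]
pub struct Hunk {
    pub old_start: u32, // 1-based start of the removed range
    pub old_len: u32,   // lines removed (0 for a pure insertion)
    pub new_len: u32,   // lines added
}

/// Shift (line, author) attributions through one commit's hunks,
/// assuming both slices are sorted by line number. Attributions on
/// lines rewritten by this commit are dropped; lines after a hunk
/// move by the running (added - removed) delta. O(attrs + hunks).
pub fn shift_attributions(attrs: &[(u32, String)], hunks: &[Hunk]) -> Vec<(u32, String)> {
    // For a pure insertion the hunk header points at the line *before*
    // the insertion, so the affected range effectively starts one later.
    let eff_start = |hk: &Hunk| if hk.old_len == 0 { hk.old_start + 1 } else { hk.old_start };
    let mut out = Vec::with_capacity(attrs.len());
    let mut delta: i64 = 0;
    let mut h = 0;
    for (line, author) in attrs {
        // Consume hunks that end strictly before this line.
        while h < hunks.len() && eff_start(&hunks[h]) + hunks[h].old_len <= *line {
            delta += hunks[h].new_len as i64 - hunks[h].old_len as i64;
            h += 1;
        }
        // Drop attributions for lines rewritten by this commit.
        if h < hunks.len() {
            let s = eff_start(&hunks[h]);
            if *line >= s && *line < s + hunks[h].old_len {
                continue;
            }
        }
        out.push(((*line as i64 + delta) as u32, author.clone()));
    }
    out
}

fn main() {
    // Replace line 2 with three new lines: attribution on line 5 shifts by +2.
    let attrs = vec![(1, "ai".to_string()), (5, "ai".to_string())];
    let hunks = [Hunk { old_start: 2, old_len: 1, new_len: 3 }];
    println!("{:?}", shift_attributions(&attrs, &hunks));
}
```

Each commit's note then only needs to overlay its own changed ranges onto this shifted state, which is why the phase drops from per-commit content matching to simple arithmetic.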

Benchmark results (realistic monorepo benchmark: 2000 files, 8 AI-tracked, 30 feature commits rebased onto 500 main commits)

| Metric | Standard git rebase | Graphite-style (commit-tree + update-ref) |
| --- | --- | --- |
| Total wall clock | 0.933s | 1.445s |
| git-ai internal | 165ms | 235ms |
| Per feature commit | 31.1ms | 48.1ms |
| Notes rewritten | 531→561 (30 new) | 531→561 (30 new) |

Phase timing comparison

| Phase | Standard | Graphite | Delta |
| --- | --- | --- | --- |
| load_rebase_note_cache | 10ms | 68ms | +58ms (history walk) |
| diff_tree_combined | 34ms | 24ms | -10ms |
| attribution_reconstruction | 4ms | 4ms | |
| commit_processing_loop | 51ms | 51ms | |
| notes_add_batch | 51ms | 77ms | +26ms |
| TOTAL | 165ms | 235ms | +70ms (+42%) |

The Graphite path's overhead is entirely in load_rebase_note_cache (history walking to reconstruct commit mappings — standard rebase gets these for free from git's rewritten-list file). The core processing loop is identical.

Test results

  • 3064+ integration tests: all pass
  • New test_graphite_style_multi_commit_single_update_ref — verifies N notes remapped from single update-ref
  • New test_rebase_file_delete_recreate_after_hunk_modification — exercises create→hunk-modify→delete→recreate sequence
  • Graphite-style benchmark matching actual gt sync behavior (commit-tree replay + atomic update-ref)

🤖 Generated with Claude Code

@svarlamov changed the base branch from hunk-opt to main on March 29, 2026 03:50
devin-ai-integration[bot]

This comment was marked as resolved.


@svarlamov force-pushed the perf/rebase-reconstruction-opt branch from 2b2ff56 to fc6d3f8 on March 29, 2026 04:29
@svarlamov changed the title from "perf: optimize attribution_reconstruction 821ms → 6ms (137x)" to "perf: optimize attribution_reconstruction 821ms → 192ms (4.3x)" on Mar 29, 2026

@svarlamov changed the title from "perf: optimize attribution_reconstruction 821ms → 192ms (4.3x)" to "perf: optimize rebase overhead 1614ms → 883ms (1.8x faster)" on Mar 29, 2026

@devin-ai-integration[bot] left a comment


Devin Review found 1 new potential issue.

View 11 additional findings in Devin Review.


Comment on lines +1450 to +1454

```rust
let tclone = std::time::Instant::now();
current_attributions.insert(file_path.clone(), (Vec::new(), line_attrs));
current_file_contents.insert(file_path.clone(), new_content.clone());
if !use_hunk_based && let Some(content) = new_content {
    current_file_contents.insert(file_path.clone(), content.clone());
}
```

@devin-ai-integration[bot] commented Mar 29, 2026


🔴 Stale current_file_contents after hunk-based transfer causes mismatched attribution on delete-then-recreate

When the hunk-based fast path is used, current_attributions is updated (line numbers shifted) but current_file_contents is NOT (src/authorship/rebase_authorship.rs:1452 only updates it on the non-hunk path). If the file is subsequently deleted and recreated, files_with_synced_state is cleared (src/authorship/rebase_authorship.rs:1391), forcing the slow content-diff path on recreation. That slow path (src/authorship/rebase_authorship.rs:1429-1435) passes current_file_contents.get(file_path), which still holds content from before the hunk transfers, together with current_attributions.get(file_path), whose line numbers were already adjusted by those transfers, into compute_line_attrs_for_changed_file (the diff-based line-attribution transfer). That function builds old_line_author by indexing into the stale content with the shifted line numbers, producing corrupted attribution mappings.

Trigger sequence (4+ commits in a rebase)
  1. File A first appears in commit N → content-diff (slow path) → current_file_contents["A"] = v1, current_attributions["A"] = attrs for v1
  2. File A modified in commit N+1 → hunk-based transfer → current_attributions["A"] adjusted to v2 line numbers, but current_file_contents["A"] still = v1
  3. File A deleted in commit N+2 → files_with_synced_state.remove("A"), but attrs and contents kept
  4. File A recreated in commit N+3 → slow path uses stale v1 content with v2-shifted attrs → incorrect diff produces wrong attribution


@svarlamov (Member, Author)


Fixed in commit b2a5d4c. The deletion handler now clears both current_file_contents and current_attributions for the deleted file, ensuring recreation uses a clean content-diff instead of stale data. Added regression test test_rebase_file_delete_recreate_after_hunk_modification that exercises the exact 4-commit trigger sequence (create → hunk-modify → delete → recreate).
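A minimal sketch of that deletion handler, with struct and field names simplified from the real module (this is illustrative, not the code from commit b2a5d4c):

```rust
// Sketch of the delete-then-recreate fix described above: on deletion,
// drop all cached per-file state so a later re-creation takes the
// clean content-diff path instead of reusing stale data.
use std::collections::{HashMap, HashSet};

#[derive(Default)]
pub struct RebaseState {
    pub seen_files: HashSet<String>,
    pub files_with_synced_state: HashSet<String>,
    pub current_file_contents: HashMap<String, String>,
    pub current_attributions: HashMap<String, Vec<(u32, String)>>,
}

impl RebaseState {
    pub fn on_file_deleted(&mut self, path: &str) {
        // Clearing every map keyed by this path guarantees a recreation
        // cannot pair stale contents with hunk-shifted attributions.
        self.seen_files.remove(path);
        self.files_with_synced_state.remove(path);
        self.current_file_contents.remove(path);
        self.current_attributions.remove(path);
    }
}

fn main() {
    let mut st = RebaseState::default();
    st.seen_files.insert("a.rs".into());
    st.current_file_contents.insert("a.rs".into(), "v1".into());
    st.on_file_deleted("a.rs");
    assert!(st.current_file_contents.get("a.rs").is_none());
}
```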


svarlamov and others added 16 commits March 29, 2026 22:44
Replace per-file content diffing (imara_diff Myers) with hunk-based
attribution transfer using git diff-tree -p -U0 hunk headers. For
commits after first appearance, adjusts attribution line numbers via
O(attrs+hunks) arithmetic instead of O(file_length) Myers diff.

Key changes:
- Combined diff-tree call extracts both metadata and hunks in one subprocess
- Only read blob contents for first-seen files (99% reduction in blob I/O)
- Hunk-based transfer for all subsequent commits (no diff algorithm needed)

Benchmark (100 commits x 30 files x 200 lines):
- Authorship overhead: 1450ms -> 413ms (3.5x faster)
- Commit processing loop: 1247ms -> 28ms (44x faster)
- Blob reads: 3000 -> 30 (99% reduction)
- 3056 integration tests pass, 0 regressions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ed var)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ching

The attribution_reconstruction phase was the dominant bottleneck (51% of
total authorship overhead) in realistic rebases. Root cause: reading
file contents for ALL pathspecs × ALL commits via cat-file, even when
most commits' notes only reference a few files.

Key changes:
- Selective cat-file batch: only read files referenced in each commit's
  note attestations, not all pathspecs for every commit (~12x fewer
  reads in the realistic benchmark)
- HEAD note uses already-loaded HEAD file contents (no extra git call)
- Always perform full content-based scan across ALL commits' notes
  (no HEAD-only shortcut — each note only covers its commit's changes)
- Avoid unnecessary subtract+add metrics cycle for deleted/skipped files
- Use literal "human" comparison instead of allocating via to_str()

Realistic benchmark (150 commits × 50 files, 10-2000+ lines):
- attribution_reconstruction: 821ms → 192ms (4.3x faster)
- Total overhead: 1617ms → 1008ms (1.6x faster)

Adds 3 regression tests proving non-HEAD attribution survives rebase:
- test_rebase_preserves_attribution_from_non_head_commits
- test_rebase_preserves_multi_commit_attribution_same_file
- test_rebase_non_head_attribution_survives_slow_path

All 79 rebase + 21 cherry-pick tests pass with zero regressions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…arser

Addresses Devin review findings:

1. Rename/copy entries (R/C status) in --raw format without -z produce
   "old_path\tnew_path" — use the destination path for attribution tracking.

2. First-appearance blob optimization now only marks files as "seen" when
   they have a non-None blob OID. Deletions no longer prevent later
   re-creations from having their blob content read.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The metadata template is frozen before the commit loop, so the
per-commit subtract→add→apply_to_prompts cycle was computing
metrics that never appeared in the output. Remove the dead
computation to save ~120ms in the commit processing loop.

Also fix timing instrumentation: loop:serialize and loop:metrics
were using as_millis() per-iteration (always 0), now use
as_micros() for accurate sub-millisecond accumulation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
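The as_millis() truncation described in that commit is easy to reproduce in a standalone sketch (this is not the project's actual instrumentation):

```rust
// Sketch of the timing-instrumentation fix: calling as_millis() on each
// iteration's Duration floors sub-millisecond work to 0, so the
// accumulated total under-reports fast loop bodies. Summing the
// Durations themselves (or as_micros()) keeps the real cost.
use std::time::{Duration, Instant};

fn main() {
    let mut accurate = Duration::ZERO;
    let mut truncated_ms: u128 = 0;
    for _ in 0..10_000 {
        let t = Instant::now();
        std::hint::black_box((0..200u64).sum::<u64>()); // sub-millisecond body
        let elapsed = t.elapsed();
        accurate += elapsed;                 // sums whole Durations
        truncated_ms += elapsed.as_millis(); // floors to 0 almost every time
    }
    println!(
        "accurate: {}µs, per-iteration as_millis() total: {}ms",
        accurate.as_micros(),
        truncated_ms
    );
    // The per-iteration-truncated total can never exceed the accurate one.
    assert!(accurate.as_millis() >= truncated_ms);
}
```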
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The shutil.rmtree call on the benchmark repo directory intermittently
fails with "Directory not empty: 'objects'" on CI runners. This is a
cleanup step after all benchmarks have already completed successfully,
so using ignore_errors=True prevents spurious CI failures.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previous run had Windows jobs stuck in pending for hours due to
GitHub Actions runner availability issues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two bugs found by Devin review:

1. first_appearance_blobs: seen_files was not cleared on file deletion,
   so when a file was deleted and recreated with different content, the
   new blob OID was never read (seen_files.insert returned false).

2. files_with_synced_state: was not cleared on file deletion, so
   recreation incorrectly used the hunk-based fast path with stale
   pre-deletion attributions instead of content-diff.

Adds regression test for delete-then-recreate with different content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of spawning separate git diff-tree processes for new commits and
original commits, concatenate both commit lists into a single --stdin
call and partition the results afterward. This eliminates ~500ms of
subprocess overhead (git process startup + repo loading) on large rebases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
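The partition step after that single --stdin call might look like the sketch below. The helper names are hypothetical, and it assumes each commit's section in the combined output begins with the 40-hex commit id on a line of its own:

```rust
// Hypothetical sketch of partitioning combined `git diff-tree --stdin`
// output: both commit lists go down one pipe, and the merged output is
// routed back to the "new" or "original" bucket it was requested for.
// Assumes each commit's section starts with its 40-hex id on its own line.
use std::collections::HashSet;

fn is_commit_header(line: &str) -> bool {
    line.len() == 40 && line.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Split the combined output into (new_sections, original_sections).
pub fn partition_sections(
    output: &str,
    new_commits: &HashSet<String>,
) -> (Vec<String>, Vec<String>) {
    let (mut new_out, mut orig_out) = (Vec::new(), Vec::new());
    let mut current: Option<(bool, String)> = None;
    let flush = |cur: Option<(bool, String)>, n: &mut Vec<String>, o: &mut Vec<String>| {
        if let Some((is_new, section)) = cur {
            if is_new { n.push(section) } else { o.push(section) }
        }
    };
    for line in output.lines() {
        if is_commit_header(line) {
            // A new commit section begins: emit the previous one.
            flush(current.take(), &mut new_out, &mut orig_out);
            current = Some((new_commits.contains(line), line.to_string()));
        } else if let Some((_, section)) = current.as_mut() {
            section.push('\n');
            section.push_str(line);
        }
    }
    flush(current, &mut new_out, &mut orig_out);
    (new_out, orig_out)
}

fn main() {
    let a = "a".repeat(40);
    let b = "b".repeat(40);
    let output = format!("{a}\nM\tsrc/lib.rs\n{b}\nM\tsrc/other.rs");
    let new: HashSet<String> = [a.clone()].into();
    let (new_secs, orig_secs) = partition_sections(&output, &new);
    assert_eq!((new_secs.len(), orig_secs.len()), (1, 1));
}
```

Because both invocations already used identical flags, the only new work is this routing pass, while one full `git` process startup and repository load are eliminated.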
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds benchmark_monorepo_rebase modeling a realistic scenario: 30 feature
commits rebased onto 500 main commits in a 2000-file monorepo, all 100%
AI-authored (worst case). 25% of main commits touch the same AI files.

Supports MONO_BENCH_CACHE_DIR env var to cache the test repo setup
across runs (~2min setup saved to disk, restored in <1s).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@svarlamov force-pushed the perf/rebase-reconstruction-opt branch from 0f72020 to fb6091f on March 29, 2026 22:45
svarlamov and others added 2 commits March 29, 2026 23:42
The metadata template was frozen before the commit processing loop,
so every rebased commit got the same prompt metrics (accepted_lines,
overridden_lines) from the initial state. Now metrics are recomputed
after each commit's attribution changes (hunk transfer or content diff).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…create

- Fix stale current_file_contents after hunk-based transfer: clear cached
  file contents and attributions on deletion so recreation uses clean
  content-diff instead of stale data (addresses Devin review feedback)
- Add Graphite-style rebase benchmark matching actual gt sync behavior:
  all commits replayed via commit-tree, then ONE atomic update-ref
- Add test_graphite_style_multi_commit_single_update_ref verifying N notes
  are remapped from a single update-ref (commit_tree_update_ref.rs)
- Add test_rebase_file_delete_recreate_after_hunk_modification exercising
  the create→hunk-modify→delete→recreate sequence (rebase.rs)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@svarlamov merged commit 1f24c91 into main on Mar 30, 2026
21 checks passed
@svarlamov deleted the perf/rebase-reconstruction-opt branch on March 30, 2026 03:08