Skip to content

test(but-core): demonstrate data loss in safe_checkout_from_head deterministically#13090

Open
krlvi wants to merge 1 commit intomasterfrom
fix/concurrent-commit-data-loss
Open

test(but-core): demonstrate data loss in safe_checkout_from_head deterministically#13090
krlvi wants to merge 1 commit intomasterfrom
fix/concurrent-commit-data-loss

Conversation

@krlvi
Copy link
Copy Markdown
Member

@krlvi krlvi commented Mar 29, 2026

Add a single-threaded unit test that reproduces the data-loss bug without
any concurrency. The key insight is that safe_checkout_from_head reads
repo.head_tree_id_or_empty() at call time. If a concurrent process has
already updated HEAD (H0 → H_A), the next caller computes the diff H_A → H_B
where H_B was derived from the stale H0. That diff marks A's file as DELETED
and the checkout removes it from disk.

The test sets up two workspace commits derived from the same parent H0:
H_A = H0 + auth/session.ts
H_B = H0 + api/pagination.ts (no session.ts — stale derivation)

  1. safe_checkout_from_head(H_A) → HEAD = H_A, session.ts created ✓
  2. safe_checkout_from_head(H_B) → reads HEAD = H_A, diff H_A→H_B marks
    session.ts as DELETED → physically removes it ← DATA LOSS

The test does not propagate errors immediately on the second checkout;
instead it checks whether session.ts still exists on disk and only then
emits the detailed failure message. This ensures the data-loss assertion
fires even if the checkout also returns an error.

The test is #[ignore] because it currently FAILS (demonstrating the bug).
Remove the ignore and flip the assertion once the fix lands.

…rministically

Add a single-threaded unit test that reproduces the data-loss bug without
any concurrency.  The key insight is that safe_checkout_from_head reads
repo.head_tree_id_or_empty() *at call time*.  If a concurrent process has
already updated HEAD (H0 → H_A), the next caller computes the diff H_A → H_B
where H_B was derived from the stale H0.  That diff marks A's file as DELETED
and the checkout removes it from disk.

The test sets up two workspace commits derived from the same parent H0:
  H_A = H0 + auth/session.ts
  H_B = H0 + api/pagination.ts   (no session.ts — stale derivation)

1. safe_checkout_from_head(H_A)  →  HEAD = H_A, session.ts created ✓
2. safe_checkout_from_head(H_B)  →  reads HEAD = H_A, diff H_A→H_B marks
   session.ts as DELETED → physically removes it  ← DATA LOSS

The test does not propagate errors immediately on the second checkout;
instead it checks whether session.ts still exists on disk and only then
emits the detailed failure message.  This ensures the data-loss assertion
fires even if the checkout also returns an error.

The test is #[ignore] because it currently FAILS (demonstrating the bug).
Remove the ignore and flip the assertion once the fix lands.
Copilot AI review requested due to automatic review settings March 29, 2026 02:04
@vercel
Copy link
Copy Markdown

vercel bot commented Mar 29, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
gitbutler-web Ignored Ignored Preview Mar 29, 2026 2:05am

Request Review

@github-actions github-actions bot added the rust Pull requests that update Rust code label Mar 29, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an (ignored) regression test in but-core to deterministically reproduce and document a data-loss scenario in safe_checkout_from_head when two workspace trees are derived from the same stale base HEAD.

Changes:

  • Import safe_checkout_from_head into the existing worktree checkout test suite.
  • Add a new single-threaded #[ignore] unit test that demonstrates how a stale-derived checkout can delete files created by a prior checkout.

Comment on lines +892 to +893
/// deletion). Remove the `#[ignore]` annotation once the bug is fixed and flip the
/// final assertion to `assert!(session_path.exists(), …)`.
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment says to “flip the final assertion to assert!(session_path.exists(), …) once the bug is fixed”, but the test doesn’t currently have a final assertion to flip (it relies on the match arms/panic behavior). Consider rewording this to match the actual intended post-fix behavior (e.g., whether the second checkout must succeed, or only must not delete the file).

Suggested change
/// deletion). Remove the `#[ignore]` annotation once the bug is fixed and flip the
/// final assertion to `assert!(session_path.exists(), …)`.
/// deletion). Once the bug is fixed, remove the `#[ignore]` annotation and update the
/// final expectations to require that the second checkout succeeds and that
/// `session_path` still exists on disk (for example, by asserting `session_path.exists()`).

Copilot uses AI. Check for mistakes.

// Classify what happened and emit the most informative failure message possible.
match (checkout_b_result, session_deleted) {
(_, true) => {
Copy link

Copilot AI Mar 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the data-loss case you match (_, true) and then panic, which discards any error returned by the second safe_checkout_from_head call. Splitting this into (Err(e), true) vs (Ok(_), true) (or including the error in the panic message when present) would make failures much easier to diagnose while still prioritizing the on-disk data-loss assertion.

Suggested change
(_, true) => {
(Err(e), true) => {
// Data was lost and the checkout call also returned an error — include both in the failure.
panic!(
"DATA LOSS BUG: auth/session.ts was deleted from disk by \
safe_checkout_from_head(H_B).\n\n\
Cause: H_B was computed from stale H(before process A committed), so it \
does not contain auth/session.ts. However safe_checkout_from_head reads the \
CURRENT HEAD (= H_A, which does contain auth/session.ts) as the source tree. \
The resulting diff H_AH_B marks auth/session.ts as DELETED and the \
checkout physically removes the file.\n\n\
Additionally, safe_checkout_from_head(H_B) returned an error:\n\
{e:?}\n\n\
Fix: serialise workspace checkouts with inter-process locking \
(try_exclusive_inter_process_access) or capture the current HEAD before the \
workspace is recomputed and pass it explicitly to safe_checkout.",
e
);
}
(Ok(_), true) => {

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@krlvi I took a look and don't think that the PR should not be merged, for two reasons:

  • it doesn't show a race, it shows conflicting instructions. Checkout A then checkout B, and it does as it's told. Safe checkout is safe because it won't nuke unstaged or untracked files, but it's still allowed to set the target tree to match the given commit.
  • safe_checkout is a plumbing function which expects to have a repository lock on the caller side. Once these are inter-process safe, safe-checkout will not easily be interleaved with conflicting calls.

Something that safe_checkout could do on top is to immediately grab a lock on the .git/index to prevent other checkouts, for instance by Git itself, from happening. This is definitely a possible improvement and one that I think should be done.

For example, right now, it would be possible for one but commit to wait for a commit-message, and a Git commit to come in in the meantime that will happily be overwritten once the commit message clears.

Will see to following this up.

@krlvi
Copy link
Copy Markdown
Member Author

krlvi commented Mar 29, 2026

@krlvi I took a look and don't think that the PR should not be merged, for two reasons:

* it doesn't show a race, it shows conflicting instructions. Checkout A then checkout B, and it does as it's told. Safe checkout is safe because it won't nuke unstaged or untracked files, but it's still allowed to set the target tree to match the given commit.

* safe_checkout is a plumbing function which expects to have a repository lock on the caller side. Once these are inter-process safe, safe-checkout will not easily be interleaved with conflicting calls.

Something that safe_checkout could do on top is to immediately grab a lock on the .git/index to prevent other checkouts, for instance by Git itself, from happening. This is definitely a possible improvement and one that I think should be done.

For example, right now, it would be possible for one but commit to wait for a commit-message, and a Git commit to come in in the meantime that will happily be overwritten once the commit message clears.

Will see to following this up.

I see, this makes sense. So do i understand correctly that we mainly wanna address this at a "higher level"/app logic, like this test targets #13079 ?

@Byron
Copy link
Copy Markdown
Collaborator

Byron commented Apr 2, 2026

@krlvi Yes, the higher level would be the way.
Sorry for the late response, still catching up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants