Skip to content

Comments

fix: evicted transactions could reappear after a node restart#20773

Merged
mrzeszutko merged 2 commits intomerge-train/spartanfrom
mrzeszutko/fix-tx-come-back-to-life
Feb 24, 2026
Merged

fix: evicted transactions could reappear after a node restart#20773
mrzeszutko merged 2 commits intomerge-train/spartanfrom
mrzeszutko/fix-tx-come-back-to-life

Conversation

@mrzeszutko
Copy link
Contributor

Summary

  • Fix a bug in TxPoolV2.addPendingTxs where evicted transactions could reappear after a node restart ("come back to life")
  • Move all throwable I/O (buildTxMetaData, getMinedBlockId, validateMeta) out of the transactionAsync callback into a pre-computation phase, so that if any I/O fails, no in-memory or DB mutations have occurred

Background

addPendingTxs processes a batch of transactions inside a single LMDB transactionAsync. When tx N causes nullifier-conflict evictions of earlier pool txs and then tx N+1 triggers a throw (e.g. getTxEffect I/O failure or validator crash), the LMDB transaction rolls back — but in-memory mutations from the eviction persist. On restart the pool rehydrates from DB where the soft-delete marker was never committed, and the evicted tx loads back into the pool.

Other pool methods (prepareForSlot, handlePrunedBlocks, handleFinalizedBlock, etc.) are not affected because they either perform throwable I/O before any deletions, or wrap throwable operations in try/catch via the EvictionManager.

Approach (TDD)

The fix was developed test-first. Two failing tests were written that reproduce the inconsistency — one for each throwable path (getMinedBlockId and validateMeta) — by verifying that getTxStatus returns the same value before and after a pool restart. With the bug present, the status diverges ('deleted' in memory vs 'pending' after restart). The implementation was then applied to make both tests pass.

Implementation

addPendingTxs is now split into two phases:

  1. Pre-computation (outside the transaction): For each tx, compute buildTxMetaData, getMinedBlockId, and validateMeta. If any of these throw, the call fails before any mutations happen.
  2. Transaction (inside transactionAsync): Uses only pre-computed results, in-memory index reads, and buffered DB writes.

Supporting changes:

  • #addTx accepts an optional precomputedMeta parameter (backward-compatible — other call sites unchanged)
  • #tryAddRegularPendingTx receives pre-computed metadata directly instead of calling buildTxMetaData/validateMeta; validation rejection is handled by the caller

Test plan

  • Two new persistence consistency tests pass (were failing before the fix)
  • All 210 tx_pool_v2.test.ts tests pass
  • Full p2p test suite passes (1292 tests)

Copy link
Collaborator

@PhilWindle PhilWindle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks!

@mrzeszutko mrzeszutko merged commit 0da134d into merge-train/spartan Feb 24, 2026
10 checks passed
@mrzeszutko mrzeszutko deleted the mrzeszutko/fix-tx-come-back-to-life branch February 24, 2026 09:09
AztecBot pushed a commit that referenced this pull request Feb 24, 2026
## Summary

- Fix a bug in `TxPoolV2.addPendingTxs` where evicted transactions could reappear after a node restart ("come back to life")
- Move all throwable I/O (`buildTxMetaData`, `getMinedBlockId`, `validateMeta`) out of the `transactionAsync` callback into a pre-computation phase, so that if any I/O fails, no in-memory or DB mutations have occurred

## Background

`addPendingTxs` processes a batch of transactions inside a single LMDB `transactionAsync`. When tx N causes nullifier-conflict evictions of earlier pool txs and then tx N+1 triggers a throw (e.g. `getTxEffect` I/O failure or validator crash), the LMDB transaction rolls back — but in-memory mutations from the eviction persist. On restart the pool rehydrates from DB where the soft-delete marker was never committed, and the evicted tx loads back into the pool.

Other pool methods (`prepareForSlot`, `handlePrunedBlocks`, `handleFinalizedBlock`, etc.) are not affected because they either perform throwable I/O *before* any deletions, or wrap throwable operations in try/catch via the EvictionManager.

## Approach (TDD)

The fix was developed test-first. Two failing tests were written that reproduce the inconsistency — one for each throwable path (`getMinedBlockId` and `validateMeta`) — by verifying that `getTxStatus` returns the same value before and after a pool restart. With the bug present, the status diverges (`'deleted'` in memory vs `'pending'` after restart). The implementation was then applied to make both tests pass.

## Implementation

`addPendingTxs` is now split into two phases:

1. **Pre-computation** (outside the transaction): For each tx, compute `buildTxMetaData`, `getMinedBlockId`, and `validateMeta`. If any of these throw, the call fails before any mutations happen.
2. **Transaction** (inside `transactionAsync`): Uses only pre-computed results, in-memory index reads, and buffered DB writes.

Supporting changes:
- `#addTx` accepts an optional `precomputedMeta` parameter (backward-compatible — other call sites unchanged)
- `#tryAddRegularPendingTx` receives pre-computed metadata directly instead of calling `buildTxMetaData`/`validateMeta`; validation rejection is handled by the caller

## Test plan

- [x] Two new persistence consistency tests pass (were failing before the fix)
- [x] All 210 `tx_pool_v2.test.ts` tests pass
- [x] Full p2p test suite passes (1292 tests)
@AztecBot
Copy link
Collaborator

✅ Successfully backported to backport-to-v4-staging #20752.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants