Conversation
…lysis
- 00-codebase-mapping.md: maps existing PoRBFT v2 to Petri concepts
- 01-implementation-plan.md: 8-phase implementation plan
- 02-risks-and-considerations.md: risks, mitigations, open questions
- petri.md: pitch document in markdown
- Epic #9 created in mycelium with 37 atomic tasks
- Team Mode activated and persisted to AGENTS.md
- Update codebase mapping (v2) with post-stabilisation file paths
- Rewrite implementation plan (v2) with corrected paths and finalized design decisions
- Replace open questions in risks doc with finalized decisions table
- Add VADEMECUM operational guide (team mode, testing, guardrails)
- Update mycelium tasks with test phases and dependency links
New phase for exposing soft finality to SDK consumers. Marked as requiring user coordination before touching ../sdks/.
- Create src/libs/consensus/petri/ directory structure
- Define TransactionClassification enum (PRE_APPROVED, TO_APPROVE, PROBLEMATIC)
- Define StateDelta, PeerDelta interfaces
- Define ContinuousForgeRound, ForgeConfig, ForgeState interfaces
- Define PetriConfig with defaults (2s forge, 10s block, 7/10 threshold, 5-round TTL)
- Define DeltaComparison, RoundDeltaResult interfaces
- Add petriConsensus feature flag (default: false) to SharedState
- Create index.ts entry point stub with type re-exports
- Update VADEMECUM with diagram agent requirement and Phase 8 tests
- Create TransactionClassifier: classify txs as PRE_APPROVED (read-only) or TO_APPROVE (state-changing) based on GCR edit generation
- Create SpeculativeExecutor: simulate GCR edits without mutating state, produce a deterministic delta hash via canonical JSON + SHA-256
- Create canonicalJson utility: deterministic serialization with sorted keys, BigInt handling, Map/Set support
- Extend MempoolTx entity with classification and delta_hash columns
- Add classification queries to Mempool: getByClassification(), getPreApproved(), updateClassification()
- Wire classifier into endpointValidation.ts (gated by petriConsensus flag)
- Add 28 unit tests: canonicalJson determinism, classifier logic, delta hash determinism, all passing
- Add Phase 0 architecture diagram
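The canonicalJson utility mentioned above can be sketched as follows. This is a hedged illustration of the described behavior (sorted keys, BigInt handling, Map/Set support), not the actual implementation; the real utility additionally feeds the output into SHA-256 to produce the delta hash.

```typescript
// Sketch of a canonical-JSON serializer: sorted object keys, BigInt
// rendered as a string, Map/Set converted to plain structures.
// Illustrative only; the project's canonicalJson may differ in detail.
function canonicalJson(value: unknown): string {
    const normalize = (v: unknown): unknown => {
        if (typeof v === "bigint") return v.toString()
        if (v instanceof Map) {
            // Map -> plain object with sorted string keys
            const obj: Record<string, unknown> = {}
            for (const k of [...v.keys()].map(String).sort()) {
                obj[k] = normalize(v.get(k))
            }
            return obj
        }
        if (v instanceof Set) return [...v].map(normalize)
        if (Array.isArray(v)) return v.map(normalize)
        if (v && typeof v === "object") {
            const out: Record<string, unknown> = {}
            for (const k of Object.keys(v).sort()) {
                out[k] = normalize((v as Record<string, unknown>)[k])
            }
            return out
        }
        return v
    }
    return JSON.stringify(normalize(value))
}
```

Determinism here means identical inputs always serialize to identical strings regardless of key insertion order, which is the property the delta hash relies on.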
Add Phase 1 data flow, new module boxes (classifier, speculativeExecutor, canonicalJson), feature flag gate visualization, and modified module entries (Mempool entity, mempool_v2). Phase 0 content preserved.
… validation
Moved Petri classification from endpointValidation (where the tx doesn't exist in the mempool yet) to Mempool.addTransaction() (where the row is created). Classification and delta_hash are now saved atomically with the mempool entry. Also removed unused IsNull/Not imports from mempool_v2.
- Create DeltaAgreementTracker: tracks per-tx delta agreement across forge rounds, promotes at 7/10 threshold, flags after 5-round TTL
- Create ContinuousForge: 2s forge loop with mempool sync, speculative execution, all-to-all delta exchange, agreement evaluation, and classification updates (TO_APPROVE → PRE_APPROVED or PROBLEMATIC)
- Create forgeInstance singleton for RPC handler access
- Add petri_exchangeDeltas RPC handler to manageConsensusRoutines (gated by petriConsensus flag)
- Wire petriConsensusRoutine() in index.ts to create and start the forge
- Add 22 new tests: DeltaAgreementTracker (15 tests covering promotion, flagging, TTL, mixed scenarios, boundary cases) + ContinuousForge state lifecycle (7 tests)
- Total: 50 tests passing across 5 files
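The promotion/flagging rule described above can be illustrated with a minimal tracker: a tx is promoted once at least `threshold` peers report the same delta hash, and flagged PROBLEMATIC after surviving `ttlRounds` rounds without agreement. The names (recordDelta, evaluate) and shapes are assumptions for illustration; the real DeltaAgreementTracker's API may differ.

```typescript
type Verdict = "PENDING" | "PRE_APPROVED" | "PROBLEMATIC"

class DeltaAgreementSketch {
    // txHash -> deltaHash -> vote count
    private votes = new Map<string, Map<string, number>>()
    private firstSeenRound = new Map<string, number>()

    constructor(
        private threshold = 7, // matching votes required (7/10 default)
        private ttlRounds = 5, // rounds before flagging as PROBLEMATIC
    ) {}

    recordDelta(txHash: string, deltaHash: string, round: number): void {
        if (!this.firstSeenRound.has(txHash)) this.firstSeenRound.set(txHash, round)
        const perTx = this.votes.get(txHash) ?? new Map<string, number>()
        perTx.set(deltaHash, (perTx.get(deltaHash) ?? 0) + 1)
        this.votes.set(txHash, perTx)
    }

    evaluate(txHash: string, currentRound: number): Verdict {
        const perTx = this.votes.get(txHash)
        const best = perTx && perTx.size > 0 ? Math.max(...perTx.values()) : 0
        if (best >= this.threshold) return "PRE_APPROVED"
        const firstRound = this.firstSeenRound.get(txHash) ?? currentRound
        if (currentRound - firstRound >= this.ttlRounds) return "PROBLEMATIC"
        return "PENDING"
    }
}
```

The key design point is that promotion and flagging are independent checks: agreement can promote a tx at any round, while the TTL only fires once the tx has lingered without reaching the threshold.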
Add Continuous Forge cyclic flow diagram, DeltaAgreementTracker detail, forgeInstance/RPC bridge, and complete round data flow. All P0/P1/P2 components shown with phase annotations.
- PetriBlockCompiler: compiles PRE_APPROVED txs into blocks at the 10s boundary using the existing orderTransactions() + createBlock() infrastructure
- BFTArbitrator: one BFT round per PROBLEMATIC tx, resolve or reject (the chain never stalls; rejection is fail-safe)
- PetriBlockFinalizer: broadcastBlockHash() + isBlockValid() threshold + insertBlock() + BroadcastManager propagation
- petriConsensusRoutine() full lifecycle: forge → pause → arbitrate → compile → finalize → reset → resume
- Consensus dispatch switching in mainLoop.ts and manageConsensusRoutines.ts (the petriConsensus flag gates which routine is called)
- 21 new tests (BFT threshold, result structures, lifecycle, dispatch)
- All 71 Petri tests passing
- ShardMapper: getShardForAddress() for a single-shard testnet (always 'default'), with the interface ready for multi-shard expansion
- PetriRouter: selectMembers() uses an Alea PRNG seeded with the tx hash for deterministic routing to exactly 2 shard members
- PetriRouter.relay(): sends ValidityData to selected members via the existing nodeCall/RELAY_TX RPC pattern (same as DTR but targeted)
- endpointExecution.ts: when the petriConsensus flag is on, routes via PetriRouter.relay() instead of DTR (early return before the validator check)
- 16 new tests (ShardMapper, selectMembers determinism/count/uniqueness, routing flag gating, response shapes)
- All 87 Petri tests passing
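The deterministic routing idea can be sketched as below. Note the real PetriRouter uses the Alea PRNG; here a simple FNV-1a hash plus an LCG stands in, and selectMembersSketch is a hypothetical name. The property being demonstrated is the one the tests above check: the same tx hash always yields the same 2 unique members.

```typescript
// Hedged sketch of deterministic member selection seeded by a tx hash.
// Alea is replaced with an FNV-1a seed + LCG for brevity (assumption).
function selectMembersSketch(txHash: string, members: string[], count = 2): string[] {
    // Derive a numeric seed from the tx hash (FNV-1a)
    let seed = 0x811c9dc5
    for (const ch of txHash) {
        seed = (seed ^ ch.charCodeAt(0)) >>> 0
        seed = Math.imul(seed, 0x01000193) >>> 0
    }
    // Deterministic partial Fisher-Yates: pick `count` unique members
    const pool = [...members]
    const picked: string[] = []
    for (let i = 0; i < Math.min(count, pool.length); i++) {
        seed = (Math.imul(seed, 1664525) + 1013904223) >>> 0
        const j = seed % pool.length
        picked.push(pool.splice(j, 1)[0])
    }
    return picked
}
```

Because every node derives the same seed from the same tx hash, all nodes agree on which 2 shard members a transaction is routed to without any coordination round.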
- BFTArbitrator: document the roundNumber: -1 sentinel reuse of petri_exchangeDeltas
- endpointExecution: flag the Petri routing early-return mempool flow for P6 verification
- Add soft_finality_at column to the MempoolTx and Transactions entities
- Set soft_finality_at in updateClassification() on PRE_APPROVED
- Add getTransactionFinality() function (chain-first, then mempool)
- Add getTransactionFinality RPC endpoint in rpcDispatch
- Re-export Phase 5 components from petri/index.ts
- Add finality.test.ts (15 tests covering structure, transitions, RPC shape)
Only set soft_finality_at on first PRE_APPROVED classification, not on subsequent updates. Prevents timestamp overwrite if a tx is re-classified.
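A minimal sketch of that guard, with illustrative field and function names (the real update path goes through Mempool.updateClassification()):

```typescript
// Stamp soft finality only on the *first* PRE_APPROVED transition, so a
// later re-classification cannot overwrite the original timestamp.
interface MempoolRow {
    classification: string
    soft_finality_at: number | null
}

function applyClassification(row: MempoolRow, next: string): MempoolRow {
    const updated = { ...row, classification: next }
    if (next === "PRE_APPROVED" && row.soft_finality_at === null) {
        updated.soft_finality_at = Date.now()
    }
    return updated
}
```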
- happyPath.test.ts: full lifecycle (classify → agree → compile → finalize)
- conflictPath.test.ts: double-spend → PROBLEMATIC → BFT resolution/rejection
- byzantineFault.test.ts: Byzantine minority tolerance (f < n/3)
- liveness.test.ts: chain never stalls (empty blocks, bounded PROBLEMATIC TTL)
- featureFlagRollback.test.ts: clean ON/OFF/ON toggle with state isolation
- benchmark.test.ts: DeltaTracker throughput (5K txs), routing (10K calls), BFT O(1)
- Add getPetriForgeInstance() getter to forgeInstance.ts

84 new tests (186 total), 0 failures
- Add @deprecated JSDoc to the SecretaryManager class (1018 lines)
- Add @deprecated to the Secretary RPC handlers (setValidatorPhase, greenlight, getValidatorPhase, getBlockTimestamp) in manageConsensusRoutines.ts
- Add @deprecated to the OmniProtocol consensus handlers (opcodes 0x35-0x38)
- Add @deprecated to the ValidationPhase types in validationStatusTypes.ts
- No deletions: everything is kept for PoRBFT v2 fallback until testnet validation
- Task #119 (flag removal) deferred to post-testnet validation
Reflexion finding: the getBlockTimestamp case block lacked its own deprecation comment — only covered indirectly by the getValidatorPhase group comment. Added direct marker for clarity.
- Add test:petri script to package.json (bun run test:petri)
- Register Petri consensus in the TESTING_MAP.md coverage map
TR1: Configurable Petri params
- Add PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS, PETRI_BLOCK_INTERVAL_MS, PETRI_AGREEMENT_THRESHOLD, PETRI_PROBLEMATIC_TTL_ROUNDS, PETRI_SHARD_SIZE to envKeys.ts
- Add PetriConsensusConfig interface to config/types.ts
- Add petri defaults to config/defaults.ts
- Wire through config/loader.ts with envBool/envInt
- Load into getSharedState.petriConsensus and petriConfig in index.ts

TR2: Docker devnet wiring
- Add PETRI_CONSENSUS env var to all 4 nodes in docker-compose.yml
- Add PETRI_CONSENSUS to devnet/.env.example (default: false)
- Set PETRI_CONSENSUS=true in .env to enable Petri on devnet
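The env wiring above can be sketched like this. envBool/envInt mirror the helper names in the summary, but their exact signatures are assumptions; defaults follow the values quoted earlier (2s forge, 10s block, 7/10 threshold). The env map is passed explicitly here so the sketch is self-contained.

```typescript
// Illustrative env-driven config loading; real loader.ts may differ.
type Env = Record<string, string | undefined>

function envBool(name: string, fallback: boolean, env: Env): boolean {
    const raw = env[name]
    return raw === undefined ? fallback : raw === "true" || raw === "1"
}

function envInt(name: string, fallback: number, env: Env): number {
    const raw = env[name]
    const parsed = raw === undefined ? NaN : parseInt(raw, 10)
    return Number.isNaN(parsed) ? fallback : parsed
}

interface PetriConsensusConfig {
    enabled: boolean
    forgeIntervalMs: number
    blockIntervalMs: number
    agreementThreshold: number
}

function loadPetriConfig(env: Env): PetriConsensusConfig {
    return {
        enabled: envBool("PETRI_CONSENSUS", false, env),
        forgeIntervalMs: envInt("PETRI_FORGE_INTERVAL_MS", 2000, env),
        blockIntervalMs: envInt("PETRI_BLOCK_INTERVAL_MS", 10000, env),
        agreementThreshold: envInt("PETRI_AGREEMENT_THRESHOLD", 7, env),
    }
}
```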
Register petri_block_production and petri_tx_inclusion scenarios in loadgen main.ts so they are discoverable via run-scenario.sh and the petri suite in run-suite.ts.
Submits TX to node-1 and verifies propagation to all other nodes via getTx and getTransactionFinality, ensuring Petri consensus relay layer works across the full cluster.
Registers and collects petri_enabled, petri_forge_running, petri_forge_paused, petri_forge_round, petri_pending_tx_count, and petri_tracker_tx_count gauges for Prometheus observability.
Sends sustained load over configurable rounds, measuring TX throughput, soft/hard finality latency (p50/p95/p99), block production rate, and error rate. Outputs baseline JSON summary.
blockNumber=0 (genesis) is a valid reference but was filtered out by a truthy check. Changed the check to !== undefined in both getAll() and getByClassification(). Closes mycelium #142 (Q-5).
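The bug class is easy to reproduce in miniature; buildWhere below is an illustrative helper, not the actual query builder:

```typescript
// blockNumber = 0 is falsy, so a truthy guard (`if (blockNumber)`) drops
// the genesis reference. `!== undefined` keeps 0 while still skipping an
// absent filter.
function buildWhere(blockNumber?: number): Record<string, number> {
    const where: Record<string, number> = {}
    if (blockNumber !== undefined) {
        where.reference_block = blockNumber
    }
    return where
}
```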
trackedCount is a getter property, not a function — the typeof check for "function" always failed and petri_tracker_tx_count was never set. Add getTrackerCount() on ContinuousForge and use it in MetricsCollector instead of the unsafe (forge as any).tracker cast. Closes mycelium #143 (Q-4/CR-3).
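Why the typeof check failed, in miniature (class names here are illustrative stand-ins for DeltaAgreementTracker and ContinuousForge):

```typescript
// A getter evaluates to its value, so `typeof x.trackedCount` is "number",
// never "function" - a typeof-function guard silently never fires.
class TrackerSketch {
    private txs = new Set<string>(["a", "b"])
    get trackedCount(): number {
        return this.txs.size
    }
}

class ForgeSketch {
    private tracker = new TrackerSketch()
    // Explicit accessor, mirroring the getTrackerCount() added in the fix,
    // avoids reaching into private state with an `as any` cast.
    getTrackerCount(): number {
        return this.tracker.trackedCount
    }
}
```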
A resolved TX from arbitration could still be in the mempool, causing duplicates in the block. Use a hash-keyed Map to deduplicate before ordering. Closes mycelium #144 (CR-4).
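A sketch of the hash-keyed deduplication (tx shapes are illustrative; the real compiler merges arbitration results with the mempool snapshot before ordering):

```typescript
interface TxLike {
    hash: string
}

// Merge two tx sources, keeping exactly one entry per hash. Later
// insertions win, so the preferred source is listed last.
function dedupeByHash(resolved: TxLike[], mempool: TxLike[]): TxLike[] {
    const byHash = new Map<string, TxLike>()
    for (const tx of [...mempool, ...resolved]) {
        byHash.set(tx.hash, tx)
    }
    return [...byHash.values()]
}
```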
If getCommonValidatorSeed() or getShard() threw before entering petriConsensusRoutine, startingConsensus stayed true permanently, blocking all future consensus rounds. Wrap in try/finally. Closes mycelium #136 (CR-17).
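The latch pattern behind that fix, shown synchronously for brevity (the real routine is async, and the function names here are illustrative):

```typescript
let startingConsensus = false

// The guard flag must be released even if seed/shard lookup throws
// before the consensus routine runs - hence try/finally.
function runConsensusTick(getSeed: () => string, routine: (seed: string) => void): void {
    if (startingConsensus) return // a round is already starting
    startingConsensus = true
    try {
        const seed = getSeed() // may throw
        routine(seed)
    } finally {
        startingConsensus = false // never left stuck on error
    }
}
```

Without the finally, a single throw from getSeed() would leave the flag set forever and block every subsequent round.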
Mempool cleanup used transactionEntities (all TXs) instead of committedTxHashes (only successfully inserted). Skipped TXs were silently removed from mempool despite not being persisted. Closes mycelium #138 (Q-3).
Unhandled exceptions from getTransactionFinality() propagated without a structured error response, unlike neighboring RPC cases. Closes mycelium #139 (CR-16).
If a round had no TO_APPROVE txs and returned early, stale deltas from the previous round were served via getCurrentDeltas(). Closes mycelium #141 (CR-10).
Actionable comments posted: 8
♻️ Duplicate comments (1)
src/libs/consensus/petri/forge/continuousForge.ts (1)
154-173: ⚠️ Potential issue | 🟠 Major

Reset currentRoundDeltas before any early return. If a round finds no TO_APPROVE txs or aborts before Line 203, getCurrentDeltas() keeps serving the previous round's hashes. The petri_exchangeDeltas handler currently reads that map without a round guard, so peers can record stale deltas against the new round.

🔧 Suggested fix

```diff
 async runForgeRound(): Promise<void> {
     this.state.currentRound++
     this.state.lastRoundStartedAt = Date.now()
     const round = this.state.currentRound
+    this.currentRoundDeltas = {}
     log.debug(`[ContinuousForge] Round ${round} starting`)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/consensus/petri/forge/continuousForge.ts` around lines 154 - 173, The issue is stale per-round deltas persisting when runForgeRound returns early; in runForgeRound (continuousForge.ts) reset/clear the currentRoundDeltas map at the start of the round (or immediately before any early return) so getCurrentDeltas() and the petri_exchangeDeltas handler never serve previous-round hashes; update the method that maintains currentRoundDeltas (referenced as currentRoundDeltas and getCurrentDeltas()) inside runForgeRound to clear it (e.g., new Map() or clear()) before checking TO_APPROVE txs or aborting.
🧹 Nitpick comments (4)
testing/loadgen/src/features/consensus/petri_tx_inclusion.ts (3)
201-206: Latency calculation may be affected by clock skew.
softFinalityAt and hardFinalityAt are server-side timestamps from the blockchain, while txSubmittedAt is a client-side Date.now(). Clock drift between client and server could produce misleading or negative latency values. For devnet testing this is likely acceptable, but worth noting if these metrics are used for performance analysis.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts` around lines 201 - 206, Latency fields softFinalityLatencyMs and hardFinalityLatencyMs use client-side txSubmittedAt which can be skewed vs server finality timestamps (finalityResult.finality.softFinalityAt/hardFinalityAt), causing misleading/negative values; update the calculation in the code that builds these fields (refer to softFinalityLatencyMs, hardFinalityLatencyMs, finalityResult, txSubmittedAt) to either (a) compute latency using a server-side submission timestamp if available (e.g., use finalityResult.submittedAt or a serverTimestamp included in finalityResult) or (b) derive a clock offset by comparing a server time value from the same response and adjust txSubmittedAt before subtraction, and as a safeguard clamp negative results to null (or 0) to avoid reporting negative latencies.
94-94: Consider extracting helper functions to reduce cognitive complexity.

SonarCloud reports cognitive complexity of 23, exceeding the allowed threshold of 15. The function handles setup, transaction submission, and multiple verification phases sequentially.
Consider extracting logical sections into helper functions:
- submitTransaction(demos, recipientAddress, nonce, amount)
- verifyInclusion(rpcUrls, txHash, senderAddress, expectedNonce, timeoutSec, pollMs)

This would improve readability and testability while reducing complexity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts` at line 94, The runPetriTxInclusion function is too cognitively complex; split its setup, submission, and multi-phase verification logic into clear helpers: implement submitTransaction(demos, recipientAddress, nonce, amount) to encapsulate creating and sending the transaction and returning the txHash and any submission metadata, and implement verifyInclusion(rpcUrls, txHash, senderAddress, expectedNonce, timeoutSec, pollMs) to encapsulate the polling/validation across RPC endpoints until the transaction and nonce are observed; refactor runPetriTxInclusion to call these helpers (and additional small helpers for initial setup or teardown if needed) so each helper has a single responsibility and the main function just orchestrates the steps.
133-137: The `as any` casts are unnecessary: the confirm and broadcast methods are properly typed in the SDK.

The same Demos import from @kynesyslabs/demosdk/websdk is used in testing/loadgen/src/transfer_loadgen.ts without any type casting (lines 280 and 284), which proves these methods are available in the SDK's public type definitions. Remove the casts to improve type safety.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts` around lines 133 - 137, Remove the unnecessary any casts by calling the strongly-typed SDK methods directly: replace (demos as any).confirm(signedTx) with demos.confirm(signedTx) and (demos as any).broadcast(validity) with demos.broadcast(validity); keep the existing variables (signedTx, validity) and existing error check but use demos.confirm and demos.broadcast to restore type safety and rely on the SDK's public definitions.

src/libs/blockchain/mempool_v2.ts (1)
297-300: Preserve the existing mempool tie-breakers in classification queries.
getMempool() already orders by timestamp, reference_block, then hash, but getByClassification() only sorts by timestamp. Reusing the same tie-breakers avoids nondeterministic iteration when two txs share a timestamp, especially for callers like getPreApproved() that can feed consensus flows.

♻️ Suggested change

```diff
 return await this.repo.find({
     where,
-    order: { timestamp: "ASC" },
+    order: {
+        timestamp: "ASC",
+        reference_block: "ASC",
+        hash: "ASC",
+    },
 })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/blockchain/mempool_v2.ts` around lines 297 - 300, getByClassification() currently orders only by timestamp which can produce nondeterministic ordering when timestamps tie; update the repository query in getByClassification() to reuse the same tie-breaker ordering used by getMempool() (i.e., order by timestamp, then reference_block, then hash, all ASC) so callers like getPreApproved() see deterministic iteration. Locate getByClassification() in mempool_v2.ts and change its this.repo.find(...) options to include the additional order fields timestamp -> reference_block -> hash matching getMempool().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.env.example:
- Around line 88-93: The PETRI_... keys are triggering dotenv-linter
UnorderedKey warnings because they are not in alphabetical order; reorder the
block so the env keys are sorted lexicographically (e.g.,
PETRI_AGREEMENT_THRESHOLD, PETRI_BLOCK_INTERVAL_MS, PETRI_CONSENSUS,
PETRI_FORGE_INTERVAL_MS, PETRI_PROBLEMATIC_TTL_ROUNDS, PETRI_SHARD_SIZE) and
update the .env.example block containing PETRI_CONSENSUS,
PETRI_FORGE_INTERVAL_MS, PETRI_BLOCK_INTERVAL_MS, PETRI_AGREEMENT_THRESHOLD,
PETRI_PROBLEMATIC_TTL_ROUNDS, PETRI_SHARD_SIZE accordingly to resolve the linter
errors.
- Around line 36-38: Reorder the OMNI variables in .env.example so they follow
the dotenv-linter expected alphabetical order: move OMNI_MODE to come
immediately after OMNI_ENABLED and before OMNI_PORT; specifically adjust the
block containing OMNI_ENABLED, OMNI_MODE, and OMNI_PORT to read OMNI_ENABLED,
OMNI_MODE, OMNI_PORT to resolve the UnorderedKey warning.
In `@src/features/metrics/MetricsCollector.ts`:
- Around line 757-761: collectPetriMetrics() is reading
state.pendingTransactions.size which ContinuousForge never populates so
petri_pending_tx_count stays 0; replace this dead source by querying a live
pending-tx count (either add/expose a method like
ContinuousForge.getPendingTransactionCount() that returns the mempool count or
call the existing mempool/mempoolManager API directly) and use that value for
the "petri_pending_tx_count" gauge, or alternatively update
runForgeRound()/ContinuousForge to maintain state.pendingTransactions correctly
if you prefer keeping the gauge tied to the forge state.
In `@src/libs/blockchain/mempool_v2.ts`:
- Around line 117-125: The current logic leaves transactions as
TransactionClassification.TO_APPROVE when executeSpeculatively(...) returns
success: false, causing them to never reach delta agreement; update the handling
in the block that calls executeSpeculatively (the code that reads
result.classification === TransactionClassification.TO_APPROVE and assigns
deltaHash from specResult.delta) so that when specResult.success is false you do
not persist a stuck TO_APPROVE—either reject the transaction or mark it with a
terminal failure classification (e.g., TransactionClassification.TO_REJECT or a
FAILED_SIMULATION status), clear any delta hash, and persist that updated
classification immediately; ensure this mirrors how
ContinuousForge.runForgeRound() gates recording on success && delta so the
mempool won’t retain non-progressing entries.
In `@src/libs/consensus/petri/forge/continuousForge.ts`:
- Around line 166-200: The code only fetches txs already marked
TransactionClassification.TO_APPROVE via Mempool.getByClassification so
transactions inserted without a persisted classification are never processed;
update the flow to also handle unclassified mempool rows by fetching
TransactionClassification.UNCLASSIFIED (or a combined query) alongside
TO_APPROVE, run classifyTransaction for those entries, persist the derived
classification and any computed delta via Mempool.updateClassification, and then
continue with the existing speculative execution path (symbols to edit:
Mempool.getByClassification, TransactionClassification.TO_APPROVE/UNCLASSIFIED,
classifyTransaction, executeSpeculatively, and Mempool.updateClassification).
- Around line 93-95: pause() currently only sets state.isPaused and doesn't wait
for an active round; ensure in ContinuousForge you track the currently running
round promise (e.g., add a member like currentRoundPromise or
activeRoundPromise) and assign it at the start of runForgeRound() and clear it
on completion. Update pause() to set state.isPaused then await the tracked
currentRoundPromise (if present) so it drains the in-flight round before
returning. Also update runBlockPeriod() to check/await the same tracked promise
before proceeding to arbitration/block compilation to prevent mempool rewrites
during an active round. Ensure runForgeRound() uses finally to clear the promise
so pause() and runBlockPeriod() never await a dangling promise.
In `@testing/devnet/.env.example`:
- Around line 24-29: Reorder the PETRI_* entries in the devnet .env example to
satisfy dotenv-linter's UnorderedKey rule by sorting them consistently (e.g.,
alphabetically or matching the project's canonical env order); specifically
locate the block containing PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS,
PETRI_BLOCK_INTERVAL_MS, PETRI_AGREEMENT_THRESHOLD,
PETRI_PROBLEMATIC_TTL_ROUNDS, and PETRI_SHARD_SIZE and rearrange those keys into
the agreed ordering (for example: PETRI_AGREEMENT_THRESHOLD,
PETRI_BLOCK_INTERVAL_MS, PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS,
PETRI_PROBLEMATIC_TTL_ROUNDS, PETRI_SHARD_SIZE) so dotenv-linter stops reporting
UnorderedKey.
In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts`:
- Around line 217-222: The thrown error in petri_tx_inclusion omits a reason
when soft finality isn't observed; update the error-reasons collection logic to
push a message when finalityResult?.softFinalityObserved is false (e.g., add
"soft finality not observed" alongside the existing checks for nonceWait,
blockAdvance, txByHash and finalityResult?.hardFinalityObserved) so the Error
thrown from petri_tx_inclusion includes that failure cause.
---
Duplicate comments:
In `@src/libs/consensus/petri/forge/continuousForge.ts`:
- Around line 154-173: The issue is stale per-round deltas persisting when
runForgeRound returns early; in runForgeRound (continuousForge.ts) reset/clear
the currentRoundDeltas map at the start of the round (or immediately before any
early return) so getCurrentDeltas() and the petri_exchangeDeltas handler never
serve previous-round hashes; update the method that maintains currentRoundDeltas
(referenced as currentRoundDeltas and getCurrentDeltas()) inside runForgeRound
to clear it (e.g., new Map() or clear()) before checking TO_APPROVE txs or
aborting.
---
Nitpick comments:
In `@src/libs/blockchain/mempool_v2.ts`:
- Around line 297-300: getByClassification() currently orders only by timestamp
which can produce nondeterministic ordering when timestamps tie; update the
repository query in getByClassification() to reuse the same tie-breaker ordering
used by getMempool() (i.e., order by timestamp, then reference_block, then hash,
all ASC) so callers like getPreApproved() see deterministic iteration. Locate
getByClassification() in mempool_v2.ts and change its this.repo.find(...)
options to include the additional order fields timestamp -> reference_block ->
hash matching getMempool().
In `@testing/loadgen/src/features/consensus/petri_tx_inclusion.ts`:
- Around line 201-206: Latency fields softFinalityLatencyMs and
hardFinalityLatencyMs use client-side txSubmittedAt which can be skewed vs
server finality timestamps
(finalityResult.finality.softFinalityAt/hardFinalityAt), causing
misleading/negative values; update the calculation in the code that builds these
fields (refer to softFinalityLatencyMs, hardFinalityLatencyMs, finalityResult,
txSubmittedAt) to either (a) compute latency using a server-side submission
timestamp if available (e.g., use finalityResult.submittedAt or a
serverTimestamp included in finalityResult) or (b) derive a clock offset by
comparing a server time value from the same response and adjust txSubmittedAt
before subtraction, and as a safeguard clamp negative results to null (or 0) to
avoid reporting negative latencies.
- Line 94: The runPetriTxInclusion function is too cognitively complex; split
its setup, submission, and multi-phase verification logic into clear helpers:
implement submitTransaction(demos, recipientAddress, nonce, amount) to
encapsulate creating and sending the transaction and returning the txHash and
any submission metadata, and implement verifyInclusion(rpcUrls, txHash,
senderAddress, expectedNonce, timeoutSec, pollMs) to encapsulate the
polling/validation across RPC endpoints until the transaction and nonce are
observed; refactor runPetriTxInclusion to call these helpers (and additional
small helpers for initial setup or teardown if needed) so each helper has a
single responsibility and the main function just orchestrates the steps.
- Around line 133-137: Remove the unnecessary any casts by calling the
strongly-typed SDK methods directly: replace (demos as any).confirm(signedTx)
with demos.confirm(signedTx) and (demos as any).broadcast(validity) with
demos.broadcast(validity); keep the existing variables (signedTx, validity) and
existing error check but use demos.confirm and demos.broadcast to restore type
safety and rely on the SDK's public definitions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3167fdf1-44b0-44ff-907e-e01c2c6bb126
📒 Files selected for processing (9)
- .env.example
- src/config/loader.ts
- src/features/metrics/MetricsCollector.ts
- src/libs/blockchain/mempool_v2.ts
- src/libs/consensus/petri/block/petriBlockCompiler.ts
- src/libs/consensus/petri/forge/continuousForge.ts
- src/libs/consensus/petri/types/stateDelta.ts
- testing/devnet/.env.example
- testing/loadgen/src/features/consensus/petri_tx_inclusion.ts
✅ Files skipped from review due to trivial changes (1)
- src/libs/consensus/petri/types/stateDelta.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/libs/consensus/petri/block/petriBlockCompiler.ts
- src/config/loader.ts
…shes
Track processed pubkeys in a dedicated Set instead of checking only the
signatures map. Previously, mismatched or invalid-signature submissions
were never added to signatures{} so they got re-evaluated on every poll
iteration, inflating mismatchCount and potentially preventing the
threshold from being reached even when enough valid signatures existed.
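The fix described above can be sketched like this (submission shapes and the placeholder signature value are illustrative):

```typescript
// Track every pubkey already evaluated in a dedicated Set, so invalid or
// mismatched submissions are counted once instead of on every poll pass.
function tallySignatures(
    submissions: Array<{ pubkey: string; valid: boolean }>,
    signatures: Record<string, string>,
    processed: Set<string>,
): { mismatchCount: number } {
    let mismatchCount = 0
    for (const sub of submissions) {
        if (processed.has(sub.pubkey)) continue // already evaluated
        processed.add(sub.pubkey)
        if (sub.valid) {
            signatures[sub.pubkey] = "sig" // placeholder payload
        } else {
            mismatchCount++ // counted exactly once per pubkey
        }
    }
    return { mismatchCount }
}
```

Checking only the signatures map cannot distinguish "never seen" from "seen but invalid", which is exactly what inflated mismatchCount across poll iterations.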
When petriRelay() returned success=false, the TX was silently dropped: the originating node neither relayed it nor added it to its own mempool. It now falls back to Mempool.addTransaction() so the TX is preserved for inclusion in the next block compiled locally.
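A sketch of that fallback, shown synchronously for brevity (the real path is async and goes through Mempool.addTransaction(); names here are illustrative):

```typescript
// If targeted relay fails or throws, keep the tx locally rather than
// dropping it, so it can land in the next locally compiled block.
function routeOrKeep(
    tx: { hash: string },
    relay: (tx: { hash: string }) => boolean,
    addToMempool: (tx: { hash: string }) => void,
): { relayed: boolean } {
    let relayed = false
    try {
        relayed = relay(tx)
    } catch {
        relayed = false
    }
    if (!relayed) addToMempool(tx)
    return { relayed }
}
```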
Actionable comments posted: 6
🧹 Nitpick comments (3)
src/utilities/mainLoop.ts (1)
4-6: Consider aligning import path style.

Line 4 uses the @/ alias while lines 5-6 (and most other imports in this file) use the src/ prefix. For consistency, consider using the same path style throughout.

Suggested diff

```diff
-import { petriConsensusRoutine } from "@/libs/consensus/petri"
+import { petriConsensusRoutine } from "src/libs/consensus/petri"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utilities/mainLoop.ts` around lines 4 - 6, The imports in mainLoop.ts mix alias styles — change the import for petriConsensusRoutine to match the prevailing "src/" prefix (or conversely change other imports if you prefer "@/"), so update the import line referencing petriConsensusRoutine to use the same path style as getCommonValidatorSeed and getShard; locate the import for petriConsensusRoutine and replace "@/libs/consensus/petri" with the consistent "src/libs/consensus/petri" (or update the other two to "@/libs/consensus/v2/routines/...") ensuring all imports in this file use the same alias form.

src/libs/network/endpointExecution.ts (2)
33-33: Inconsistent import path alias.

This import uses @/libs/... while all other local imports in this file use src/libs/... (e.g., lines 1, 2, 4, 5, 16, 18, 31). Consider using a consistent path style for maintainability.

💡 Suggested fix for consistency

```diff
-import { relay as petriRelay } from "@/libs/consensus/petri/routing/petriRouter"
+import { relay as petriRelay } from "src/libs/consensus/petri/routing/petriRouter"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/network/endpointExecution.ts` at line 33, The import for petriRelay currently uses the alias "@/libs/consensus/petri/routing/petriRouter" which is inconsistent with the other local imports in this file; update the import to use the same path style as the rest (e.g., change to "src/libs/consensus/petri/routing/petriRouter") so the symbol petriRelay is imported via the consistent path convention.
335-337: Response message is misleading after fallback.

When relaySuccess is false, the message says "relay pending", but the transaction has actually been added to the local mempool (not pending relay). Consider updating the message for accuracy.

📝 Suggested message clarification

```diff
 message: relaySuccess
     ? "Transaction routed to shard members"
-    : "Transaction accepted locally (relay pending)",
+    : "Transaction accepted locally (added to mempool)",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/network/endpointExecution.ts` around lines 335 - 337, The response message uses relaySuccess to choose between "Transaction routed to shard members" and "Transaction accepted locally (relay pending)" which is inaccurate when relay fails; update the conditional that sets the message (the message property assigned based on relaySuccess in endpointExecution.ts) so the false branch accurately reflects local mempool acceptance — e.g., change to "Transaction accepted locally (added to local mempool)" or similar wording to replace "relay pending".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/libs/consensus/petri/forge/continuousForge.ts`:
- Around line 228-240: The loop in continuousForge.ts calling
Mempool.updateClassification for each txHash (setting
TransactionClassification.PRE_APPROVED and PROBLEMATIC) can race with concurrent
forge rounds or block compilation because Mempool.updateClassification uses
repo.update() without transaction isolation; change the implementation to
perform these classification changes in a single atomic operation—either add a
batched repository method (e.g., Mempool.updateClassifications or
repo.updateMany) that wraps all txHash updates in a DB transaction, or implement
optimistic locking/version checks in Mempool.updateClassification
(compare-and-swap on a version/timestamp) so concurrent rounds cannot overwrite
each other; update continuousForge.ts to call the new batched/transactional API
instead of per-tx await calls.
- Around line 127-129: getState currently returns a shallow copy of this.state
which leaks the internal pendingTransactions Map; callers can mutate the Map via
the returned object. Fix by returning a new ForgeState where pendingTransactions
is cloned (e.g., construct a new Map from this.state.pendingTransactions) and
keep other fields copied (use the existing spread of this.state but replace
pendingTransactions), so the returned object contains a fresh Map instance and
does not expose internal mutable state; update getState to return that cloned
Map version.
- Around line 280-284: Validate and sanitize peer responses before trusting the
shape used by recordDelta: when handling response.response in continuousForge
(the block assigning to peerDeltas[peer.identity]), check that response.response
is an object, that it has a deltas property which is an object, and that each
key and value in deltas is a string (or otherwise matches the expected type)
before adding it to peerDeltas; if validation fails, log/warn and skip that
peer's deltas to avoid passing malformed data into recordDelta.
- Around line 266-294: The delta exchange in continuousForge.ts currently builds
promises from peers (using peer.longCall) and awaits Promise.all, which can hang
a round if any peer call never resolves; wrap each peer.longCall (or the entire
Promise.all) in a Promise.race that rejects/resolves after an absolute timeout
(e.g., 2-3s) so dangling calls won't block the round, ensure you only write into
peerDeltas from the timed/settled responses (the promises array where
peer.longCall is invoked), and keep existing error handling (log.warn) for
timeouts so late responses are ignored and the function returns promptly with
whatever peerDeltas were gathered.
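The timeout wrapper could look like this sketch; resolving to null on timeout (rather than rejecting) keeps Promise.all settling promptly while late responses are simply dropped:

```typescript
// Sketch: bound each peer call with Promise.race so a dead peer cannot stall
// the round. A hung call resolves to null after `ms`, and the caller skips
// null entries when assembling peerDeltas.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T | null> {
  return Promise.race([
    p,
    new Promise<null>((resolve) => setTimeout(() => resolve(null), ms)),
  ]);
}

async function demo(): Promise<Array<string | null>> {
  const fast = new Promise<string>((r) => setTimeout(() => r("fast"), 10));
  const hung = new Promise<string>(() => {}); // never settles
  // Both entries settle within ~100ms regardless of the hung peer.
  return Promise.all([withTimeout(fast, 100), withTimeout(hung, 100)]);
}
```

In continuousForge.ts the `p` argument would be each `peer.longCall(...)` promise, with the timeout derived from the round budget (e.g. 2-3s as the comment suggests).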
- Around line 187-201: Remove the unnecessary double-cast of mempoolTx to
Transaction: pass the existing mempoolTx (already typed as MempoolTx which
implements Transaction) directly into classifyTransaction and
executeSpeculatively instead of using "as unknown as Transaction"; keep the
existing logic that checks classResult.classification ===
TransactionClassification.TO_APPROVE, stores specResult.delta.hash into
localDeltas[mempoolTx.hash], and calls
Mempool.updateClassification(mempoolTx.hash,
TransactionClassification.TO_APPROVE, specResult.delta.hash).
In `@src/libs/network/endpointExecution.ts`:
- Around line 326-329: The fallback call to Mempool.addTransaction (using
queriedTx and validatedData.data.reference_block) lacks error handling and can
cause silent loss while the function still returns success: wrap that await in a
try/catch, log the error (including context like the transaction id or
queriedTx), and set the response/result flag to failure (e.g., result.success =
false or equivalent) - mirror the error handling approach used in the default
mempool insertion path (lines that handle errors and set result.success = false)
so failures are propagated correctly.
---
Nitpick comments:
In `@src/libs/network/endpointExecution.ts`:
- Line 33: The import for petriRelay currently uses the alias
"@/libs/consensus/petri/routing/petriRouter" which is inconsistent with the
other local imports in this file; update the import to use the same path style
as the rest (e.g., change to "src/libs/consensus/petri/routing/petriRouter") so
the symbol petriRelay is imported via the consistent path convention.
- Around line 335-337: The response message uses relaySuccess to choose between
"Transaction routed to shard members" and "Transaction accepted locally (relay
pending)" which is inaccurate when relay fails; update the conditional that sets
the message (the message property assigned based on relaySuccess in
endpointExecution.ts) so the false branch accurately reflects local mempool
acceptance — e.g., change to "Transaction accepted locally (added to local
mempool)" or similar wording to replace "relay pending".
In `@src/utilities/mainLoop.ts`:
- Around line 4-6: The imports in mainLoop.ts mix alias styles — change the
import for petriConsensusRoutine to match the prevailing "src/" prefix (or
conversely change other imports if you prefer "@/"), so update the import line
referencing petriConsensusRoutine to use the same path style as
getCommonValidatorSeed and getShard; locate the import for petriConsensusRoutine
and replace "@/libs/consensus/petri" with the consistent
"src/libs/consensus/petri" (or update the other two to
"@/libs/consensus/v2/routines/...") ensuring all imports in this file use the
same alias form.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2066d833-ec81-4566-9067-912f2120ee9f
📒 Files selected for processing (6)
- src/libs/blockchain/chainBlocks.ts
- src/libs/consensus/petri/coordination/petriSecretary.ts
- src/libs/consensus/petri/forge/continuousForge.ts
- src/libs/network/endpointExecution.ts
- src/libs/network/rpcDispatch.ts
- src/utilities/mainLoop.ts
✅ Files skipped from review due to trivial changes (1)
- src/libs/consensus/petri/coordination/petriSecretary.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/libs/network/rpcDispatch.ts
- src/libs/blockchain/chainBlocks.ts
- Remove unused beforeEach import in happyPath.test.ts
- Remove redundant type annotations in byzantineFault.test.ts
- Reorder OMNI_ keys alphabetically in .env.example
- Add try/catch around fallback mempool insertion in endpointExecution.ts
♻️ Duplicate comments (2)
.env.example (1)
88-93: ⚠️ Potential issue | 🟡 Minor: Reorder PETRI keys to satisfy dotenv-linter.
Line 90 and Line 91 still trigger UnorderedKey; this block should be lexicographically ordered.
Proposed fix:

```diff
-PETRI_CONSENSUS=true
-PETRI_FORGE_INTERVAL_MS=2000
-PETRI_BLOCK_INTERVAL_MS=10000
-PETRI_AGREEMENT_THRESHOLD=7
+PETRI_AGREEMENT_THRESHOLD=7
+PETRI_BLOCK_INTERVAL_MS=10000
+PETRI_CONSENSUS=true
+PETRI_FORGE_INTERVAL_MS=2000
 PETRI_PROBLEMATIC_TTL_ROUNDS=5
 PETRI_SHARD_SIZE=10
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.env.example around lines 88-93: reorder the PETRI environment keys in the .env.example so they are lexicographically sorted to satisfy dotenv-linter: arrange the lines with PETRI_AGREEMENT_THRESHOLD first, then PETRI_BLOCK_INTERVAL_MS, PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS, PETRI_PROBLEMATIC_TTL_ROUNDS, and PETRI_SHARD_SIZE, keeping their values unchanged.
src/libs/network/endpointExecution.ts (1)
326-343: ⚠️ Potential issue | 🟠 Major: Incomplete error handling in fallback path — transaction can still be lost with `success: true` returned.
The fallback path has two issues:
1. Return value not checked: Mempool.addTransaction returns `{ error }` for cases like "Transaction already executed" or "Transaction already in mempool" (see lines 415-430 for the correct pattern). The current code only catches thrown exceptions.
2. Success always returned: whether the mempool insertion fails via exception or returns an error, the function still returns `success: true` at line 339, misleading the client.

🛡️ Proposed fix to match the default mempool path error handling:

```diff
   if (!relaySuccess) {
     // Fallback: add to local mempool so the TX is not lost
     log.warn(
       `[handleExecuteTransaction] Petri relay failed for ${queriedTx.hash}, adding to local mempool`,
     )
     try {
-      await Mempool.addTransaction({
+      const { error } = await Mempool.addTransaction({
         ...queriedTx,
         reference_block: validatedData.data.reference_block,
       })
+      if (error) {
+        log.error(
+          `[handleExecuteTransaction] Fallback mempool add returned error for ${queriedTx.hash}: ${error}`,
+        )
+        return {
+          success: false,
+          response: { message: "Failed to add transaction to mempool" },
+          extra: { error, routing: "petri" },
+          require_reply: false,
+        }
+      }
     } catch (mempoolError) {
       log.error(
         `[handleExecuteTransaction] Fallback mempool insertion also failed for ${queriedTx.hash}: ${mempoolError instanceof Error ? mempoolError.message : String(mempoolError)}`,
       )
+      return {
+        success: false,
+        response: { message: "Failed to add transaction to mempool" },
+        extra: { error: String(mempoolError), routing: "petri" },
+        require_reply: false,
+      }
     }
   }
   return {
     success: true,
     response: {
       message: relaySuccess
         ? "Transaction routed to shard members"
-        : "Transaction accepted locally (relay pending)",
+        : "Transaction accepted locally (added to mempool)",
     },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libs/network/endpointExecution.ts` around lines 326 - 343, The fallback mempool insertion currently only catches thrown exceptions and always returns success: true; update the fallback where Mempool.addTransaction is called (the block inside handleExecuteTransaction that logs `[handleExecuteTransaction] Fallback mempool insertion...`) to inspect the returned value from Mempool.addTransaction (not just exceptions), mirror the pattern used in the default mempool path (check for a returned { error } and handle "Transaction already executed" / "Transaction already in mempool" cases), and if an error is returned set success: false and return an appropriate response payload (or propagate the error) instead of unconditionally returning success: true.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In @.env.example:
- Around line 88-93: Reorder the PETRI environment keys in the .env.example so
they are lexicographically sorted to satisfy dotenv-linter: arrange the lines
with PETRI_AGREEMENT_THRESHOLD first, then PETRI_BLOCK_INTERVAL_MS,
PETRI_CONSENSUS, PETRI_FORGE_INTERVAL_MS, PETRI_PROBLEMATIC_TTL_ROUNDS, and
PETRI_SHARD_SIZE, keeping their values unchanged.
In `@src/libs/network/endpointExecution.ts`:
- Around line 326-343: The fallback mempool insertion currently only catches
thrown exceptions and always returns success: true; update the fallback where
Mempool.addTransaction is called (the block inside handleExecuteTransaction that
logs `[handleExecuteTransaction] Fallback mempool insertion...`) to inspect the
returned value from Mempool.addTransaction (not just exceptions), mirror the
pattern used in the default mempool path (check for a returned { error } and
handle "Transaction already executed" / "Transaction already in mempool" cases),
and if an error is returned set success: false and return an appropriate
response payload (or propagate the error) instead of unconditionally returning
success: true.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 5a7cbcfc-b840-407c-ace9-09e54704d172
📒 Files selected for processing (4)
- .env.example
- src/libs/network/endpointExecution.ts
- testing/petri/byzantineFault.test.ts
- testing/petri/happyPath.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- testing/petri/happyPath.test.ts
- testing/petri/byzantineFault.test.ts
Pass forgeStartedAt timestamp to runBlockPeriod and sleep only the remainder of blockIntervalMs after subtracting forge startup overhead. Prevents cumulative drift across rounds.
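The remainder calculation described in that commit can be sketched as a pure helper; the names forgeStartedAt and blockIntervalMs follow the commit message, but the helper itself is hypothetical:

```typescript
// Sketch of the drift fix: instead of sleeping a full blockIntervalMs each
// round, sleep only the remainder after subtracting the time already spent
// since forgeStartedAt, clamped at zero if the round overran its budget.
function remainingSleepMs(
  forgeStartedAt: number,
  now: number,
  blockIntervalMs: number,
): number {
  const elapsed = now - forgeStartedAt;
  return Math.max(0, blockIntervalMs - elapsed);
}

const sleep1 = remainingSleepMs(1_000, 1_250, 10_000); // 250ms overhead used
const sleep2 = remainingSleepMs(1_000, 12_500, 10_000); // overran: clamp to 0
```

Because each round's sleep absorbs its own startup overhead, the error no longer accumulates across rounds.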
Replace hard-coded 5s ceiling with petriConfig.blockIntervalMs so the wait scales with the configured block interval.
Replace naive shard[0] stub with production-matching electSecretary() that sorts all identities alphabetically. Update test data to use identities that produce deterministic sorted order.
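A sketch of such a deterministic election, assuming identities are compared in plain code-unit order so every node that sees the same membership agrees on the result (the real electSecretary may differ in details):

```typescript
// Sketch: sort all shard identities and take the first. Default Array.sort
// compares UTF-16 code units, which is locale-independent and therefore
// deterministic across nodes; the input is copied so the caller's array is
// not reordered.
function electSecretary(shardIdentities: string[]): string | undefined {
  return [...shardIdentities].sort()[0];
}

const secretary = electSecretary(["node-c", "node-a", "node-b"]);
```

Avoiding localeCompare here is deliberate: locale-sensitive ordering could diverge between nodes with different ICU builds, breaking the determinism the test data relies on.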
Wrap Promise.all in Promise.race with forgeIntervalMs timeout to prevent round stalls from slow/dead peers. Also validate that peer response deltas is a non-array object before using it.
Log a specific reason for soft finality failure alongside the existing hard finality reason in petri_tx_inclusion error output.
Previously, if executeSpeculatively() failed, the TX stayed classified as TO_APPROVE with no deltaHash, causing it to be stuck in the mempool forever. Now marks it as FAILED so the forge skips it.
Add drain() method to ContinuousForge that pauses and awaits any currently running round. Track the round promise via currentRoundPromise. Replace forge.pause() with await forge.drain() in runBlockPeriod so block compilation never races with an in-flight forge round.
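The drain() pattern from that commit, reduced to a self-contained sketch with a timer standing in for a real forge round (the class internals here are illustrative, only the names mirror the commit message):

```typescript
// Sketch: the forge tracks the promise of the round in flight via
// currentRoundPromise. drain() pauses new rounds, then awaits any running
// round, so block compilation never overlaps an in-flight forge round.
class ContinuousForge {
  private paused = false;
  private currentRoundPromise: Promise<void> | null = null;
  public roundsCompleted = 0;

  startRound(): void {
    if (this.paused || this.currentRoundPromise) return;
    this.currentRoundPromise = new Promise((resolve) =>
      setTimeout(() => {
        this.roundsCompleted += 1;
        this.currentRoundPromise = null; // round finished
        resolve();
      }, 20),
    );
  }

  async drain(): Promise<void> {
    this.paused = true; // no new rounds may start
    if (this.currentRoundPromise) await this.currentRoundPromise;
  }
}

async function demo(): Promise<number> {
  const forge = new ContinuousForge();
  forge.startRound();
  await forge.drain(); // waits for the in-flight round to finish
  forge.startRound(); // ignored: forge is paused
  return forge.roundsCompleted;
}
```

The key difference from a bare pause() is the await: pause() only prevents future rounds, while drain() also guarantees the current one has fully finished before runBlockPeriod proceeds.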
TXs arriving via mergeMempools from shard peers have no classification. Add Mempool.getUnclassified() (IsNull query) and classify+speculate them at the top of each forge round before querying TO_APPROVE rows. This ensures merged TXs participate in delta agreement.
Satisfies S6544 — avoid truthy evaluation on a Promise type.
…isation-into-petri
# Conflicts:
#   testing/devnet/docker-compose.yml
…rge report
speculativeExecutor was still using the old Repository-based GCR routine calls (GCRBalanceRoutines.apply(edit, repo, simulate)), which broke after the stabilisation GCR refactor. Updated it to use the HandleGCR.prepareAccounts() + HandleGCR.applyTransaction() pattern, consistent with all other callers.
Added MERGE_REPORT_stabilisation_into_petri.md documenting:
- All 9 stabilisation hotfixes merged
- Conflict resolutions (port staggering)
- Code adaptations needed and completed
- Follow-up action items
https://claude.ai/code/session_01WWNp94cGgFCEVWmTtE2wvV
…ge reports
omniprotocol/handlers/gcr.ts was passing Repository<GCRMain> to GCRIdentityRoutines.apply(), which now expects a GCRMain entity. Load the entity first, pass it, and persist after a successful apply. This was the only merge-introduced type regression (verified via diff against the petri baseline — 69 pre-existing errors unchanged).
https://claude.ai/code/session_01WWNp94cGgFCEVWmTtE2wvV
Merge stabilisation GCR hotfixes into petri