perf: optimize block creation by alpe · Pull Request #3093 · evstack/ev-node

alpe · 2026-02-19T15:49:24Z

Bring moonshot 2 optimizations to main

Performance Optimizations Analysis (alex/bench_phase1_start vs HEAD)

Summary: Targeted optimizations focused on reducing CPU overhead (marshalling/hashing) and I/O latency (store reads) in the block production hot path.

1. Executor: In-Memory Caching for Block Production

Change: Executor now caches lastHeaderHash, lastDataHash, and lastSignature in memory after committing a block.
Why: Eliminates the need to call store.GetBlockData and store.GetSignature (database I/O) at the start of CreateBlock. This significantly speeds up the "cold start" of a new block round.

2. Executor: Avoid Re-hashing

Change: CreateBlock reuses newState.LastHeaderHash (computed during state transition) instead of calling header.Hash() again before broadcasting.
Why: header.Hash() triggers a full Protobuf marshal and SHA-256 hash. Reusing the already-computed hash saves CPU cycles.

3. Executor: Optimized Metrics Collection

Change: Replaced data.Size() with data.TxsByteSize() for the BlockSizeBytes metric.
Why: data.Size() marshals the entire object to Protobuf to get the size. data.TxsByteSize() simply sums the lengths of the transaction byte slices, avoiding expensive allocations and serialization.

4. Types: Cached Public Key Marshalling

Change: Signer struct now caches the marshalled public key bytes (marshalledPubKey).
Why: ToProto methods (called frequently during P2P exchange and storage) used to call crypto.MarshalPublicKey every time. Caching this eliminates repeated serialization overhead.

5. Types: Lightweight Sequence Validation

Change: Introduced AssertValidSequence in State.
Why: Provides a "fast path" validation for self-produced blocks. It checks simple fields (Height, Time, Hash Chaining) but skips the expensive Validate() call, which requires re-marshalling and re-hashing the entire block data to verify the DACommitment.

6. Store/Executor: Pre-serialized writes (`SaveBlockDataFromBytes`)

Change: Added SaveBlockDataFromBytes to the Store interface; signHeader now returns the serialized header bytes.
Why: Allows the Executor to write the block to disk using the exact bytes it already generated for signing/broadcasting, avoiding a redundant MarshalBinary call at the storage layer.

7. Protobuf: Zero-Allocation Timestamp Conversion

Change: In State.ToProto, manually constructed &timestamppb.Timestamp{...}.
Why: Replaces timestamppb.New(time), which performs a heap allocation. Small optimization for a high-frequency operation.

8. Executor: Sequential P2P Broadcast

Change: Changed Header and Data broadcast from parallel (errgroup) to sequential (Header then Data).
Why: Correctness fix that also aids performance stability. Prevents peers from rejecting Data because they haven't received the Header yet, preventing network churn and retries.

go test -bench=BenchmarkProduceBlock -count=5 -benchmem -run='^$'  ./block/internal/executing/...

This branch:

goos: darwin
goarch: arm64
pkg: github.com/evstack/ev-node/block/internal/executing
cpu: Apple M2 Max
BenchmarkProduceBlock/single_tx-12                 43686             26992 ns/op            7205 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 45000             27086 ns/op            7305 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 44564             27181 ns/op            7263 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 44956             27089 ns/op            7295 B/op         80 allocs/op
BenchmarkProduceBlock/single_tx-12                 44187             27195 ns/op            7234 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   36262             32719 ns/op           25766 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   36657             32518 ns/op           25758 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   37702             32044 ns/op           25739 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   37425             31974 ns/op           25744 B/op         80 allocs/op
BenchmarkProduceBlock/100_txs-12                   37120             32146 ns/op           25750 B/op         80 allocs/op
BenchmarkProduceBlock/empty_batch-12               46038             26368 ns/op            6992 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               45804             26304 ns/op            6944 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               45894             26368 ns/op            6975 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               46137             26293 ns/op            6986 B/op         70 allocs/op
BenchmarkProduceBlock/empty_batch-12               44925             26837 ns/op            6846 B/op         70 allocs/op
PASS
ok      github.com/evstack/ev-node/block/internal/executing     23.345s

base alex/bench_phase1_start

goos: darwin
goarch: arm64
pkg: github.com/evstack/ev-node/block/internal/executing
cpu: Apple M2 Max
BenchmarkProduceBlock/empty_batch-12               14835             80996 ns/op           11042 B/op        155 allocs/op
BenchmarkProduceBlock/empty_batch-12               14977             80256 ns/op           11031 B/op        154 allocs/op
BenchmarkProduceBlock/empty_batch-12               15104             81055 ns/op           11024 B/op        154 allocs/op
BenchmarkProduceBlock/empty_batch-12               14895             81077 ns/op           11035 B/op        154 allocs/op
BenchmarkProduceBlock/empty_batch-12               14671             80922 ns/op           11048 B/op        154 allocs/op
BenchmarkProduceBlock/single_tx-12                 14695             81520 ns/op           11605 B/op        171 allocs/op
BenchmarkProduceBlock/single_tx-12                 14871             81326 ns/op           11592 B/op        170 allocs/op
BenchmarkProduceBlock/single_tx-12                 14636             81783 ns/op           11605 B/op        170 allocs/op
BenchmarkProduceBlock/single_tx-12                 14800             81486 ns/op           11595 B/op        170 allocs/op
BenchmarkProduceBlock/single_tx-12                 14617             81957 ns/op           11606 B/op        170 allocs/op
BenchmarkProduceBlock/100_txs-12                   12792             93848 ns/op           47152 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   12963             93155 ns/op           47128 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   13046             92560 ns/op           47121 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   12789             93442 ns/op           47141 B/op        277 allocs/op
BenchmarkProduceBlock/100_txs-12                   12891             93534 ns/op           47133 B/op        277 allocs/op
PASS
ok      github.com/evstack/ev-node/block/internal/executing     31.700s

… executor comments.

github-actions · 2026-02-19T15:49:42Z

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`⏩ skipped`	`✅ passed`	`✅ passed`	Feb 20, 2026, 12:34 PM

gemini-code-assist · 2026-02-19T15:49:52Z

Summary of Changes

Hello @alpe, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on significant performance improvements for block execution and storage operations. By introducing caching mechanisms, optimizing data serialization, and refining key generation, the changes aim to reduce CPU and memory overhead, particularly in high-throughput scenarios. The modifications also streamline the block production pipeline by avoiding redundant computations and ensuring correct P2P broadcast ordering, contributing to a more efficient and robust node operation.

Highlights

Performance Optimizations: Implemented several performance enhancements across block execution, storage, and serialization paths. This includes reducing memory allocations during key generation and protobuf marshalling, and optimizing data access patterns.
Caching and Amortization: Introduced in-memory caching for last block data (header hash, data hash, signature) to avoid redundant store reads during block creation. Pending block checks are now amortized to reduce the frequency of expensive store scans.
Streamlined Block Production: Refactored block production logic to use a new SaveBlockDataFromBytes method, allowing pre-serialized data to be saved directly. Validation for self-produced blocks now uses a lightweight AssertValidSequence to skip expensive re-computations.
P2P Broadcast Ordering: Ensured sequential broadcasting of block headers before block data to the P2P network, addressing potential validation failures due to out-of-order delivery on peers.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog

block/internal/executing/executor.go
- Removed errgroup import to simplify concurrency model.
- Added hasPendingBlock, lastHeaderHash, lastDataHash, lastSignature, and pendingCheckCounter fields to Executor for caching and amortized checks.
- Initialized and warmed the new last-block cache fields in initializeState.
- Modified ProduceBlock to amortize pending limit checks, reducing store scans.
- Optimized ProduceBlock to use hasPendingBlock flag to conditionally check for pending blocks in the store.
- Updated signHeader call in ProduceBlock to retrieve serialized header bytes, avoiding re-marshalling.
- Replaced blockProducer.ValidateBlock with currentState.AssertValidSequence for lightweight validation of self-produced blocks.
- Removed deletePendingBlock call and explicitly set hasPendingBlock to false after commit.
- Updated the in-memory last-block cache after successful block production.
- Changed P2P broadcasting from parallel errgroup to sequential, ensuring header broadcast before data.
- Implemented a fast path in CreateBlock to use cached last block data, falling back to store reads if the cache is empty.
- Optimized ApplyBlock to conditionally allocate rawTxs slice and used a dedicated context for transaction execution.
- Modified signHeader to return both the signature and the serialized header bytes.
- Updated recordBlockMetrics to use data.TxsByteSize() for more accurate block size metrics.
pkg/store/batch.go
- Updated ds.NewKey calls to ds.RawKey for direct key usage.
- Introduced SaveBlockDataFromBytes to save pre-serialized block data, avoiding redundant marshalling.
- Refactored SaveBlockData to utilize the new SaveBlockDataFromBytes method.
pkg/store/keys.go
- Added heightKey helper function for efficient, allocation-minimal key generation for height-based keys.
- Updated GetHeaderKey, getDataKey, GetSignatureKey, and getStateAtHeightKey to use the new heightKey function.
- Simplified getHeightKey to directly return a constant string.
pkg/store/kv.go
- Removed unused path import.
- Refactored GenerateKey to use strings.Builder for more efficient string concatenation and reduced allocations.
pkg/store/tracing.go
- Added tracing instrumentation for the new SaveBlockDataFromBytes method.
pkg/store/tracing_test.go
- Added a mock implementation for the SaveBlockDataFromBytes method for testing purposes.
pkg/store/types.go
- Added SaveBlockDataFromBytes to the Batch interface definition.
types/data.go
- Added TxsByteSize() method to calculate the total byte size of transactions without protobuf marshalling, for cheaper metrics.
types/serialization.go
- Modified SignedHeader.ToProto and SignedData.ToProto to use a cached marshalled public key from the Signer struct.
- Updated SignedHeader.FromProto and SignedData.FromProto to populate the cached marshalled public key.
- Optimized State.ToProto to avoid timestamppb.New allocation by directly constructing the timestamppb.Timestamp.
types/signer.go
- Added marshalledPubKey field to Signer struct to cache marshalled public key bytes.
- Implemented MarshalledPubKey() method to retrieve the cached public key or marshal it if not present.
- Updated NewSigner to pre-cache the marshalled public key upon creation.
types/state.go
- Added AssertValidSequence method for lightweight structural validation of self-produced blocks, bypassing expensive DACommitment re-computation.

Activity

No specific review comments or activity have been recorded for this pull request yet.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

claude · 2026-02-19T15:49:55Z

Claude finished @alpe's task in 4m 36s —— View job

Code Review

Tasks

Gather context and understand request
Read changed files: executor.go, types, store packages
Review code quality, correctness, and architecture
Review performance and resource efficiency
Review testing and reliability
Post final review

Great set of optimizations with solid benchmarks backing them up. The ~3x throughput improvement and ~50% allocation reduction are impressive. A few issues need addressing before merge:

Bugs / Correctness

1. Optimization #6 is incomplete — signHeader bytes are discarded [block/internal/executing/executor.go:534]

signHeader was refactored to return (Signature, []byte, error) specifically to allow the caller to pass pre-serialized bytes to SaveBlockDataFromBytes. However the bytes are silently discarded with _ and the old SaveBlockData (which re-marshals) is used:

// line 534
signature, _, err := e.signHeader(&header.Header)
// ...
// line 553
if err := batch.SaveBlockData(header, data, &signature); err != nil {

The optimization is wired halfway — the interface change and signHeader change happened, but the caller was never updated to use SaveBlockDataFromBytes. Either complete the wiring or revert signHeader to return only (Signature, error) to avoid the misleading dead return value.

Note: this only works if AggregatorNodeSignatureBytesProvider produces the same bytes that header.MarshalBinary() would produce. If the signing payload is different (e.g., ev-abci uses a custom provider), then the bytes cannot be reused for storage and this optimization may not be universally applicable. Fix this →

2. lastBlockInfo cache not invalidated on rollback [block/internal/executing/executor.go:601–605]

e.lastBlockInfo.Store(&lastBlockInfo{
    headerHash: newState.LastHeaderHash,
    dataHash:   data.Hash(),
    signature:  signature,
})

If store.Rollback is triggered (e.g., during replay/sync), the in-memory lastBlockInfo and lastState caches would be stale. A subsequent CreateBlock using the cached values would build on a rolled-back block, producing a chain fork. Is there a guarantee that Rollback is never called while the executor is running? If not, this cache needs to be invalidated on rollback or the rollback path needs to update these caches.

Thread Safety

3. Signer.MarshalledPubKey() lazy init without synchronization [types/signer.go:19–32]

func (s *Signer) MarshalledPubKey() ([]byte, error) {
    if len(s.marshalledPubKey) > 0 {
        return s.marshalledPubKey, nil
    }
    // ...
    s.marshalledPubKey = bz  // unsynchronized write
    return bz, nil
}

This is a check-then-act without synchronization. If the same SignedHeader is ever accessed from multiple goroutines (e.g., passed to the P2P layer which may read it concurrently), this is a data race. In the current sequential broadcast path it's likely safe, but it's fragile. Using sync.Once or atomic.Pointer would be safer, or simply ensuring marshalledPubKey is always populated at construction time (as NewSigner does). Note that CreateBlock constructs types.Signer inline without calling NewSigner, so the cache is never pre-populated for locally produced blocks:

// executor.go:738-743
Signer: types.Signer{
    PubKey:  pubKey,        // marshalledPubKey is zero
    Address: e.genesis.ProposerAddress,
},

4. pendingCheckCounter not atomic [block/internal/executing/executor.go:73]

pendingCheckCounter uint64

This field is only accessed from the execution loop goroutine, so it's safe today. But it's a plain field adjacent to atomic.Pointer and atomic.Bool fields, which creates a misleading inconsistency. Consider making it atomic.Uint64 for clarity, or add a comment stating it's only accessed from executionLoop.

Design / Correctness Concerns

5. AssertValidSequence skips DataHash and ValidatorHash [types/state.go:70–94]

The new lightweight validation intentionally skips Validate(header, data) which checks DataHash. This is the right trade-off for self-produced blocks, but ValidatorHash is also not checked. If a validator rotation happened and produced an incorrect ValidatorHash, it would not be caught. The comment says "skip the expensive Validate() / DACommitment() re-computation" but DACommitment checks DataHash, not ValidatorHash. ValidatorHash check is cheap to add and would catch a class of bugs. Worth documenting explicitly which fields are intentionally skipped and why.

6. savePendingBlock never sets hasPendingBlock = true [block/internal/executing/executor.go:514]

if err := e.savePendingBlock(ctx, header, data); err != nil {
    return fmt.Errorf("failed to save block data: %w", err)
}
// hasPendingBlock is still false here

The flag only transitions false → true in initializeState. If the node crashes after savePendingBlock but before batch.Commit, on restart initializeState correctly finds and sets the flag. This is consistent with its purpose as a "crash recovery" indicator. However, the semantics are subtle: hasPendingBlock means "a pending block from a previous crashed cycle exists" not "a pending block is currently being produced". Consider renaming to recoveredPendingBlock or adding a comment to clarify this invariant.

Code Quality

7. signatureHash misleading name [pkg/store/batch.go:66]

signatureHash := *signature

signature is already a types.Signature (i.e., []byte). Naming the copy signatureHash implies it's a hash, but it's the raw signature bytes. This was noted in Gemini's review too. Rename to signatureCopy or sig. Fix this →

8. getHeightKey format inconsistency [pkg/store/keys.go:90–92]

func getHeightKey() string {
    return "/" + heightPrefix
}

All other height-related keys now use the heightKey() helper which produces /prefix/height. But getHeightKey returns /t directly (no height component). This is fine since it's the "current height" key, but the heightPrefix = "t" constant is otherwise only used here, while heightKey() is used for all block-indexed keys. The inconsistency is harmless but the heightPrefix constant is confusingly named given it's only used in getHeightKey.

9. GenerateKey drops path.Clean [pkg/store/kv.go:43–56]

The old implementation presumably used path.Clean. The new strings.Builder version is cleaner and more efficient, but if any caller passes fields with leading/trailing slashes or empty strings, the generated key format changes. All current callers appear to use well-behaved string constants, so this is low risk. A one-line comment noting the inputs are assumed clean would prevent future bugs.

10. timestamppb.Timestamp manual construction [types/serialization.go:377–378]

LastBlockTime: &timestamppb.Timestamp{Seconds: secs, Nanos: nanos},

timestamppb.New() validates that Nanos is in [0, 999999999]. Go's time.Nanosecond() is always in that range, so this is safe. Add a brief comment to make this invariant explicit, since a future refactor replacing time.Nanosecond() with something else could silently produce an invalid protobuf.

Performance (Minor)

11. Sequential broadcast adds latency to block cycle [block/internal/executing/executor.go:607–620]

The correctness argument for sequential broadcast is sound. But this adds the latency of header broadcast to the block production cycle, potentially increasing block times under slow P2P conditions. Consider whether the broadcast should happen in a background goroutine after the batch is committed (with the header-before-data ordering preserved). The current approach is the safest and likely acceptable, but worth profiling under network load.

Testing

12. No test for AssertValidSequence [types/state.go:70]

The new lightweight validation method has no dedicated unit tests. Given it's now on the critical path for every produced block (replacing ValidateBlock), it deserves table-driven tests covering: wrong chain ID, wrong height, time before last block, mismatched last header hash, mismatched app hash, and initial state (empty LastHeaderHash).

13. SaveBlockDataFromBytes has no test [pkg/store/batch.go:62]

The tracing wrapper and mock exist, but there's no test verifying that SaveBlockDataFromBytes produces the same stored result as SaveBlockData with the same inputs. A round-trip test would catch any hash computation differences.

Summary

The performance results are compelling and the architecture is sound. The most actionable items before merge:

Must fix: Item ABCI interface #1 (incomplete optimization wiring / misleading return value)
Should address: Item Peer discovery #2 (cache invalidation on rollback), Item Transaction gossiping #3 (MarshalledPubKey thread safety), Item Initial project setup #12 (test coverage for AssertValidSequence)
Nice to have: Items Mempool #5–Fold the validator roots into the state tree #11

gemini-code-assist

Code Review

The pull request introduces several performance optimizations aimed at reducing redundant store reads and protobuf marshaling/unmarshaling. Key changes include caching lastHeaderHash, lastDataHash, and lastSignature in the Executor struct, amortizing pending limit checks, and introducing SaveBlockDataFromBytes to avoid re-serialization. Additionally, TxsByteSize is added for more efficient metrics reporting, and MarshalledPubKey caches public key serialization. The signHeader function now returns serialized header bytes to prevent redundant marshaling. Overall, these changes should lead to a more efficient block production pipeline.

block/internal/executing/executor.go

gemini-code-assist · 2026-02-19T15:52:02Z

block/internal/executing/executor.go

 	"github.com/libp2p/go-libp2p/core/crypto"
 	"github.com/rs/zerolog"
-	"golang.org/x/sync/errgroup"



The golang.org/x/sync/errgroup import was removed, but the corresponding usage in ProduceBlock (lines 548-558 in the old code) was replaced with sequential broadcasting. This is a functional change that might impact the overall block production time if broadcasting was previously done concurrently. While the comment on line 598 explains the sequential broadcast, it's worth noting the performance implication of this change.

gemini-code-assist · 2026-02-19T15:52:02Z

block/internal/executing/executor.go

-	// For based sequencer, this will return an empty signature
-	signature, err := e.signHeader(header.Header)
+	// For based sequencer, this will return an empty signature.
+	signature, _, err := e.signHeader(&header.Header)


The signHeader function now returns three values, but only two are being captured here. The headerBytes (second return value) is not being used, which could be passed to batch.SaveBlockDataFromBytes to avoid re-marshalling the header. This would further optimize the block production process.

Suggested change

signature, _, err := e.signHeader(&header.Header)

signature, headerBytes, err := e.signHeader(&header.Header)

gemini-code-assist · 2026-02-19T15:52:02Z

pkg/store/batch.go

-	height := header.Height()
-	signatureHash := *signature
-
 	headerBlob, err := header.MarshalBinary()


The height variable is not used in this function after the refactoring. It can be removed.

gemini-code-assist · 2026-02-19T15:52:02Z

pkg/store/batch.go

-	signatureHash := *signature
-
 	headerBlob, err := header.MarshalBinary()
 	if err != nil {


The signatureHash variable is not used in this function after the refactoring. It can be removed.

gemini-code-assist · 2026-02-19T15:52:02Z

pkg/store/batch.go

+// SaveBlockDataFromBytes saves pre-serialized block data to the batch.
+// This avoids re-marshalling header and data when the caller already has the binary blobs.
+func (b *DefaultBatch) SaveBlockDataFromBytes(header *types.SignedHeader, headerBlob, dataBlob []byte, signature *types.Signature) error {
+	height := header.Height()


The height variable is already available from header.Height() and signatureHash from *signature. These lines are redundant and can be removed.

Suggested change

height := header.Height()

height := header.Height()

signatureHash := *signature

gemini-code-assist · 2026-02-19T15:52:02Z

pkg/store/keys.go

@@ -39,29 +39,40 @@ const (
 	heightPrefix    = "t"


The heightPrefix constant is defined but not used in the new heightKey function. It seems like the intention was to use it as the prefix, but it's hardcoded as 't' in the constant definition and then 'h', 'd', 'c', 's' in the heightKey calls. This could lead to confusion or incorrect key generation.

gemini-code-assist · 2026-02-19T15:52:02Z

pkg/store/keys.go

 func getHeightKey() string {
-	return GenerateKey([]string{heightPrefix})
+	return "/" + heightPrefix


The getHeightKey function now returns a hardcoded string "/t". This is inconsistent with the heightKey function which dynamically generates keys based on a prefix and height. If heightPrefix is intended to be used here, it should be passed to heightKey or directly used in the string concatenation.

Suggested change

func getHeightKey() string {

return GenerateKey([]string{heightPrefix})

return "/" + heightPrefix

func getHeightKey() string {

return heightKey(heightPrefix, 0) // Or whatever default height is appropriate

}

gemini-code-assist · 2026-02-19T15:52:02Z

pkg/store/kv.go

 func GenerateKey(fields []string) string {
-	key := "/" + strings.Join(fields, "/")
-	return path.Clean(key)
+	// Pre-calculate total size to avoid re-allocation.
+	n := 0
+	for _, f := range fields {
+		n += 1 + len(f) // '/' + field
+	}
+	var b strings.Builder
+	b.Grow(n)
+	for _, f := range fields {
+		b.WriteByte('/')
+		b.WriteString(f)
+	}
+	return b.String()


The GenerateKey function was optimized to use strings.Builder for performance. However, the path.Clean call was removed. While strings.Builder is more efficient for concatenation, path.Clean ensures that the generated key is always in a canonical form (e.g., handling double slashes or ./.. components). Removing it might lead to non-canonical keys if the input fields are not perfectly clean, which could cause issues with datastore lookups.

julienrbrt

nice! mostly lgtm! i think we can delete some old code and delete some struct cache to put them in cached_store instead (better for readability and consistency)

block/internal/executing/executor.go

julienrbrt · 2026-02-19T16:28:15Z

block/internal/executing/executor.go

+
+	// Cached per-block data to avoid store reads + protobuf deserialization
+	// in CreateBlock. Updated after each successful block production.
+	lastHeaderHash types.Hash


I still believe those aren't necessary, because GetBlockData is cached on the second call.
We may simply want to add GetSignature to the pkg/store/cached_store.go for consistency.

I replaced the code to use GetBlockData and ran benchmarks. It is slower with GetBlockData (using a memory DB).

This branch:

BenchmarkProduceBlock/empty_batch-12 43850 26704 ns/op 6758 B/op 70 allocs/op BenchmarkProduceBlock/empty_batch-12 45126 26834 ns/op 6854 B/op 70 allocs/op BenchmarkProduceBlock/empty_batch-12 47043 27658 ns/op 7074 B/op 70 allocs/op BenchmarkProduceBlock/empty_batch-12 45932 25491 ns/op 6985 B/op 70 allocs/op BenchmarkProduceBlock/empty_batch-12 46345 26730 ns/op 7016 B/op 70 allocs/op BenchmarkProduceBlock/single_tx-12 44605 27264 ns/op 7258 B/op 80 allocs/op BenchmarkProduceBlock/single_tx-12 44408 26924 ns/op 7252 B/op 80 allocs/op BenchmarkProduceBlock/single_tx-12 44490 27180 ns/op 7253 B/op 80 allocs/op BenchmarkProduceBlock/single_tx-12 44314 27036 ns/op 7234 B/op 80 allocs/op BenchmarkProduceBlock/single_tx-12 45469 27158 ns/op 7367 B/op 80 allocs/op BenchmarkProduceBlock/100_txs-12 36204 32863 ns/op 25767 B/op 80 allocs/op BenchmarkProduceBlock/100_txs-12 36198 32815 ns/op 25767 B/op 80 allocs/op BenchmarkProduceBlock/100_txs-12 37375 32799 ns/op 25745 B/op 80 allocs/op BenchmarkProduceBlock/100_txs-12 36219 32371 ns/op 25767 B/op 80 allocs/op BenchmarkProduceBlock/100_txs-12 36669 32604 ns/op 25758 B/op 80 allocs/op PASS ok github.com/evstack/ev-node/block/internal/executing 23.490s

With GetBlockData

BenchmarkProduceBlock/empty_batch-12 41150 28790 ns/op 8979 B/op 107 allocs/op BenchmarkProduceBlock/empty_batch-12 42402 28799 ns/op 8966 B/op 107 allocs/op BenchmarkProduceBlock/empty_batch-12 41312 29003 ns/op 8977 B/op 107 allocs/op BenchmarkProduceBlock/empty_batch-12 42162 29241 ns/op 8967 B/op 107 allocs/op BenchmarkProduceBlock/empty_batch-12 41576 28934 ns/op 8973 B/op 107 allocs/op BenchmarkProduceBlock/single_tx-12 41461 29483 ns/op 9471 B/op 120 allocs/op BenchmarkProduceBlock/single_tx-12 40598 29281 ns/op 9484 B/op 120 allocs/op BenchmarkProduceBlock/single_tx-12 41346 29488 ns/op 9472 B/op 120 allocs/op BenchmarkProduceBlock/single_tx-12 40634 28958 ns/op 9483 B/op 120 allocs/op BenchmarkProduceBlock/single_tx-12 38173 29275 ns/op 9523 B/op 120 allocs/op BenchmarkProduceBlock/100_txs-12 30086 39664 ns/op 39148 B/op 226 allocs/op BenchmarkProduceBlock/100_txs-12 30886 39006 ns/op 39126 B/op 226 allocs/op BenchmarkProduceBlock/100_txs-12 30547 39378 ns/op 39135 B/op 226 allocs/op BenchmarkProduceBlock/100_txs-12 31329 38776 ns/op 39115 B/op 226 allocs/op BenchmarkProduceBlock/100_txs-12 31376 38786 ns/op 39113 B/op 226 allocs/op PASS ok github.com/evstack/ev-node/block/internal/executing 23.685s

Ok, interesting. Have you made sure to call store.NewCachedStore on that memdb store?

EDIT: Checked with store.NewCachedStore and it is slightly slower still, a field access will probably always be faster 👍🏾

No, let me try again.

block/internal/executing/executor.go

julienrbrt · 2026-02-19T16:35:28Z

block/internal/executing/executor.go

+			lastSignature = e.lastSignature
+		} else {
+			// Cold start fallback: read from store
+			lastHeader, lastData, err := e.store.GetBlockData(ctx, height-1)


technically not true, without those cached struct, it would have read from cache anyway

types/state.go

… logic.

codecov · 2026-02-20T10:58:04Z

Codecov Report

❌ Patch coverage is 67.41573% with 58 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.91%. Comparing base (dc473e5) to head (985e9f9).
⚠️ Report is 5 commits behind head on main.

Files with missing lines	Patch %	Lines
block/internal/executing/executor.go	73.86%	17 Missing and 6 partials ⚠️
pkg/store/tracing.go	0.00%	11 Missing ⚠️
types/signer.go	62.50%	3 Missing and 3 partials ⚠️
types/data.go	0.00%	5 Missing ⚠️
types/serialization.go	73.33%	4 Missing ⚠️
block/internal/executing/pending.go	0.00%	3 Missing ⚠️
pkg/store/batch.go	70.00%	0 Missing and 3 partials ⚠️
pkg/store/keys.go	80.00%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3093      +/-   ##
==========================================
- Coverage   61.18%   60.91%   -0.27%     
==========================================
  Files         113      113              
  Lines       11444    11617     +173     
==========================================
+ Hits         7002     7077      +75     
- Misses       3654     3740      +86     
- Partials      788      800      +12

Flag	Coverage Δ
combined	`60.91% <67.41%> (-0.27%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

julienrbrt

👏🏾

alpe added 12 commits February 19, 2026 12:26

Introduce phase1 benchmark

7dea42f

Bench refactor

fc7f729

bench: fix monotonically-increasing timestamp + add 100-tx case

a4fb81f

shot1

97bacb4

x

64bc9c7

y

d04ef7a

cgo

eb1145c

z

2ca152f

Revert proto changes and interface

71f4589

Remove unecessay caches

1f17854

Save pending

1bfac03

test: add 100-transaction benchmark for block production and simplify…

08163ad

… executor comments.

github-actions bot assigned alpe Feb 19, 2026

gemini-code-assist bot reviewed Feb 19, 2026

View reviewed changes

Base automatically changed from alex/bench_phase1_start to main February 19, 2026 16:20

alpe changed the title ~~Alex/execute perf~~ perf: optimize block creation Feb 19, 2026

julienrbrt reviewed Feb 19, 2026

View reviewed changes

refactor: remove ValidateBlock method and simplify state validation…

6cf8b0d

… logic.

Extract last block info

985e9f9

alpe marked this pull request as ready for review February 20, 2026 14:02

alpe requested a review from julienrbrt February 20, 2026 14:02

julienrbrt approved these changes Feb 20, 2026

View reviewed changes

julienrbrt merged commit ce18484 into main Feb 20, 2026
35 checks passed

julienrbrt deleted the alex/execute_perf branch February 20, 2026 14:47

tac0turtle mentioned this pull request Feb 22, 2026

perf: optimize encoding and decoding flows #2894

Closed

	signature, _, err := e.signHeader(&header.Header)
	signature, headerBytes, err := e.signHeader(&header.Header)

	height := header.Height()
	height := header.Height()
	signatureHash := *signature

Comments

Conversation

alpe commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Performance Optimizations Analysis (alex/bench_phase1_start vs HEAD)

1. Executor: In-Memory Caching for Block Production

2. Executor: Avoid Re-hashing

3. Executor: Optimized Metrics Collection

4. Types: Cached Public Key Marshalling

5. Types: Lightweight Sequence Validation

6. Store/Executor: Pre-serialized writes (SaveBlockDataFromBytes)

7. Protobuf: Zero-Allocation Timestamp Conversion

8. Executor: Sequential P2P Broadcast

Uh oh!

github-actions bot commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist bot commented Feb 19, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

claude bot commented Feb 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review

Tasks

Bugs / Correctness

Thread Safety

Design / Correctness Concerns

Code Quality

Performance (Minor)

Testing

Summary

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

gemini-code-assist bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

julienrbrt left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

julienrbrt Feb 19, 2026

Choose a reason for hiding this comment

Uh oh!

alpe Feb 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

julienrbrt Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

julienrbrt Feb 20, 2026

Choose a reason for hiding this comment

Uh oh!

alpe commented Feb 19, 2026 •

edited

Loading

6. Store/Executor: Pre-serialized writes (`SaveBlockDataFromBytes`)

github-actions bot commented Feb 19, 2026 •

edited

Loading

claude bot commented Feb 19, 2026 •

edited

Loading

alpe Feb 20, 2026 •

edited

Loading

codecov bot commented Feb 20, 2026 •

edited

Loading

julienrbrt left a comment •

edited

Loading