
split state storage on partitions (step 1) #946

Closed
SmaGMan wants to merge 8 commits into master from feat/split-state-storage

@SmaGMan SmaGMan commented Nov 3, 2025

RATIONALE

Support local state partitions: the shard accounts cell tree is split into partitions by shard at the configured split_depth. The top of the state tree is stored in the main cells db, and each partition subtree is stored in its own physical database, which can be placed on a separate disk.
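
As an illustration of the split, here is a minimal sketch of how an account could be routed to its partition. It assumes the usual tagged shard prefix encoding (the top split_depth bits of the account address followed by a single 1 tag bit), which is consistent with the a000000000000000 and 6000000000000000 keys used in this PR for split_depth 2. The function name partition_prefix is hypothetical and is not taken from the PR code.

```rust
/// Returns the tagged shard prefix of the partition that would own an account.
///
/// `account_prefix` is assumed to be the first 8 bytes of the account address
/// read as a big-endian u64. This is an illustrative helper, not the PR's API.
fn partition_prefix(account_prefix: u64, split_depth: u8) -> u64 {
    assert!(split_depth < 64, "split depth must leave room for the tag bit");
    // The tag bit sits right after the `split_depth` prefix bits.
    let tag = 1u64 << (63 - split_depth);
    // Mask that keeps only the bits above the tag bit (the top `split_depth` bits).
    let mask = !((tag << 1).wrapping_sub(1));
    (account_prefix & mask) | tag
}
```

With split_depth 2, an address starting with 0xab falls into partition a000000000000000 under this encoding; with split_depth 0 the function returns the whole-shard prefix 8000000000000000, which corresponds to the "no partitions" case.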


Pull Request Checklist

NODE CONFIGURATION MODEL CHANGES

[Yes]

Added core_storage.state_parts

...
"core_storage": {
  ...
  "state_parts": {
    "split_depth": N, // split depth: the state is split into 2^N partitions
    "part_dirs" : {
      // key - hex representation of partition shard prefix; value - path to partition database
      "a000000000000000": "path/to/cells-part-a000000000000000",
      ...
    }
  }
  ...
}
...
  • when split_depth is 0, no partitions are used;
  • a custom database path can be set for any subset of partitions, even a single one, so that only that database is moved to a separate disk if required;
  • if the path to a partition database is not specified, the relative path "cells-parts/cells-part-{shard prefix hex}" is used.

The default value is state_parts: null, which means no partitions.
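
The path rules above can be sketched as follows. This is a hedged illustration, not the PR implementation: the names partition_prefixes and part_dir are hypothetical, the upper bound on split_depth is an arbitrary assumption to keep the list small, and the prefix encoding (prefix bits followed by a 1 tag bit) is inferred from the example keys.

```rust
use std::collections::BTreeMap;
use std::path::PathBuf;

/// Lists the tagged shard prefixes of all 2^split_depth partitions.
/// Returns an empty list for depth 0, i.e. "no partitions used".
fn partition_prefixes(split_depth: u8) -> Vec<u64> {
    if split_depth == 0 {
        return Vec::new();
    }
    // Hypothetical sanity bound, not taken from the PR.
    assert!(split_depth <= 16, "unreasonably deep split");
    let tag = 1u64 << (63 - split_depth);
    (0..(1u64 << split_depth))
        .map(|i| (i << (64 - split_depth)) | tag)
        .collect()
}

/// Resolves the database directory of one partition: an explicit entry from
/// `part_dirs` wins, otherwise the default relative path is used.
fn part_dir(prefix: u64, part_dirs: &BTreeMap<String, PathBuf>) -> PathBuf {
    let key = format!("{prefix:016x}");
    match part_dirs.get(&key) {
        Some(path) => path.clone(),
        None => PathBuf::from(format!("cells-parts/cells-part-{key}")),
    }
}
```

For split_depth 2 this yields the four keys 2000000000000000, 6000000000000000, a000000000000000 and e000000000000000; any of them that is missing from part_dirs falls back to the default directory.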

BLOCKCHAIN CONFIGURATION MODEL CHANGES

[None]


COMPATIBILITY

Affected features:

  • [State]
  • [Storage. Blocks]
  • [Storage. States]

Fully compatible.

The state is saved with partitions if they are specified in the config. A state that was saved with partitions is read with partitions; a state that was saved without partitions (e.g. before the update) is read without them.

The partitions map {key: cell hash, value: shard prefix} is saved to CellsDB.shard_states right after the root cell hash. If no partitions are used, nothing is added to the ShardStates value, so existing values in the ShardStates table are treated as "no partitions used".
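
To make the backward compatibility argument concrete, here is a rough sketch of reading such a value. The byte layout is an assumption made only for illustration (a 32-byte root hash followed by zero or more entries of a 32-byte cell hash plus an 8-byte big-endian shard prefix); the PR description does not specify the actual encoding, and StoredState and decode_shard_state_value are hypothetical names.

```rust
use std::collections::BTreeMap;

const HASH_LEN: usize = 32;
// Assumed entry layout: cell hash followed by a big-endian u64 shard prefix.
const ENTRY_LEN: usize = HASH_LEN + 8;

struct StoredState {
    root_hash: [u8; HASH_LEN],
    /// Partition root cell hash -> shard prefix; empty means "no partitions used".
    parts: BTreeMap<[u8; HASH_LEN], u64>,
}

fn decode_shard_state_value(value: &[u8]) -> Option<StoredState> {
    if value.len() < HASH_LEN {
        return None;
    }
    let (root, rest) = value.split_at(HASH_LEN);
    // A pre-update value has nothing after the root hash, so `rest` is empty.
    if rest.len() % ENTRY_LEN != 0 {
        return None;
    }
    let mut root_hash = [0u8; HASH_LEN];
    root_hash.copy_from_slice(root);

    let mut parts = BTreeMap::new();
    for entry in rest.chunks_exact(ENTRY_LEN) {
        let mut hash = [0u8; HASH_LEN];
        hash.copy_from_slice(&entry[..HASH_LEN]);
        let mut prefix = [0u8; 8];
        prefix.copy_from_slice(&entry[HASH_LEN..]);
        parts.insert(hash, u64::from_be_bytes(prefix));
    }
    Some(StoredState { root_hash, parts })
}
```

Under this layout a value written before the update decodes to an empty map, which is exactly the "no partitions used" case, so old entries need no rewriting.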

A new flag HAS_STATE_PARTS = 1 << 13 is added to the BlockHandle bit flags. It means that either no partitions were used or all required state partitions were successfully stored in their separate storages. BlockHandle.has_state() now returns true only when both the new flag and the old HAS_STATE_MAIN = 1 << 3 are set. The migration script (0.0.4 -> 0.0.5) sets HAS_STATE_PARTS for all existing block handles.
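
The flag semantics can be sketched like this. The two constants are the values stated above; the u32 flag width and the free functions has_state and migrate_flags are assumptions for illustration and do not mirror the actual BlockHandle code.

```rust
const HAS_STATE_MAIN: u32 = 1 << 3;
const HAS_STATE_PARTS: u32 = 1 << 13;
const HAS_STATE: u32 = HAS_STATE_MAIN | HAS_STATE_PARTS;

/// The state counts as stored only when both the main part and all
/// partitions (or "no partitions") are marked as stored.
fn has_state(flags: u32) -> bool {
    (flags & HAS_STATE) == HAS_STATE
}

/// What the 0.0.4 -> 0.0.5 migration is described to do for each existing
/// handle: mark partitions as stored, since old states have no partitions.
fn migrate_flags(flags: u32) -> u32 {
    flags | HAS_STATE_PARTS
}
```

A handle that had only HAS_STATE_MAIN before the update would report has_state() == false under the new rule, which is why the migration sets HAS_STATE_PARTS everywhere.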

However, changes to the partitions configuration (e.g. from 4 to 8 partitions, or from 8 to 2 or 0) are not automatically compatible. This will be implemented in a separate task.

Manual compatibility tests passed:

  • set core_storage.state_parts = null or remove param from config
  • build last master version
  • gen local network
 just gen_network 1 --force
  • run node
 just node 1
  • run 20k transfers test
 ./transfers-20k.sh
  • stop node
  • build feat/split-state-storage branch version
  • run node without reset
 just node 1
  • see successful core db migration to 0.0.5 in logs
  • continue 20k transfers test, ensure all is going well
 ./transfers-20k.sh --continue
  • stop node
  • set up 4 partitions in .temp/config1.json
...
  "state_parts" : {
    "split_depth": 2
  }
...
  • run node without reset
 just node 1
  • continue 20k transfers test, ensure all is going well
 ./transfers-20k.sh --continue
  • stop node
  • move some partition databases
 mkdir .temp/db1/cells-parts-moved
 mv .temp/db1/cells-parts/cells-part-a000000000000000 .temp/db1/cells-parts-moved/
 mv .temp/db1/cells-parts/cells-part-6000000000000000 .temp/db1/cells-parts-moved/
  • set up paths to moved databases in .temp/config1.json
...
  "state_parts" : {
    "split_depth": 2,
    "part_dirs": {
      "a000000000000000": "/workspace/tycho/.temp/db1/cells-parts-moved/cells-part-a000000000000000",
      "6000000000000000": "cells-parts-moved/cells-part-6000000000000000"
    }
  }
...
  • run node without reset
 just node 1
  • continue 20k transfers test, ensure all is going well
 ./transfers-20k.sh --continue

SPECIAL DEPLOYMENT ACTIONS

[Not Required]

Without additional changes to the node config, the node keeps working with a single, unsplit state storage.


PERFORMANCE IMPACT

[Expected impact]

  • Better performance on non-empty states (~20-30%)
    • master: degradation on 20k transfers from empty to 30kk state: from ~35k tps to ~15-20k tps
    • 4 local partitions: degradation on 20k transfers from empty to 30kk state: from ~35k tps to ~20-30k tps
  • No states GC lag growth
  • Faster state store

TESTS

Unit Tests

[No coverage]

Network Tests

[No coverage]

Manual Tests

Performance testing:

  • 20k transfers
  • 30k transfers
  • deploy 30kk accounts
  • 20k transfers
  • 30k transfers

(metrics are in the PERFORMANCE IMPACT block)

github-actions bot commented Nov 3, 2025

🧪 Network Tests

To run network tests for this PR, use:

gh workflow run network-tests.yml -f pr_number=946

Available test options:

  • Run all tests: gh workflow run network-tests.yml -f pr_number=946
  • Run specific test: gh workflow run network-tests.yml -f pr_number=946 -f test_selection=ping-pong

Test types: destroyable, ping-pong, one-to-many-internal-messages, fq-deploy, nft-index, persistent-sync

Results will be posted as workflow runs in the Actions tab.

codecov bot commented Nov 3, 2025

Codecov Report

❌ Patch coverage is 36.59023% with 636 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.10%. Comparing base (e7a2152) to head (c303ac7).
⚠️ Report is 45 commits behind head on master.

Files with missing lines                         Patch %  Lines
core/src/storage/shard_state/mod.rs              24.77%   326 Missing and 11 partials ⚠️
core/src/storage/shard_state/cell_storage.rs     41.95%   115 Missing and 4 partials ⚠️
core/src/storage/db.rs                           15.78%   79 Missing and 1 partial ⚠️
block-util/src/block/mod.rs                      8.33%    33 Missing ⚠️
core/src/storage/mod.rs                          36.58%   23 Missing and 3 partials ⚠️
core/src/storage/shard_state/store_state_raw.rs  5.88%    16 Missing ⚠️
block-util/src/dict.rs                           86.88%   6 Missing and 2 partials ⚠️
core/src/storage/config.rs                       89.23%   3 Missing and 4 partials ⚠️
collator/src/collator/execution_manager.rs       0.00%    5 Missing ⚠️
storage/src/context.rs                           77.77%   1 Missing and 3 partials ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #946      +/-   ##
==========================================
- Coverage   46.25%   46.10%   -0.16%     
==========================================
  Files         335      335              
  Lines       60584    61340     +756     
  Branches    60584    61340     +756     
==========================================
+ Hits        28026    28283     +257     
- Misses      31112    31591     +479     
- Partials     1446     1466      +20     


@SmaGMan SmaGMan linked an issue Nov 3, 2025 that may be closed by this pull request
@SmaGMan SmaGMan force-pushed the feat/split-state-storage branch from e01f7f5 to 804fc23 Compare November 4, 2025 00:01
@SmaGMan SmaGMan force-pushed the feat/split-state-storage branch from 804fc23 to c303ac7 Compare November 4, 2025 09:48
@SmaGMan SmaGMan self-assigned this Nov 4, 2025
SmaGMan commented Dec 19, 2025

included in #983
