Skip to content

[Json] Use partition and take in RunEndEncoded decoder#9658

Open
liamzwbao wants to merge 2 commits intoapache:mainfrom
liamzwbao:issue-9645-ree-optimization
Open

[Json] Use partition and take in RunEndEncoded decoder#9658
liamzwbao wants to merge 2 commits intoapache:mainfrom
liamzwbao:issue-9645-ree-optimization

Conversation

@liamzwbao
Copy link
Copy Markdown
Contributor

@liamzwbao liamzwbao commented Apr 3, 2026

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Optimize RunEndEncoded decoder to use partition and take, substantially improving performance (over 2x speedup).

Are these changes tested?

Covered by existing tests

Are there any user-facing changes?

No

@github-actions github-actions bot added the arrow Changes to the arrow crate label Apr 3, 2026
@liamzwbao liamzwbao force-pushed the issue-9645-ree-optimization branch from 568cceb to ffd5849 Compare April 3, 2026 21:46
@liamzwbao liamzwbao marked this pull request as ready for review April 3, 2026 21:49
@liamzwbao
Copy link
Copy Markdown
Contributor Author

Local benchmark results

Benchmarking decode_short_ree_runs_json/131072: Collecting 50 samples in estimated 5.1127 s (650 ite
decode_short_ree_runs_json/131072
        time:   [7.7892 ms 7.8779 ms 7.9584 ms]
        thrpt:  [217.23 MiB/s 219.45 MiB/s 221.95 MiB/s]
 change:
        time:   [−49.865% −49.220% −48.647%] (p = 0.00 < 0.05)
        thrpt:  [+94.730% +96.929% +99.463%]
        Performance has improved.

Benchmarking decode_long_ree_runs_json/131072: Collecting 50 samples in estimated 5.1862 s (700 iter
decode_long_ree_runs_json/131072
        time:   [6.5677 ms 6.6715 ms 6.7831 ms]
        thrpt:  [223.96 MiB/s 227.71 MiB/s 231.31 MiB/s]
 change:
        time:   [−56.851% −56.093% −55.362%] (p = 0.00 < 0.05)
        thrpt:  [+124.03% +127.75% +131.75%]
        Performance has improved.

Benchmarking decode_short_ree_runs_serialize: Collecting 100 samples in estimated 5.8899 s (600 iter
decode_short_ree_runs_serialize
        time:   [8.9996 ms 9.0555 ms 9.1159 ms]
        change: [−49.254% −48.848% −48.441%] (p = 0.00 < 0.05)
        Performance has improved.

Benchmarking decode_long_ree_runs_serialize: Collecting 100 samples in estimated 5.2928 s (600 itera
decode_long_ree_runs_serialize
        time:   [8.4019 ms 8.4652 ms 8.5338 ms]
        change: [−52.747% −52.330% −51.944%] (p = 0.00 < 0.05)
        Performance has improved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Json] RunEndEncoded decoder optimization

1 participant