Skip to content

Conversation

@ChrisDryden
Copy link
Collaborator

This should solve the issue: #10250 where for large files split can run out of memory. I was able to use the built in integration test support to limit the resources that the test has to mock a scenario where the file is larger than the available memory to trigger the conditions for the bug.

@codspeed-hq
Copy link

codspeed-hq bot commented Jan 14, 2026

Merging this PR will improve performance by ×4.6

⚡ 4 improved benchmarks
✅ 278 untouched benchmarks
⏩ 38 skipped benchmarks1

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Memory split_number_chunks 1,148.1 KB 249.9 KB ×4.6
Memory sort_numeric[500000] 48.7 MB 47.3 MB +3.07%
Memory sort_key_field[500000] 32.8 MB 31.4 MB +4.65%
Simulation split_number_chunks 290.2 µs 258 µs +12.47%

Comparing ChrisDryden:fix-split-number-memory-issue (694c8a2) with main (8a7c3fb)

Open in CodSpeed

Footnotes

  1. 38 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@sylvestre
Copy link
Contributor

@ChrisDryden what is your take on the perf regression ?

@ChrisDryden
Copy link
Collaborator Author

I don't think the perf regression is acceptable, right now io::copy uses a 8kb buffer which is causing the regression. I see head uses a 64kb buffer instead, I'll try seeing if the increased buffer size will mitigate the perf regression

@github-actions
Copy link

GNU testsuite comparison:

Skipping an intermittent issue tests/shuf/shuf-reservoir (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/sort/sort-stale-thread-mem (passes in this run but fails in the 'main' branch)

@ChrisDryden
Copy link
Collaborator Author

Much better now, solved the original issue and there's huge memory and performance improvements. The original fix had a 8kb default buffer and testing locally I found it hit diminishing returns at 128kb.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants