-
Notifications
You must be signed in to change notification settings - Fork 3.7k
branch-4.0: [fix](cloud) fix packed file warmup cannot read small files (#60160) #60207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…0160) 1. Use rs_meta.fs() instead of storage_resource.value()->fs in warm up functions to support packed files. PackedFileSystem wrapper in rs_meta.fs() handles the index_map lookup and reads from the correct packed file path with proper offset. Without this fix, warm up would try to directly open the segment/index path which does not exist on S3 because the data is actually stored in a packed file. Fixed locations: - CloudWarmUpManager::handle_jobs(): segment and inverted index download - CloudInternalServiceImpl::warm_up_rowset(): segment and inverted index download 2. Fix packed_file_manager to use correct TTL cache type when writing small files to file cache. Previously it always used FileCacheType::NORMAL, causing ttl_cache_size mismatch between source and target clusters during warm up. Now expiration_time is passed through PackedAppendContext and used to set the correct cache type (TTL when expiration_time > 0, NORMAL otherwise).
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes issues with packed file warmup that prevented reading small files. It implements two key fixes: (1) using rs_meta.fs() instead of storage_resource.value()->fs to support packed files through the PackedFileSystem wrapper, and (2) ensuring the correct TTL cache type is used when writing small files to file cache.
Changes:
- Updated warmup code paths to use
rs_meta.fs()which includes PackedFileSystem wrapper for handling packed files - Added
expiration_timetoPackedAppendContextand propagated it through the file cache write path - Added regression test to validate packed file warmup with TTL cache verification
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| test_packed_file_warm_up_cluster_event.groovy | New regression test for packed file warmup with event-driven sync and TTL cache validation |
| rowset_writer_context.h | Added expiration_time calculation for PackedAppendContext based on file_cache_ttl_sec and newest_write_timestamp |
| packed_file_manager.h | Added expiration_time field to PackedAppendContext struct |
| packed_file_manager.cpp | Updated cache write functions to use expiration_time for determining correct cache type (TTL vs NORMAL) |
| cloud_warm_up_manager.cpp | Changed from storage_resource.value()->fs to rs->fs() for segment and index downloads to support packed files |
| cloud_internal_service.cpp | Changed from storage_resource.value()->fs to rs_meta.fs() for segment and index downloads to support packed files |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
skip buildall |
pick: #60160