-
Notifications
You must be signed in to change notification settings - Fork 89
feat(core): Adopt two-phase construction for clp::ReadOnlyMemoryMappedFile; Remove boost::iostreams dependency in clp-s.
#1963
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core): Adopt two-phase construction for clp::ReadOnlyMemoryMappedFile; Remove boost::iostreams dependency in clp-s.
#1963
Conversation
clp::ReadOnlyMemoryMappedFile:
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughReplaces ReadOnlyMemoryMappedFile's throwing constructor with a static create(std::string_view) factory that returns ystdlib::error_handling::Result; updates callers to handle result-based mapping, switches internal ownership to optional, and removes Boost::iostreams linkage where mapping wrapper is used. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
components/core/src/clp_s/indexer/CMakeLists.txt (1)
115-116:⚠️ Potential issue | 🔴 CriticalMissing compilation units:
ReadOnlyMemoryMappedFile.cppandFileDescriptor.cppnot in indexer build.The indexer compiles
../ZstdDecompressor.cpp(line 115) and includes../../clp/streaming_compression/zstd/Decompressor.cpp(line 69), both of which depend onclp::ReadOnlyMemoryMappedFile. SinceReadOnlyMemoryMappedFile.cppcontains real implementations (mmap/munmap calls) and depends onFileDescriptor.cpp, both source files must be compiled for the indexer target.Currently, neither
../../clp/ReadOnlyMemoryMappedFile.cppnor../../clp/FileDescriptor.cppare inINDEXER_SOURCES, and the indexer does not linkclp_s::clp_dependencies(which contains these files). This will cause linker errors.Add both source files to
INDEXER_SOURCESor link the indexer againstclp_s::clp_dependencies.components/core/src/clp_s/ZstdDecompressor.cpp (1)
244-271:⚠️ Potential issue | 🟠 MajorErrno-based error handling on mmap failure looks correct.
The new
open(std::string const&)properly:
- Constructs the
ReadOnlyMemoryMappedFileviamake_unique- Checks
is_open()using thefalse ==pattern per coding guidelines- Reports the errno with
strerroron failure- Configures the input stream from
get_view()One concern: on line 248,
m_input_typeis set toMemoryMappedCompressedFilebefore the open check on line 252. Ifis_open()returnsfalse, the method returnsErrorCodeFailurewithm_input_typestill set toMemoryMappedCompressedFile, leaving the object in a partially-initialized state. A subsequent call to any otheropen()overload would throwErrorCodeNotReadyinstead of succeeding. The caller must explicitly callclose()to recover. Consider resettingm_input_typeback toNotInitializedon the error path.Proposed fix
m_memory_mapped_file = std::make_unique<clp::ReadOnlyMemoryMappedFile>(compressed_file_path); if (false == m_memory_mapped_file->is_open()) { auto const error_number{m_memory_mapped_file->get_errno()}; SPDLOG_ERROR( "ZstdDecompressor: Unable to memory map the compressed file with path: {}. errno = " "{} ({}).", compressed_file_path.c_str(), error_number, std::strerror(error_number) ); + m_memory_mapped_file.reset(); + m_input_type = InputType::NotInitialized; return ErrorCodeFailure; }
🤖 Fix all issues with AI agents
In `@components/core/src/clp_s/ZstdDecompressor.cpp`:
- Line 8: Remove the stale Boost header include
"<boost/iostreams/device/mapped_file.hpp>" from ZstdDecompressor.cpp; this
include is no longer used because memory-mapping is handled via
clp::ReadOnlyMemoryMappedFile, so delete the include line and ensure any
lingering references to boost::iostreams::mapped_file are removed or replaced
with clp::ReadOnlyMemoryMappedFile usage (check code in ZstdDecompressor class
and related functions to confirm compilation).
In `@components/core/src/clp/ReadOnlyMemoryMappedFile.cpp`:
- Around line 34-38: The log call in ReadOnlyMemoryMappedFile uses path.data()
(a std::string_view) which may not be null-terminated; change the SPDLOG_ERROR
invocation to pass the std::string_view itself (i.e., path) or explicitly
construct a std::string (e.g., std::string(path)) so the formatting treats it
safely, keeping the same message and ex.what() usage.
- Around line 14-40: The constructor currently returns early for empty files
leaving m_data==nullptr and m_errno==0 which makes is_open() report failure;
change the ReadOnlyMemoryMappedFile constructor to treat an empty file as a
successful open by explicitly marking the object open: set m_buf_size = 0,
m_data = nullptr (as now) and set a success flag (e.g. m_is_open = true) or
adjust is_open() to check an explicit "opened successfully" state instead of
m_data; ensure get_errno() remains 0 for this case and update any code paths
that rely on m_data to handle zero-length buffers (references:
ReadOnlyMemoryMappedFile::ReadOnlyMemoryMappedFile, is_open(), m_errno, m_data,
m_buf_size, get_errno()).
In `@components/core/src/clp/ReadOnlyMemoryMappedFile.hpp`:
- Around line 46-51: The current is_open() implementation (checking m_data,
m_buf_size, and m_errno) treats a successfully opened zero-byte file as not
open; decide and make semantics explicit: either update the doc comment above
is_open() to state "Returns false for empty files since mmap cannot map
zero-length regions" so callers (which reference m_buf_size) don't log
strerror(0) on empty files, or change is_open() to treat
zero-length-but-successful opens as open by using: return m_errno == 0 &&
(m_data != nullptr || m_buf_size == 0); also update callers that log errors to
check m_buf_size==0 and avoid emitting strerror(0) for empty-file cases. Ensure
references to is_open(), m_data, m_buf_size, and m_errno are adjusted
consistently.
LinZhihao-723
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're here, I'd strongly prefer to rewrite this API with result-style error handling for better maintainability. What you will need to do:
- Enable move semantics:
- Add a move constructor which swap the underlying pointer with
nullptr. - In the destructor, if the data is
nullptrthen directly return.
- Add a move constructor which swap the underlying pointer with
- Make the constructor private, and use factory function design pattern that returns a result object indicating the failure.
- For errno, you should be able to directly convert it into std::errc which is
std::error_codecompatible.
- For errno, you should be able to directly convert it into std::errc which is
- Drop
is_openmethod. - Use move semantics without introducing unique pointers.
…ole warning with dev server (fixes y-scope#1892). (y-scope#1949)
…ter startup feedback (fixes y-scope#1495). (y-scope#1940)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@components/core/src/clp/ReadOnlyMemoryMappedFile.cpp`:
- Around line 32-34: The catch block in ReadOnlyMemoryMappedFile.cpp currently
reads errno during stack unwinding (catching TraceableException) which may be
stale; change the flow so FileDescriptor::OperationFailed captures errno at
throw-time (e.g., store the errno value inside the exception object or
ErrorCode_errno payload) and then, in the catch handler for TraceableException
in ReadOnlyMemoryMappedFile (and any similar handlers), use the stored errno
value from the exception (accessor on TraceableException or OperationFailed) to
construct the std::error_code instead of reading the global errno; update the
OperationFailed constructor to accept/capture ::errno and expose it via a method
or member that the catch site can call.
In `@components/core/src/clp/ReadOnlyMemoryMappedFile.hpp`:
- Around line 40-46: The move-assignment operator for ReadOnlyMemoryMappedFile
currently overwrites m_data and m_buf_size without unmapping the existing
mapping, leaking the mmap; update operator=(ReadOnlyMemoryMappedFile&&) to first
check if m_data != nullptr and call munmap(m_data, m_buf_size) (or the class's
unmap helper) on the current object before taking ownership from rhs, then move
m_data and m_buf_size using std::exchange as before; ensure munmap is available
by adding `#include` <sys/mman.h> to the header or move the operator
implementation into the .cpp where munmap is already included.
In `@components/core/src/clp/streaming_compression/zstd/Decompressor.cpp`:
- Around line 178-192: The SPDLOG_ERROR call inside Decompressor::create (the
block that handles ReadOnlyMemoryMappedFile::create failure where
m_memory_mapped_file is set and file_view is obtained) uses the prefix
"ZstdDecompressor"; update that log message to use the consistent prefix
"streaming_compression::zstd::Decompressor" to match other logs in this file
(e.g., the messages near lines with SPDLOG_ERROR/SPDLOG_WARN at the top of
Decompressor) so all log entries reference
streaming_compression::zstd::Decompressor.
components/core/src/clp/streaming_compression/zstd/Decompressor.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@components/core/src/clp/ReadOnlyMemoryMappedFile.hpp`:
- Around line 34-37: The [[nodiscard]] attribute on the deleted copy-assignment
operator is redundant; remove the attribute so the declaration reads simply
"auto operator=(ReadOnlyMemoryMappedFile const&) -> ReadOnlyMemoryMappedFile& =
delete;" (or even "ReadOnlyMemoryMappedFile& operator=(ReadOnlyMemoryMappedFile
const&) = delete;") next to the already-deleted copy constructor in the
ReadOnlyMemoryMappedFile class to clean up dead code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@components/core/src/clp/ReadOnlyMemoryMappedFile.hpp`:
- Around line 43-53: The move-assignment operator in ReadOnlyMemoryMappedFile
uses munmap but the header lacks `#include` <sys/mman.h>, causing compilation when
included transitively; either add `#include` <sys/mman.h> to this header or move
the operator=(ReadOnlyMemoryMappedFile&&) implementation into the .cpp where
sys/mman.h is already included, and while updating, make the null check style
consistent by using nullptr != m_data (instead of m_data != nullptr) and keep
the uses of m_data and m_buf_size intact when calling munmap and exchanging
values from rhs.
clp::ReadOnlyMemoryMappedFile:clp::ReadOnlyMemoryMappedFile; Remove boost::iostreams from clp-s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
components/core/src/clp_s/ZstdDecompressor.cpp (1)
243-269:⚠️ Potential issue | 🟡 Minor
m_input_typeis set beforecreate()— failed open leaves the object in a partially-initialized state.If
ReadOnlyMemoryMappedFile::createfails (line 251),m_input_typeis already set toMemoryMappedCompressedFile(line 247), but the function returnsErrorCodeFailurewithout resetting it. A subsequent call toopen()will throwErrorCodeNotReadybecausem_input_type != NotInitialized. The caller must explicitlyclose()before retrying.This same pattern exists in
clp::streaming_compression::zstd::Decompressor::open(line 175 of that file), so it's a pre-existing convention. However, resettingm_input_typeon the error path would make the API more forgiving:Proposed fix (applies to both decompressors)
if (result.has_error()) { SPDLOG_ERROR( "ZstdDecompressor: Unable to memory map the compressed file with path: {}. Error: " "{}.", compressed_file_path.c_str(), result.error().message() ); + m_input_type = InputType::NotInitialized; return ErrorCodeFailure; }
components/core/src/clp/streaming_compression/zstd/Decompressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Decompressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Decompressor.cpp
Outdated
Show resolved
Hide resolved
LinZhihao-723
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small catch otherwise we should be good to go.
Co-authored-by: Lin Zhihao <[email protected]>
LinZhihao-723
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the PR title, how about:
feat(core): Adopt two-phase construction for `clp::ReadOnlyMemoryMappedFile`; Remove `boost::iostreams` dependency in clp-s.
- This is not just a refactor PR anymore. It changes the API and drops dependencies.
- The scope should be
core.
clp::ReadOnlyMemoryMappedFile; Remove boost::iostreams from clp-s.clp::ReadOnlyMemoryMappedFile; Remove boost::iostreams dependency in clp-s.
Description
This PR follows #445 and #450.
The
clp-ffi-jsproject will need to supportclp-scode in the near future. Removingboost::iostreamsfromclp-swill help simplify the dependency resolution on the ffi side.However,
clp::ReadOnlyMemoryMappedFileusedclp::TraceableExceptionandclp::ErrorCodeinstead of theclp-sversions, resulting in unnecessary complexities when implementing error handling. Hence, we switch to the preferred rust-style error handling withystdlib::error_handling::Resultforclp::ReadOnlyMemoryMappedFile.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
Refactor
API
Tests