fix: SIGSEGV in UNION_ALL query with dedup #622
Merged
Fixes: #620
A UNION arm that projects the same scan-derived property expression more than once (e.g. `RETURN b.age, b.age`) causes the schema's insertToScopeMayRepeat / insertToGroupAndScopeMayRepeat to deduplicate the entry. As a result, getExpressionsInScope() on that child is shorter than the union arity.
LogicalUnion::requireFlatExpression and getGroupsPosToFlatten used positional indexing into getExpressionsInScope(), causing an out-of-bounds vector read when an arm had duplicate projections. The read returned a garbage Expression pointer, and dereferencing it raised the SIGSEGV.
Fix: store each child's non-deduplicated projection list in LogicalUnion (childProjections) and use it for positional access instead of the deduplicated schema in-scope list. childProjections is captured before appending flattens, and is preserved across copy().
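A minimal sketch of the planner-side idea, under stated assumptions: `ToyUnion` and `ProjectionList` are hypothetical, expressions are modeled as strings, and the real LogicalUnion interface may look quite different. The point is only that the full projection list is captured at construction, used for positional access, and carried over by copy().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a vector of projected expressions.
using ProjectionList = std::vector<std::string>;

class ToyUnion {
public:
    // Captures each child's projection list as written, duplicates included.
    explicit ToyUnion(std::vector<ProjectionList> childProjections)
        : childProjections_(std::move(childProjections)) {
        // Every arm of a union must project the same number of columns.
        for (const auto& projections : childProjections_) {
            assert(projections.size() == childProjections_.front().size());
        }
    }

    std::size_t arity() const {
        return childProjections_.empty() ? 0 : childProjections_.front().size();
    }

    // Positional access goes through the non-deduplicated list, so any
    // colIdx below arity() is valid for every child.
    const std::string& projectionAt(std::size_t childIdx, std::size_t colIdx) const {
        return childProjections_.at(childIdx).at(colIdx);
    }

    // The projection lists are preserved across copy().
    std::unique_ptr<ToyUnion> copy() const {
        return std::make_unique<ToyUnion>(childProjections_);
    }

private:
    std::vector<ProjectionList> childProjections_;
};
```

Using `at()` rather than `operator[]` is a choice made for this sketch only: an index bug would throw `std::out_of_range` instead of reading past the end of the vector.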
The physical mapper (map_union.cpp) also used the deduplicated in-scope list to build the factorized table for each child, producing a table with fewer columns than the union arity. This caused a column index mismatch during the union scan. Fix: use childProjections instead, so the table has the full arity, with duplicate columns where needed.
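The mapper-side idea can be sketched the same way. `ToyColumn` and `buildTableColumns` are hypothetical, and the source-position bookkeeping is an assumption made for illustration, not the actual factorized table schema. The sketch emits one table column per projection, with repeated expressions pointing at the same source position.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical description of one output column of a child's table.
struct ToyColumn {
    std::string expr;       // projected expression, modeled as a string
    std::size_t sourcePos;  // position of the distinct source expression
};

// Builds one column per projection so the table keeps the full union arity.
// A repeated expression gets its own column but reuses the source position
// assigned at its first occurrence.
inline std::vector<ToyColumn> buildTableColumns(const std::vector<std::string>& projections) {
    std::unordered_map<std::string, std::size_t> sourcePosOf;
    std::vector<ToyColumn> columns;
    columns.reserve(projections.size());
    for (const auto& expr : projections) {
        // emplace leaves an existing entry untouched and returns it.
        auto it = sourcePosOf.emplace(expr, sourcePosOf.size()).first;
        columns.push_back(ToyColumn{expr, it->second});
    }
    // Postcondition: the column count equals the projection count.
    assert(columns.size() == projections.size());
    return columns;
}
```

With projections `b.age, b.age, b.name`, the result has three columns even though only two distinct expressions are read, which keeps the column indices aligned with the union's output positions.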
The two fixes have to land together: fixing only the planner would still leave the physical mapper crashing (out-of-bounds table column read), and fixing only the mapper would still leave the planner crashing.