fix: SIGSEGV in UNION_ALL query with dedup #622

Merged: adsharma merged 1 commit into main from ladybug-620 on Jun 26, 2026
@adsharma
Contributor

Fixes: #620

A UNION arm that projects the same scan-derived property expression more than once (e.g. RETURN b.age, b.age) causes the schema's insertToScopeMayRepeat / insertToGroupAndScopeMayRepeat to deduplicate the entry. As a result, getExpressionsInScope() on that child is shorter than the union arity.
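The deduplication described above can be reproduced with a toy model. This is a minimal sketch, not the project's schema class: `ToySchema`, `buildArmSchema` and the use of plain strings as expressions are assumptions made for illustration, and only the method name `insertToScopeMayRepeat` is borrowed from the description.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the planner's schema. Expressions are plain
// strings here; the real schema stores Expression objects.
struct ToySchema {
    std::vector<std::string> inScope;

    // Mimics the behaviour described in the PR: inserting an expression
    // that is already in scope does not add a second entry.
    void insertToScopeMayRepeat(const std::string& expr) {
        if (std::find(inScope.begin(), inScope.end(), expr) == inScope.end()) {
            inScope.push_back(expr);
        }
    }
};

// Builds the in-scope list for one UNION arm from its projection list.
inline ToySchema buildArmSchema(const std::vector<std::string>& projections) {
    ToySchema schema;
    for (const auto& expr : projections) {
        schema.insertToScopeMayRepeat(expr);
    }
    // Deduplication can only shrink the list, never grow it.
    assert(schema.inScope.size() <= projections.size());
    return schema;
}

// Usage: an arm equivalent to RETURN b.age, b.age has arity 2,
// but only one expression ends up in scope.
inline void demoDedup() {
    const std::vector<std::string> arm = {"b.age", "b.age"};
    const ToySchema schema = buildArmSchema(arm);
    assert(arm.size() == 2);
    assert(schema.inScope.size() == 1);
}
```

Any code that assumes `inScope.size()` equals the union arity is therefore wrong for this arm.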

LogicalUnion::requireFlatExpression and getGroupsPosToFlatten used positional indexing into getExpressionsInScope(). When an arm had duplicate projections, this read past the end of the vector, yielding a garbage Expression pointer and then a SIGSEGV.
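The failure mode can be shown safely with checked access. In standard C++, `std::vector::operator[]` with an index at or past `size()` is undefined behaviour, while `std::vector::at` throws `std::out_of_range`. The sketch below uses `at` so the bad read becomes observable instead of crashing; `exprAt` and `readFails` are hypothetical helpers, and strings again stand in for expressions.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Checked positional read. With operator[] the same out-of-range index
// would be undefined behaviour rather than an exception.
inline const std::string& exprAt(const std::vector<std::string>& exprs,
                                 std::size_t pos) {
    return exprs.at(pos);
}

// Hypothetical helper: reports whether reading position `pos` fails.
inline bool readFails(const std::vector<std::string>& exprs, std::size_t pos) {
    try {
        (void)exprAt(exprs, pos);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Usage: the deduplicated list of a two-column arm has one entry, so
// position 1 (valid for the union arity) is out of range for the list.
inline void demoOutOfBounds() {
    const std::vector<std::string> dedupedInScope = {"b.age"};
    const std::size_t unionArity = 2;
    assert(dedupedInScope.size() < unionArity);
    assert(!readFails(dedupedInScope, 0));
    assert(readFails(dedupedInScope, unionArity - 1));
}
```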

Fix: store each child's non-deduplicated projection list in LogicalUnion (childProjections) and use it for positional access instead of the deduplicated schema in-scope list. childProjections is captured before appending flattens, and is preserved across copy().
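The shape of this fix can be sketched as follows. This is an illustrative model only: `ToyUnion`, `projectionAt` and `arity` are hypothetical names, and the real LogicalUnion holds Expression objects and child operators rather than strings. The point is that each child's projection list is stored as given, duplicates included, and is carried over by `copy()`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using ExprList = std::vector<std::string>;

// Hypothetical model of a union operator that keeps each child's
// non-deduplicated projection list for positional access.
class ToyUnion {
public:
    // The lists are captured up front, before any later planning step
    // could rewrite the children.
    explicit ToyUnion(std::vector<ExprList> childProjections)
        : childProjections_(std::move(childProjections)) {}

    // Positional access goes through the stored list, so every position
    // below the union arity is valid even with duplicate projections.
    const std::string& projectionAt(std::size_t childIdx, std::size_t pos) const {
        return childProjections_.at(childIdx).at(pos);
    }

    std::size_t arity(std::size_t childIdx) const {
        return childProjections_.at(childIdx).size();
    }

    // copy() carries the stored projection lists over to the new operator.
    std::unique_ptr<ToyUnion> copy() const {
        return std::make_unique<ToyUnion>(childProjections_);
    }

private:
    std::vector<ExprList> childProjections_;
};

// Usage: the duplicated column stays addressable, also after copy().
inline void demoChildProjections() {
    const ExprList arm0 = {"b.age", "b.age"};
    const ExprList arm1 = {"a.x", "a.y"};
    const ToyUnion u(std::vector<ExprList>{arm0, arm1});
    assert(u.arity(0) == 2);
    assert(u.projectionAt(0, 1) == "b.age");
    const auto copied = u.copy();
    assert(copied->arity(0) == 2);
    assert(copied->projectionAt(0, 1) == "b.age");
}
```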

The physical mapper (map_union.cpp) also used the deduplicated in-scope list to build the factorized table for each child, producing a table with fewer columns than the union arity. This caused a column index mismatch during the union scan. Fix: use childProjections instead, so the table has the full arity, with duplicate columns where needed.
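One way to picture a full-arity table is to map every projected column back to the deduplicated expression that supplies its values. The sketch below is an assumption about how duplicates could be materialized, not the code in map_union.cpp: `columnSources` is a hypothetical helper and strings stand in for expressions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

using ExprList = std::vector<std::string>;

// For each projected expression, find the position of the matching entry
// in the deduplicated in-scope list. The result has one entry per
// projection, so a repeated projection yields a repeated source column.
inline std::vector<std::size_t> columnSources(const ExprList& childProjections,
                                              const ExprList& inScope) {
    std::vector<std::size_t> sources;
    sources.reserve(childProjections.size());
    for (const auto& expr : childProjections) {
        const auto it = std::find(inScope.begin(), inScope.end(), expr);
        if (it == inScope.end()) {
            throw std::invalid_argument("projection not in scope: " + expr);
        }
        sources.push_back(static_cast<std::size_t>(it - inScope.begin()));
    }
    return sources;
}

// Usage: three projected columns are fed from two distinct expressions.
inline void demoFullArityTable() {
    const ExprList projections = {"b.age", "b.age", "b.name"};
    const ExprList inScope = {"b.age", "b.name"};
    const std::vector<std::size_t> sources = columnSources(projections, inScope);
    const std::vector<std::size_t> expected = {0, 0, 1};
    assert(sources.size() == projections.size());
    assert(sources == expected);
}
```

The table built from `sources` has as many columns as the union arity, so column indices used by the union scan line up across arms.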

The two fixes must land together: fixing only the planner would still leave the physical mapper crashing (out-of-bounds table column read), and fixing only the mapper would still leave the planner crashing.

@adsharma merged commit 26be67f into main on Jun 26, 2026
4 checks passed
@adsharma deleted the ladybug-620 branch on June 26, 2026 at 00:57
Development

Successfully merging this pull request may close these issues.

SIGSEGV: out-of-bounds in LogicalUnion::requireFlatExpression when a UNION arm projects a duplicated property column