[flink] Fix COUNT(column) aggregate pushdown to reject nullable columns by beryllw · Pull Request #3271 · apache/fluss

beryllw · 2026-05-08T07:58:37Z

Purpose

Linked issue: close #3270

Fixes the aggregate pushdown logic to correctly handle COUNT(column) on nullable columns.

Brief change log

When the aggregate is CountAggFunction with a nullable column argument, the pushdown is rejected so that Flink handles the NULL-excluding count correctly.
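To illustrate why a plain row count cannot stand in for `COUNT(column)` on a nullable column, here is a minimal, self-contained sketch. It is not code from this PR: the class `CountSemantics`, its methods, and the sample `address` values are hypothetical, and a Java `null` stands in for SQL NULL.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Illustrative only: models one table column as a list, with null standing for SQL NULL.
final class CountSemantics {

    // COUNT(*) counts every row, regardless of the column's values.
    static long countStar(List<String> column) {
        return column.size();
    }

    // COUNT(column) counts only the rows where the column is not NULL.
    static long countColumn(List<String> column) {
        return column.stream().filter(Objects::nonNull).count();
    }

    public static void main(String[] args) {
        // Hypothetical "address" values; two of the four rows are NULL.
        List<String> address = Arrays.asList("Beijing", null, "Hangzhou", null);
        System.out.println("COUNT(*)       = " + countStar(address));
        System.out.println("COUNT(address) = " + countColumn(address));
    }
}
```

With this sample data the two counts are 4 and 2, so answering `COUNT(address)` from a row count would overcount by the number of NULL rows.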

Tests

FlinkTableSourceBatchITCase#testCountPushDownForPkTable
FlinkTableSourceBatchITCase#testCountPushDownForLogTable

API and Format

Documentation

beryllw · 2026-05-08T08:27:25Z

@luoyuxia cc

Copilot

Pull request overview

This PR fixes Flink aggregate pushdown in FlinkTableSource so that COUNT(column) is only pushed down to Fluss’ row-count API when the argument cannot be NULL (avoiding incorrect results for nullable columns, per #3270).

Changes:

- Update aggregate pushdown logic to reject COUNT(expr) when the COUNT argument’s type is nullable.
- Extend batch IT coverage to exercise COUNT(id) pushdown and to ensure COUNT(address) on a nullable column is not pushed down.
- Modify log-table test data to include NULLs in the address column.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/source/FlinkTableSource.java` | Tightens COUNT aggregate pushdown to avoid pushing down nullable-argument `COUNT(expr)` as a row-count. |
| `fluss-flink/fluss-flink-common/src/test/java/org/apache/fluss/flink/source/FlinkTableSourceBatchITCase.java` | Adds/extends IT assertions around COUNT pushdown behavior and adjusts log-table test data to include NULL addresses. |


+        // test COUNT(column) with NULL values - should NOT push down for nullable columns
+        // This will fail because log table doesn't support full scan in batch mode
+        assertThatThrownBy(
+                        () ->
+                                tEnv.explainSql(
+                                        String.format("SELECT COUNT(address) FROM %s", tableName)))
+                .hasMessageContaining(
+                        "Currently, Fluss only support queries on table with datalake enabled or point queries on primary key when it's in batch execution mode.");
+        assertThatThrownBy(
+                        () ->
+                                tEnv.explainSql(
+                                        String.format(
+                                                "SELECT COUNT(DISTINCT address) FROM %s",
+                                                tableName)))
+                .hasMessageContaining(
+                        "Currently, Fluss only support queries on table with datalake enabled or point queries on primary key when it's in batch execution mode.");


+        // For COUNT(column), reject if column is nullable (cannot handle NULL filtering)
+        if (isCountAgg) {
+            List<org.apache.flink.table.expressions.Expression> args = aggExpr.getChildren();
+            if (!args.isEmpty() && args.get(0) instanceof ResolvedExpression) {
+                ResolvedExpression arg = (ResolvedExpression) args.get(0);
+                if (arg.getOutputDataType().getLogicalType().isNullable()) {
+                    return false;
+                }
+            }
+        }
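The decision made by the check above can be sketched in isolation. This is a simplified model under stated assumptions, not the PR's implementation: `CountPushdownSketch`, `CountArg`, and `canUseRowCount` are hypothetical names, the nullability flag stands in for what the diff reads from `getOutputDataType().getLogicalType().isNullable()`, and `id` is assumed to be a NOT NULL column.

```java
// Simplified, hypothetical model of the check; these are not Flink or Fluss classes.
final class CountPushdownSketch {

    // Describes the single argument of a COUNT call (illustrative type).
    static final class CountArg {
        final String column;
        final boolean nullable;

        CountArg(String column, boolean nullable) {
            this.column = column;
            this.nullable = nullable;
        }
    }

    // Returns true when COUNT can be answered from the table's total row count.
    // A null argument models COUNT(*), which has no column to inspect.
    static boolean canUseRowCount(CountArg arg) {
        if (arg == null) {
            return true; // COUNT(*): every row is counted
        }
        // COUNT(col) skips NULLs, so it equals the row count only if col is NOT NULL.
        return !arg.nullable;
    }

    public static void main(String[] args) {
        System.out.println("COUNT(*):       " + canUseRowCount(null));
        // Assumption: id is declared NOT NULL, address is nullable.
        System.out.println("COUNT(id):      " + canUseRowCount(new CountArg("id", false)));
        System.out.println("COUNT(address): " + canUseRowCount(new CountArg("address", true)));
    }
}
```

Rejecting the pushdown is the conservative choice here: the planner falls back to evaluating the aggregate itself, which stays correct even when the column happens to contain no NULLs.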


luoyuxia · 2026-05-09T07:22:35Z

cc @loserwang1024

loserwang1024 · 2026-05-11T02:14:36Z

@beryllw test fails

beryllw · 2026-05-11T05:54:35Z

> @beryllw test fails

fixed

loserwang1024 · 2026-05-12T02:22:40Z

> When the aggregate is CountAggFunction with a nullable column argument, the pushdown is rejected so that Flink handles the NULL-excluding count correctly.

Is this a bug in Flink SQL? Maybe we can file a Flink JIRA for it?

beryllw · 2026-05-12T03:55:10Z

> Is this a bug in Flink SQL? Maybe we can file a Flink JIRA for it?

No, it's a bug in the Fluss Flink SQL source, not in Flink SQL itself.

[flink] Fix COUNT(column) aggregate pushdown to reject nullable columns

d1dd5ce

luoyuxia requested a review from Copilot May 8, 2026 08:45

Copilot started reviewing on behalf of luoyuxia May 8, 2026 08:46

Copilot AI reviewed May 8, 2026


fix ci

4b808dd


beryllw wants to merge 2 commits into apache:main from beryllw:flink-count-bugfix


4 participants
