Integrate window function optimization rules into IoTDB by Sh-Zh-7 · Pull Request #16953 · apache/iotdb

Sh-Zh-7 · 2025-12-25T19:21:06Z

Description

This PR introduces the following optimization rules:

PruneWindowColumns
RemoveRedundantWindow
PruneOrderByInWindowAggregation
GatherAndMergeWindows
ReplaceWindowWithRowNumber
PushdownLimitIntoWindow
PushdownFilterIntoWindow

It also adds the corresponding plan nodes and operators.

codecov · 2025-12-25T20:22:47Z

Codecov Report

❌ Patch coverage is 18.63636% with 1253 lines in your changes missing coverage. Please review.
✅ Project coverage is 39.19%. Comparing base (9abac5c) to head (8b9a4c6).
⚠️ Report is 114 commits behind head on master.

Files with missing lines	Patch %	Lines
...tion/operator/GroupedTopNRowNumberAccumulator.java	0.00%	182 Missing ⚠️
...execution/operator/RowReferenceTsBlockManager.java	0.00%	153 Missing ⚠️
...n/operator/process/window/TopKRankingOperator.java	0.00%	91 Missing ⚠️
...l/aggregation/grouped/array/IntArrayFIFOQueue.java	0.00%	90 Missing ⚠️
...ngine/plan/relational/planner/node/ValuesNode.java	0.00%	89 Missing ⚠️
...gregation/grouped/array/LongBigArrayFIFOQueue.java	0.00%	81 Missing ⚠️
...xecution/operator/GroupedTopNRowNumberBuilder.java	0.00%	78 Missing ⚠️
...eryengine/plan/planner/TableOperatorGenerator.java	0.00%	78 Missing ⚠️
...ion/operator/process/window/RowNumberOperator.java	0.00%	76 Missing ⚠️
.../planner/iterative/rule/GatherAndMergeWindows.java	51.04%	70 Missing ⚠️
... and 17 more

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #16953      +/-   ##
============================================
+ Coverage     39.02%   39.19%   +0.16%     
- Complexity      207      282      +75     
============================================
  Files          5021     5125     +104     
  Lines        333377   342995    +9618     
  Branches      42431    43747    +1316     
============================================
+ Hits         130110   134442    +4332     
- Misses       203267   208553    +5286


Copilot

Pull request overview

This PR integrates window function optimization rules into IoTDB, adding support for optimizing window functions through specialized plan nodes (TopKRankingNode, RowNumberNode, ValuesNode) and corresponding operators, along with optimization rules to transform and optimize window operations.

Changes:

Added new plan nodes: TopKRankingNode, RowNumberNode, and ValuesNode for specialized window operations
Implemented optimization rules: PruneWindowColumns, RemoveRedundantWindow, GatherAndMergeWindows, ReplaceWindowWithRowNumber, PushDownLimitIntoWindow, PushDownFilterIntoWindow
Added operators: TopKRankingOperator, RowNumberOperator, ValuesOperator with supporting data structures
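
The overview above does not spell out what a top-k ranking operator computes. As a rough illustration only, assuming the usual pattern where a filter such as `row_number() <= k` over a partitioned, ordered window is answered by keeping at most k rows per partition, the idea can be sketched with one bounded heap per partition. The class `TopKPerPartitionSketch` and its method are hypothetical and are not taken from the PR.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical sketch: keep the k smallest sort values per partition key.
final class TopKPerPartitionSketch {

  static Map<String, List<Integer>> topK(
      List<String> partitionKeys, List<Integer> sortValues, int k) {
    // One max-heap per partition, so the largest retained value is evicted first.
    Map<String, PriorityQueue<Integer>> heaps = new HashMap<>();
    for (int i = 0; i < partitionKeys.size(); i++) {
      PriorityQueue<Integer> heap =
          heaps.computeIfAbsent(
              partitionKeys.get(i),
              key -> new PriorityQueue<Integer>(Collections.<Integer>reverseOrder()));
      heap.add(sortValues.get(i));
      if (heap.size() > k) {
        heap.poll(); // drop the row that can no longer rank within the first k
      }
    }
    // Emit each partition's retained rows in ascending order.
    Map<String, List<Integer>> result = new TreeMap<>();
    for (Map.Entry<String, PriorityQueue<Integer>> entry : heaps.entrySet()) {
      List<Integer> rows = new ArrayList<>(entry.getValue());
      Collections.sort(rows);
      result.put(entry.getKey(), rows);
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("d1", "d1", "d1", "d2", "d2");
    List<Integer> values = Arrays.asList(5, 1, 3, 9, 2);
    System.out.println(topK(keys, values, 2));
  }
}
```

The point of the bounded heap is that memory per partition stays proportional to k rather than to the partition size, which is presumably why a dedicated node is preferable to a full window followed by a filter.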

Reviewed changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 9 comments.


File	Description
HeapTraversal.java	Utility for navigating binary heap structures
TopKRankingNode.java	Plan node for top-k ranking operations
RowNumberNode.java	Plan node for row numbering operations
ValuesNode.java	Plan node for constant value operations
RemoveRedundantWindow.java	Rule to remove empty window operations
ReplaceWindowWithRowNumber.java	Rule to replace window with row number (incomplete)
PushDownLimitIntoWindow.java	Rule to push limit into window operations
PushDownFilterIntoWindow.java	Rule to push filter into window operations
GatherAndMergeWindows.java	Rule to merge adjacent window operations
TopKRankingOperator.java	Operator for executing top-k ranking
RowNumberOperator.java	Operator for computing row numbers
ValuesOperator.java	Operator for constant values
Supporting data structures	NoChannelGroupByHash, FIFO queues, grouped TopN builders


Copilot · 2026-01-13T04:15:34Z

.../main/java/org/apache/iotdb/db/queryengine/plan/relational/planner/node/TopKRankingNode.java

+  public List<Symbol> getOutputSymbols() {
+    return Collections.singletonList(rankingSymbol);
+  }


The method getOutputSymbols returns only the ranking symbol, but it should return all output symbols including those from the child node. This inconsistency with other node implementations (like RowNumberNode which properly handles output symbols) will cause incorrect query planning.

Copilot · 2026-01-13T04:15:34Z

...rc/main/java/org/apache/iotdb/db/queryengine/plan/relational/planner/node/RowNumberNode.java

+
+  @Override
+  public List<Symbol> getOutputSymbols() {
+    return Collections.singletonList(rowNumberSymbol);


The method getOutputSymbols returns only the row number symbol, but it should return all output symbols including those from the child node. This is inconsistent with how other operators handle output symbols and will cause query planning errors.

Suggested change

-    return Collections.singletonList(rowNumberSymbol);
+    return ImmutableList.<Symbol>builder()
+        .addAll(getChild().getOutputSymbols())
+        .add(rowNumberSymbol)
+        .build();
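
To make the intent of this suggestion concrete, here is a minimal, dependency-free sketch of the same pass-through behaviour. It assumes symbols can be modelled as plain strings and uses a hypothetical class `RowNumberOutputSketch` in place of the real plan node, so it illustrates the contract rather than the actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a row-number plan node; symbols are plain strings here.
final class RowNumberOutputSketch {
  private final List<String> childOutputSymbols;
  private final String rowNumberSymbol;

  RowNumberOutputSketch(List<String> childOutputSymbols, String rowNumberSymbol) {
    this.childOutputSymbols = new ArrayList<>(childOutputSymbols);
    this.rowNumberSymbol = rowNumberSymbol;
  }

  // Pass through every child column, then append the generated row-number column.
  List<String> getOutputSymbols() {
    List<String> output = new ArrayList<>(childOutputSymbols);
    output.add(rowNumberSymbol);
    return Collections.unmodifiableList(output);
  }

  public static void main(String[] args) {
    RowNumberOutputSketch node =
        new RowNumberOutputSketch(Arrays.asList("time", "device", "temperature"), "row_number");
    System.out.println(node.getOutputSymbols());
  }
}
```

A parent node that projects `temperature` can only resolve that symbol if the child columns are part of the output, which is the planning problem the comment describes.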

Copilot · 2026-01-13T04:15:34Z

...ache/iotdb/db/queryengine/plan/relational/planner/optimizations/UnaliasSymbolReferences.java

+      Map<Symbol, Symbol> mapping = new HashMap<>(rewrittenSource.getMappings());
+      SymbolMapper mapper = symbolMapper(mapping);
+
+      TopKRankingNode rewrittenTopNRanking = mapper.map(node, rewrittenSource.getRoot());


The variable name 'rewrittenTopNRanking' (line 641) is inconsistent with the node type TopKRankingNode. The name should be 'rewrittenTopKRanking' to match the actual class name and maintain naming consistency.

Copilot · 2026-01-13T04:15:35Z

...anode/src/main/java/org/apache/iotdb/db/queryengine/plan/planner/plan/node/PlanNodeType.java

      case 1036:
        return ExceptNode.deserialize(buffer);
+      case 1037:
+        return TopKNode.deserialize(buffer);


The deserialization case for TABLE_TOPK_RANKING_NODE (1037) is calling TopKNode.deserialize(buffer) instead of TopKRankingNode.deserialize(buffer). This will cause runtime errors when deserializing TopKRankingNode instances.
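
A round-trip check is one way to catch this kind of mismatch. The sketch below is hypothetical: `NodeTypeDispatchSketch` and `TopKRankingStub` are illustrative stand-ins, and only the type id 1037 comes from the diff. It shows a dispatch switch whose case routes to the deserializer of the same class that wrote the bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical miniature of a type-id dispatch; only the id 1037 is taken from the diff.
final class NodeTypeDispatchSketch {
  static final short TOPK_RANKING_TYPE_ID = 1037;

  interface Node {}

  static final class TopKRankingStub implements Node {
    final int maxRowCountPerPartition;

    TopKRankingStub(int maxRowCountPerPartition) {
      this.maxRowCountPerPartition = maxRowCountPerPartition;
    }

    // Write the type id first, then the payload.
    void serialize(ByteBuffer buffer) {
      buffer.putShort(TOPK_RANKING_TYPE_ID);
      buffer.putInt(maxRowCountPerPartition);
    }

    // Read the payload only; the dispatcher has already consumed the type id.
    static TopKRankingStub deserialize(ByteBuffer buffer) {
      return new TopKRankingStub(buffer.getInt());
    }
  }

  static Node deserialize(ByteBuffer buffer) {
    short typeId = buffer.getShort();
    switch (typeId) {
      case TOPK_RANKING_TYPE_ID:
        // The case must call the deserializer of the class registered under this id.
        return TopKRankingStub.deserialize(buffer);
      default:
        throw new IllegalArgumentException("Unknown node type id: " + typeId);
    }
  }

  public static void main(String[] args) {
    ByteBuffer buffer = ByteBuffer.allocate(16);
    new TopKRankingStub(3).serialize(buffer);
    buffer.flip();
    System.out.println(deserialize(buffer).getClass().getSimpleName());
  }
}
```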

Copilot · 2026-01-13T04:15:35Z

.../iotdb/db/queryengine/plan/relational/planner/iterative/rule/ReplaceWindowWithRowNumber.java

+
+  @Override
+  public Result apply(WindowNode node, Captures captures, Context context) {
+    return null;


The apply method returns null unconditionally. This rule will never perform any transformation, making it ineffective. The method should implement the actual transformation logic to replace the WindowNode with a RowNumberNode.

Suggested change

-    return null;
+    return Result.empty();
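
Why an empty result is safer than null can be sketched as follows. The `Result`, `Rule`, and `applyOnce` names below are hypothetical simplifications, not the planner's real types, and the sketch assumes the caller dereferences whatever the rule returns.

```java
import java.util.Optional;

// Hypothetical miniature of a rule and its driver; these are not the planner's real classes.
final class RuleResultSketch {

  static final class Result {
    private final Optional<String> transformedPlan;

    private Result(Optional<String> transformedPlan) {
      this.transformedPlan = transformedPlan;
    }

    // Signals "rule did not fire" without handing the caller a null.
    static Result empty() {
      return new Result(Optional.empty());
    }

    static Result ofPlan(String plan) {
      return new Result(Optional.of(plan));
    }

    boolean isEmpty() {
      return !transformedPlan.isPresent();
    }

    String plan() {
      return transformedPlan.get();
    }
  }

  interface Rule {
    Result apply(String plan);
  }

  // Assumed driver behaviour: keep the original plan when the rule reports no change.
  static String applyOnce(Rule rule, String plan) {
    Result result = rule.apply(plan);
    return result.isEmpty() ? plan : result.plan();
  }

  public static void main(String[] args) {
    System.out.println(applyOnce(plan -> Result.empty(), "window"));
    System.out.println(applyOnce(plan -> Result.ofPlan("row_number"), "window"));
  }
}
```

Under that assumption, a rule returning null fails with a NullPointerException in the driver instead of being skipped. Note that returning an empty result only makes the rule a safe no-op; the transformation itself still has to be implemented.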

Copilot · 2026-01-13T04:15:35Z

...ava/org/apache/iotdb/db/queryengine/execution/operator/process/window/RowNumberOperator.java

+    return result;
+  }
+
+  private void processRow(TsBlock tsBlock, int position, long rowNumber) {


The method processRow accepts three parameters (TsBlock tsBlock, int position, long rowNumber) but is being called with (tsBlock, partitionId, rowCount + 1) at line 121. The second argument should be 'position', not 'partitionId'. This will cause incorrect column access and likely runtime errors.
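
The distinction between the two indices can be illustrated with a small sketch. Everything here is hypothetical (arrays stand in for a TsBlock column and for the per-row partition ids): `position` indexes the row inside the block, while `partitionId` only selects which per-partition counter to advance.

```java
import java.util.Arrays;

// Hypothetical sketch: position indexes rows in the block, partitionId selects a counter.
final class PositionVsPartitionSketch {

  static String[] labelRows(String[] values, int[] partitionIds, int partitionCount) {
    long[] rowCounts = new long[partitionCount];
    String[] labelled = new String[values.length];
    for (int position = 0; position < values.length; position++) {
      int partitionId = partitionIds[position];
      long rowNumber = ++rowCounts[partitionId];
      // Read the column by position; using partitionId here would read the wrong row.
      labelled[position] = values[position] + "#" + rowNumber;
    }
    return labelled;
  }

  public static void main(String[] args) {
    String[] values = {"a", "b", "c", "d"};
    int[] partitionIds = {0, 1, 0, 1};
    System.out.println(Arrays.toString(labelRows(values, partitionIds, 2)));
  }
}
```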

Copilot · 2026-01-13T04:15:36Z

...ava/org/apache/iotdb/db/queryengine/execution/operator/process/window/RowNumberOperator.java

+
+  private void processRow(TsBlock tsBlock, int position, long rowNumber) {
+    // Check max rows per partition limit
+    if (maxRowsPerPartition.isPresent() && rowNumber >= maxRowsPerPartition.get()) {


The condition checks if rowNumber >= maxRowsPerPartition, but it should check rowNumber > maxRowsPerPartition. With the current logic, when rowNumber equals maxRowsPerPartition (which is the maximum allowed), the row is incorrectly skipped. For example, if maxRowsPerPartition is 5, row 5 will be skipped even though rows 1-5 should be included.

Suggested change

-    if (maxRowsPerPartition.isPresent() && rowNumber >= maxRowsPerPartition.get()) {
+    if (maxRowsPerPartition.isPresent() && rowNumber > maxRowsPerPartition.get()) {
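
The boundary case is easy to verify in isolation. The following sketch uses a hypothetical helper, `keptRowNumbers`, with 1-based row numbers as in the comment's example, and applies the corrected `>` comparison.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper: which 1-based row numbers survive a per-partition row limit.
final class RowLimitSketch {

  static List<Long> keptRowNumbers(int rowsInPartition, Optional<Integer> maxRowsPerPartition) {
    List<Long> kept = new ArrayList<>();
    for (long rowNumber = 1; rowNumber <= rowsInPartition; rowNumber++) {
      // Skip only rows strictly beyond the limit, so row number == limit is kept.
      if (maxRowsPerPartition.isPresent() && rowNumber > maxRowsPerPartition.get()) {
        break;
      }
      kept.add(rowNumber);
    }
    return kept;
  }

  public static void main(String[] args) {
    System.out.println(keptRowNumbers(8, Optional.of(5)));
    System.out.println(keptRowNumbers(3, Optional.empty()));
  }
}
```

With a limit of 5 and 8 input rows, rows 1 through 5 are kept. The original `>=` comparison would stop at row 4.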

Copilot · 2026-01-13T04:15:36Z

...a/org/apache/iotdb/db/queryengine/execution/operator/process/window/TopKRankingOperator.java

+      boolean generateRanking,
+      Optional<Integer> hashChannel,
+      int expectedPositions,
+      Optional<Long> maxPartialMemory) {
+    this.operatorContext = operatorContext;
+    this.inputOperator = inputOperator;
+    this.rankingType = rankingType;
+    this.inputTypes = inputTypes;
+    this.partitionChannels = partitionChannels;
+    this.partitionTSDataTypes = partitionTSDataTypes;
+    this.sortChannels = sortChannels;
+    this.sortOrders = sortOrders;
+    this.maxRowCountPerPartition = maxRowCountPerPartition;
+    this.partial = !generateRanking;
+    this.generateRanking = generateRanking;


The TopKRankingOperator constructor stores the 'generateRanking' parameter in a field and also derives 'partial' from it with inverted logic (partial = !generateRanking at line 105). Because the two fields always hold opposite values, this creates confusing dual state. Consider using a single boolean field with clear semantics.
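
One possible shape for the single-field version, shown as a hypothetical sketch (`RankingModeSketch` is not the real operator): store one flag and derive the other through an accessor, so the two can never disagree.

```java
// Hypothetical sketch: one stored flag, with the inverse derived on demand.
final class RankingModeSketch {
  private final boolean generateRanking;

  RankingModeSketch(boolean generateRanking) {
    this.generateRanking = generateRanking;
  }

  boolean isGenerateRanking() {
    return generateRanking;
  }

  // Partial mode is defined as "does not generate the ranking column".
  boolean isPartial() {
    return !generateRanking;
  }

  public static void main(String[] args) {
    RankingModeSketch finalStage = new RankingModeSketch(true);
    System.out.println(finalStage.isGenerateRanking() + " " + finalStage.isPartial());
  }
}
```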

Copilot · 2026-01-13T04:15:36Z

...rc/main/java/org/apache/iotdb/db/queryengine/plan/relational/planner/node/RowNumberNode.java

+    return new RowNumberNode(
+        getPlanNodeId(), partitionBy, orderSensitive, rowNumberSymbol, maxRowCountPerPartition);
+  }
+


This method overrides PlanNode.accept; it is advisable to add an Override annotation.

Suggested change

+  @Override

sonarqubecloud · 2026-01-23T03:13:17Z

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)


Finish all rules, nodes and operators.

fa450f1

Sh-Zh-7 marked this pull request as draft December 25, 2025 19:21

Sh-Zh-7 added 2 commits December 31, 2025 01:16

Simple format and fix UT bugs.

db89602

mvn spotless apply.

3457e8f

Sh-Zh-7 marked this pull request as ready for review January 5, 2026 04:56

Sh-Zh-7 added 3 commits January 7, 2026 03:23

Add new nodes and ops to logical and distributed plan.

c66f2dd

mvn spotless apply.

c98a511

mvn spotless apply again.

e7a4b17

JackieTien97 requested a review from Copilot January 13, 2026 04:06

Copilot started reviewing on behalf of JackieTien97 January 13, 2026 04:06

JackieTien97 approved these changes Jan 13, 2026

Copilot AI reviewed Jan 13, 2026

Sh-Zh-7 added 11 commits January 21, 2026 02:08

Accept copilot's suggestions.

1ff5472

mvn spotless apply.

416daa2

Make my code almost right.

f597c7d

mvn spotless apply.

ffdb70e

Add IT for window function optimization.

ad7a831

mvn spotless::apply -P with-integration-tests

fc8e721

Add UT prototype.

16b5ef9

mvn spotless apply.

024507c

Fix UT bugs.

eb8fc93

Add license header.

8e73059

Add license header to remaining files.

8b9a4c6

JackieTien97 merged commit c39abcb into master Jan 23, 2026
28 of 31 checks passed

JackieTien97 deleted the perf/szh/window_func_optimize branch January 23, 2026 06:15


2 participants
