[flink] Fix batch fallback generating mixed split types for primary-key tables by matrixsparse · Pull Request #3296 · apache/fluss

matrixsparse · 2026-05-10T07:28:38Z

Summary

Follow-up fix for #3208.

For primary-key tables in batch mode, when no lake snapshot exists, the previous fallback logic called initPartitionedSplits() / initNonPartitionedSplits(), which internally invokes getSnapshotAndLogSplits(). This method may produce mixed split types — HybridSnapshotLogSplit for buckets with KV snapshots and LogSplit for buckets without — which the Flink connector does not support merging in batch mode.

This fix replaces the fallback path with initLogTablePartitionSplits() / getLogSplit() to generate uniform LogSplit for all buckets, avoiding the mixed split type issue.

Changes

Partitioned tables: initPartitionedSplits() → initLogTablePartitionSplits()
Non-partitioned tables: initNonPartitionedSplits() → getLogSplit(null, null)
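
To illustrate the problem described in the summary, here is a minimal sketch of the two planning strategies. It assumes a simplified model: `SplitPlanningSketch`, `SplitKind`, `Split`, and the method names below are hypothetical and are not the connector's real classes. It only shows why planning per bucket by snapshot availability can yield two split kinds, while log-only planning always yields one.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical, simplified model; not the Fluss Flink connector's real classes.
class SplitPlanningSketch {
    enum SplitKind { HYBRID_SNAPSHOT_LOG, LOG }

    static final class Split {
        final int bucket;
        final SplitKind kind;

        Split(int bucket, SplitKind kind) {
            this.bucket = bucket;
            this.kind = kind;
        }
    }

    // Models the old fallback: a bucket with a KV snapshot gets a hybrid split,
    // a bucket without one gets a plain log split.
    static List<Split> snapshotAndLogSplits(Map<Integer, Boolean> bucketHasKvSnapshot) {
        List<Split> splits = new ArrayList<>();
        for (Map.Entry<Integer, Boolean> e : bucketHasKvSnapshot.entrySet()) {
            SplitKind kind = e.getValue() ? SplitKind.HYBRID_SNAPSHOT_LOG : SplitKind.LOG;
            splits.add(new Split(e.getKey(), kind));
        }
        return splits;
    }

    // Models the new fallback: every bucket gets a log split, snapshot or not.
    static List<Split> logOnlySplits(Map<Integer, Boolean> bucketHasKvSnapshot) {
        List<Split> splits = new ArrayList<>();
        for (Integer bucket : bucketHasKvSnapshot.keySet()) {
            splits.add(new Split(bucket, SplitKind.LOG));
        }
        return splits;
    }

    // True when the plan contains more than one split kind.
    static boolean isMixed(List<Split> splits) {
        Set<SplitKind> kinds = new HashSet<>();
        for (Split s : splits) {
            kinds.add(s.kind);
        }
        return kinds.size() > 1;
    }

    // Example input: buckets 0 and 2 have a KV snapshot, bucket 1 does not.
    static Map<Integer, Boolean> exampleBuckets() {
        Map<Integer, Boolean> buckets = new TreeMap<>();
        buckets.put(0, true);
        buckets.put(1, false);
        buckets.put(2, true);
        return buckets;
    }
}
```

With the example buckets, `isMixed(snapshotAndLogSplits(...))` is true and `isMixed(logOnlySplits(...))` is false.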


matrixsparse · 2026-05-10T07:52:03Z

Hi @luoyuxia, this is a follow-up fix for the mixed split types issue you mentioned in #3208.

Could you PTAL? Thanks! cc @fresh-borzoni

fresh-borzoni

@matrixsparse Ty, LGTM in general, one comment:

This matches the Spark logic, but the scenario is somewhat dangerous. Suppose we have a big table that was never tiered to the lake, then we decide to tier it and run a batch query through this fallback. Instead of using the kv_snapshot and replaying the log on top, we read from the earliest offset, which is potentially a lot of records. So it's not very efficient and is potentially OOM-prone.

Let's file an issue about this to address separately for Spark and Flink?
cc @luoyuxia WDYT about this plan?

luoyuxia

@matrixsparse Thanks for the quick fix. I left a comment. PTAL

luoyuxia · 2026-05-12T06:25:01Z

+                                // Use log-only splits to avoid generating mixed split
+                                // types (HybridSnapshotLogSplit + LogSplit) for
+                                // primary-key tables, which is not supported.
+                                splits = this.initLogTablePartitionSplits(partitions);


Note that it'll just generate log splits without a stopping offset, which will then never stop.
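
The concern above can be sketched with a small model. Everything here is an assumption made for illustration: `StoppingOffsetSketch`, `LogSplitModel`, and `readUntilFinished` are hypothetical names, and the poll cap exists only so the demo terminates. The point is that a reader can only finish a split when the split carries a stopping offset.

```java
import java.util.OptionalLong;

// Hypothetical model of a log split; not the connector's real LogSplit class.
class StoppingOffsetSketch {
    static final class LogSplitModel {
        final long startingOffset;
        final OptionalLong stoppingOffset; // empty means "read forever"

        LogSplitModel(long startingOffset, OptionalLong stoppingOffset) {
            this.startingOffset = startingOffset;
            this.stoppingOffset = stoppingOffset;
        }

        boolean isBounded() {
            return stoppingOffset.isPresent();
        }
    }

    // Simulates reading one record per poll. Returns the number of records read
    // once the stopping offset is reached, or -1 if the reader is still running
    // after maxPolls polls (the cap stands in for "never finishes").
    static long readUntilFinished(LogSplitModel split, long maxPolls) {
        long offset = split.startingOffset;
        for (long polls = 0; ; polls++) {
            if (split.isBounded() && offset >= split.stoppingOffset.getAsLong()) {
                return offset - split.startingOffset;
            }
            if (polls == maxPolls) {
                return -1;
            }
            offset++;
        }
    }
}
```

In this model, a split created with `OptionalLong.of(5)` finishes after five records, while one created with `OptionalLong.empty()` is still running when the poll cap is hit. A batch job needs the former behavior for every split.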

[flink] Fix batch fallback generating mixed split types for primary-key tables

7ac1f8c

matrixsparse force-pushed the feature/fix-batch-fallback-mixed-splits branch from 09b5acd to 7ac1f8c Compare May 10, 2026 07:45

fresh-borzoni reviewed May 11, 2026


luoyuxia reviewed May 12, 2026

matrixsparse wants to merge 1 commit into apache:main from matrixsparse:feature/fix-batch-fallback-mixed-splits
