[Feature] Support Partition Mark Done for Fluss Tiering (Paimon Lake) #3314

@beryllw

Description

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

For partitioned tables with Paimon lake tiering enabled, downstream batch jobs need a signal that a partition's data is "ready" (i.e., fully tiered). This is the Mark Done mechanism: when a partition has been idle (no new data tiered) for a configurable duration, Fluss should execute Paimon's mark-done actions (e.g., write a _SUCCESS file, notify the metastore) so that downstream schedulers can safely begin processing.

Currently, Fluss tiering operates at the table level, with no partition-level idle tracking or mark-done capability. Paimon already has a complete mark-done action framework (PartitionMarkDoneAction), but its trigger mechanism is tightly coupled to the Flink checkpoint lifecycle, which doesn't fit Fluss's tiering model.

Solution

Extend the Fluss offset file (V2) with:

  • partition_tiered_times: map of partitionId -> epoch millis of last tiering commit (wall-clock time, unaffected by compaction)
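
To illustrate how the idle check could work on top of partition_tiered_times, here is a minimal sketch. It assumes the map has already been read from the offset file into memory; the class name PartitionIdleCheck, the method findIdlePartitions, and the idleMillis parameter are all hypothetical and not existing Fluss or Paimon APIs. The real implementation would presumably hand the resulting partitions to Paimon's mark-done actions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Fluss or Paimon. */
public class PartitionIdleCheck {

    /**
     * Returns the IDs of partitions whose last tiering commit is at least
     * idleMillis older than nowMillis, sorted ascending.
     *
     * @param partitionTieredTimes partitionId -> epoch millis of last tiering commit
     * @param nowMillis current wall-clock time in epoch millis
     * @param idleMillis assumed configurable idle duration before mark-done
     */
    public static List<Long> findIdlePartitions(
            Map<Long, Long> partitionTieredTimes, long nowMillis, long idleMillis) {
        List<Long> idle = new ArrayList<>();
        for (Map.Entry<Long, Long> entry : partitionTieredTimes.entrySet()) {
            // A partition counts as idle once no tiering commit has landed
            // for at least idleMillis.
            if (nowMillis - entry.getValue() >= idleMillis) {
                idle.add(entry.getKey());
            }
        }
        Collections.sort(idle);
        return idle;
    }

    public static void main(String[] args) {
        Map<Long, Long> tieredTimes = new HashMap<>();
        tieredTimes.put(1L, 1_000L); // last tiered 9 seconds before "now"
        tieredTimes.put(2L, 9_500L); // last tiered 0.5 seconds before "now"

        // With a 5 second idle threshold, only partition 1 qualifies.
        System.out.println(findIdlePartitions(tieredTimes, 10_000L, 5_000L)); // prints [1]
    }
}
```

Passing nowMillis in explicitly, rather than reading the clock inside the method, keeps the check deterministic and easy to test. The sketch does not track which partitions have already been marked done, so a real implementation would need some way to avoid repeating the action.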

Anything else?

No response

Willingness to contribute

  • I'm willing to submit a PR!
