Search before asking
Motivation
For partitioned tables with Paimon lake tiering enabled, downstream batch jobs need a signal that a partition's data is "ready" (i.e., fully tiered). This is the Mark Done mechanism: when a partition has been idle (no new data tiered) for a configurable duration, Fluss should execute Paimon's mark-done actions (e.g., write a _SUCCESS file, notify the metastore) so that downstream schedulers can safely begin processing.
Currently, Fluss tiering operates at the table level, with no partition-level idle tracking or mark-done capability. Paimon already has a complete mark-done action framework (PartitionMarkDoneAction), but its trigger mechanism is tightly coupled to the Flink checkpoint lifecycle, which doesn't fit Fluss's tiering model.
Solution
Extend the Fluss offset file (V2) with:
partition_tiered_times: map of partitionId -> epoch millis of last tiering commit (wall-clock time, unaffected by compaction)
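To illustrate how partition_tiered_times could drive the idle check, here is a minimal sketch in Java. The class PartitionIdleCheck, the method findIdlePartitions, and the idleMillis parameter are hypothetical names used only for illustration; they are not existing Fluss or Paimon APIs. The sketch only selects candidate partitions and does not invoke any Paimon mark-done action.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Fluss or Paimon.
final class PartitionIdleCheck {
    private PartitionIdleCheck() {}

    /**
     * Returns the ids of partitions whose last tiering commit is at least
     * idleMillis old, sorted ascending.
     *
     * @param partitionTieredTimes partitionId -> epoch millis of last tiering commit
     * @param nowMillis current wall-clock time in epoch millis
     * @param idleMillis assumed configurable idle duration before mark-done
     */
    static List<Long> findIdlePartitions(
            Map<Long, Long> partitionTieredTimes, long nowMillis, long idleMillis) {
        List<Long> idle = new ArrayList<>();
        for (Map.Entry<Long, Long> entry : partitionTieredTimes.entrySet()) {
            // A partition is idle when no tiering commit happened within the window.
            if (nowMillis - entry.getValue() >= idleMillis) {
                idle.add(entry.getKey());
            }
        }
        Collections.sort(idle);
        return idle;
    }
}
```

Because the map stores wall-clock commit times rather than data timestamps, the check needs nothing beyond the offset file and the current time. A real implementation would also have to remember which partitions were already marked done, which this sketch leaves out.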
Anything else?
No response
Willingness to contribute