Move QoS Guard Functionality from Celery to Kombu #2353

@Dkhodos

Background

This issue follows from Celery PR #9863, which implements a "pragmatic Celery‑side QoS guard" to prevent worker stalls and infinite loops. The PR author notes:

This is a pragmatic Celery‑side QoS guard. Longer‑term, the ideal place for this behavior would be in Kombu's QoS, with Celery surfacing the option.

Problem Statement

Celery currently implements the QoS guard logic itself to prevent the message-consumption problems listed below, but this functionality belongs in Kombu's QoS layer, where other Kombu consumers could reuse it.

Issues the QoS Guard Addresses:

  1. Worker Stalls: When prefetch is disabled or misconfigured, workers can stop processing messages entirely
  2. Infinite Loops: Certain transport/prefetch combinations cause qos.can_consume() to behave incorrectly, leading to 100% CPU usage
  3. Transport Incompatibilities: Some transports (like SQS) do not fully support traditional AMQP prefetch semantics
  4. Head-of-Line Blocking: Long-running tasks can cause starvation when prefetch holds tasks in reserve while workers sit idle

Current State

Celery PR #9863 implements a QoS guard in Celery that wraps channel.qos.can_consume to check reserved_requests against effective concurrency. This prevents workers from fetching new tasks when all execution slots are busy, solving head-of-line blocking issues.
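The wrapping described above can be sketched roughly as follows. This is a minimal illustration, not the code from PR #9863: `StubQoS`, `install_guard`, and the `unacked` counter are hypothetical stand-ins so the example runs without Kombu or Celery installed. Only the idea (wrap `can_consume` and compare the reserved requests against the effective concurrency) comes from the PR description.

```python
from typing import Callable, Collection


class StubQoS:
    """Hypothetical stand-in for a channel's QoS object; only can_consume() is modelled."""

    def __init__(self, prefetch_count: int = 0) -> None:
        self.prefetch_count = prefetch_count
        self.unacked = 0  # hypothetical counter of delivered-but-unacked messages

    def can_consume(self) -> bool:
        # A prefetch_count of 0 is treated as "no limit".
        return not self.prefetch_count or self.unacked < self.prefetch_count


def install_guard(
    qos: StubQoS,
    reserved: Collection,
    concurrency: Callable[[], int],
) -> None:
    """Wrap qos.can_consume so it also refuses while every execution slot is busy."""
    original = qos.can_consume  # keep the transport's own check

    def guarded_can_consume() -> bool:
        if len(reserved) >= concurrency():
            return False  # all slots busy: do not fetch another message
        return original()

    qos.can_consume = guarded_can_consume


# Usage: two execution slots, prefetch effectively unlimited.
reserved_requests: set = set()
qos = StubQoS(prefetch_count=0)
install_guard(qos, reserved_requests, concurrency=lambda: 2)

print(qos.can_consume())  # slots are free
reserved_requests.update({"task-1", "task-2"})
print(qos.can_consume())  # both slots are now taken
```

Passing the concurrency as a callable rather than a fixed number is a design assumption here: it lets the guard pick up pool resizes without being reinstalled.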

Proposed Solution

Move this QoS guard functionality from Celery to Kombu's QoS implementation, where it architecturally belongs. This would:

  1. Centralize QoS Logic: Put transport-aware QoS behavior in the transport layer
  2. Enable Reuse: Other Kombu consumers besides Celery can benefit from the guard
  3. Improve Maintainability: Reduce code duplication between projects
  4. Better Abstraction: Allow Celery to simply surface configuration options rather than implement guard logic

Implementation Approach

  • Add configurable QoS guard functionality to Kombu's existing QoS classes
  • Provide transport options to enable/configure the guard behavior
  • Ensure backward compatibility by keeping the guard disabled by default
  • Allow Celery to migrate from its internal guard to Kombu's implementation
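One possible shape for the points above, as a sketch only: the option keys `qos_guard` and `qos_guard_max_in_flight`, the `GuardConfig` dataclass, and the `GuardedQoS` class are all hypothetical names chosen for illustration, and no such options exist in Kombu today. The class is a self-contained stand-in rather than a subclass of a real Kombu QoS class.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class GuardConfig:
    """Hypothetical guard settings; disabled unless explicitly enabled."""

    enabled: bool = False
    max_in_flight: Optional[int] = None

    @classmethod
    def from_transport_options(cls, options: Mapping[str, Any]) -> "GuardConfig":
        # Both keys are assumptions made for this sketch.
        limit = options.get("qos_guard_max_in_flight")
        return cls(
            enabled=bool(options.get("qos_guard", False)),
            max_in_flight=None if limit is None else int(limit),
        )


class GuardedQoS:
    """Stand-in QoS whose can_consume() applies the guard only when enabled."""

    def __init__(self, prefetch_count: int = 0,
                 config: GuardConfig = GuardConfig()) -> None:
        self.prefetch_count = prefetch_count
        self.config = config
        self.in_flight = 0  # hypothetical count of messages being processed

    def can_consume(self) -> bool:
        # Existing behaviour: a prefetch_count of 0 means "no limit".
        allowed = not self.prefetch_count or self.in_flight < self.prefetch_count
        # Guard: only consulted when switched on and given a limit.
        if self.config.enabled and self.config.max_in_flight is not None:
            allowed = allowed and self.in_flight < self.config.max_in_flight
        return allowed


# With no options the guard stays off, so behaviour is unchanged.
legacy = GuardedQoS(config=GuardConfig.from_transport_options({}))

# Opting in caps the number of in-flight messages.
guarded = GuardedQoS(config=GuardConfig.from_transport_options(
    {"qos_guard": True, "qos_guard_max_in_flight": 2}))
```

If something along these lines were adopted, Celery could pass the options through its existing `broker_transport_options` setting, which Kombu receives as `transport_options` on the connection, instead of wrapping `can_consume` itself.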

Benefits

  • Better Architecture: QoS logic belongs in the transport layer, not the application layer
  • Code Reuse: Other Kombu-based applications can benefit from the guard functionality
  • Reduced Complexity: Celery can focus on task execution rather than transport-level QoS issues
  • Improved Reliability: Centralized QoS handling reduces bugs and edge cases

Related Issues

Acceptance Criteria

  • QoS guard functionality is available in Kombu's QoS classes
  • Guard behavior is configurable via transport options
  • Existing Kombu/Celery code continues to work unchanged (backward compatible)
  • Clear documentation on how to enable and configure the guard
  • Celery can eventually migrate from its internal guard to Kombu's implementation
