@krapie
Member

@krapie krapie commented Oct 24, 2025

Verify ALB target group health before applying the final weight patch. This prevents a race condition in which the new pods are not yet healthy after the selector change, so the ALB returns 503 errors once it routes 100% of traffic to them.

Fixes: #4433

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this is a chore.
  • The title of the PR is (a) conventional with a list of types and scopes found here, (b) states what changed, and (c) suffixes the related issues number. E.g. "fix(controller): Updates such and such. Fixes #1234".
  • I've signed my commits with DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My builds are green. Try syncing with master if they are not.
  • My organization is added to USERS.md.

@github-actions
Contributor

github-actions bot commented Oct 24, 2025

Published E2E Test Results

  4 files    4 suites   3h 34m 54s ⏱️
117 tests 105 ✅  7 💤 5 ❌
478 runs  441 ✅ 28 💤 9 ❌

For more details on these failures, see this check.

Results for commit 2bb4d70.

♻️ This comment has been updated with latest results.

@github-actions
Contributor

github-actions bot commented Oct 24, 2025

Published Unit Test Results

2 351 tests   2 351 ✅  3m 2s ⏱️
  129 suites      0 💤
    1 files        0 ❌

Results for commit 2bb4d70.

♻️ This comment has been updated with latest results.

@zachaller
Collaborator

Does this only happen without ping-pong? I am generally under the impression that ALBs should always use the Rollouts ping-pong feature.

@krapie
Member Author

krapie commented Nov 8, 2025

This issue seems to occur only when ALB is used without the ping-pong strategy. Using the ping-pong strategy prevents the problem, but as noted in issue #4433, there seem to be cases where users want a fixed endpoint during and after the canary rollout.

Development

Successfully merging this pull request may close these issues.

Prevent 503s during final canary step with AWS ALB