@MoralCode MoralCode commented Jan 19, 2026

This avoids some race conditions with the startup process that could create issues, especially on first initialization as noted in #3548 but also observed by me (rabbit had some

Description

  • Please include a summary of the change.

This PR fixes #3548.
This PR also fixes #3612 (I think).

Thanks to @Sukuna0007Abhi and @guptapratykshh for submitting PRs on this that each addressed different parts of the issue. This PR combines both contributions, and you have both been credited in the commit message.

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

@MoralCode
Contributor Author

This is now part of our prod config.

@MoralCode MoralCode added the "ready" label (Items tested and seeking additional approvals or a merge. Usually for items under active development) Jan 21, 2026
@MoralCode MoralCode added this to the v0.93.0 milestone Jan 21, 2026
@MoralCode
Contributor Author

MoralCode commented Jan 21, 2026

Copilot analysis of failed job:

The job failed because RabbitMQ could not read its ".erlang.cookie" file due to a permission error:

Error when reading /var/lib/rabbitmq/.erlang.cookie: eacces

Theory: this seems to be caused by the container not running as root.
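
One way to test this theory is to compare the file's owner and mode with the user the process runs as. The following is a minimal diagnostic sketch, not something taken from this PR: `describe_access` is a hypothetical helper name, and it relies on `os.getuid`, so it assumes a Unix-like container.

```python
import os
import stat


def describe_access(path: str) -> dict:
    """Summarize whether the current process can read `path`, and why not.

    Hypothetical helper for illustration only; it is not part of Augur
    or RabbitMQ.
    """
    info = {
        "path": path,
        "exists": os.path.exists(path),
        "process_uid": os.getuid(),  # user the process is running as
    }
    if not info["exists"]:
        # Either the file is missing or a parent directory is not searchable.
        return info

    st = os.stat(path)
    info["owner_uid"] = st.st_uid                  # user that owns the file
    info["mode"] = oct(stat.S_IMODE(st.st_mode))   # permission bits, e.g. '0o600'
    info["readable"] = os.access(path, os.R_OK)    # can this process read it?
    return info
```

Running `describe_access("/var/lib/rabbitmq/.erlang.cookie")` inside the container as the same user as RabbitMQ would show whether `owner_uid` differs from `process_uid`, which is the situation the theory above describes.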

Member

@sgoggins sgoggins left a comment

LGTM

…efore augur starts.

This avoids some race conditions with the startup process that could create issues, especially on first initialization.
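
The general idea of waiting for dependencies before starting the application can be sketched as follows. This is an illustration only and not the mechanism used in this PR: `wait_for_port` is a hypothetical helper, and a successful TCP connection is a weaker signal than a real health check, since a service can accept connections before it is fully initialized.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection could be made before `timeout`
    seconds have elapsed. Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt is bounded by `interval` so one hung connect
            # cannot overrun the overall deadline by much.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A startup script could call, for example, `wait_for_port("localhost", 5432)` for the database and `wait_for_port("localhost", 5672)` for the message broker before launching the application. Those host names and ports are assumptions based on the services' usual defaults, not values taken from this PR.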

Co-Authored-By: guptapratykshh <[email protected]>
Co-Authored-By: Sukuna0007Abhi <[email protected]>
Signed-off-by: Adrian Edwards <[email protected]>
@sgoggins sgoggins force-pushed the fix/race-conditions branch from 988d8a6 to b346bba Compare January 21, 2026 17:29
@sgoggins sgoggins self-assigned this Jan 21, 2026
@sgoggins
Member

@MoralCode: Do we think we should solve this cookie error? It seems like it needs to be fixed for the Docker deployments to remain stable.

@sgoggins
Member

@MoralCode: I can see that running as root would likely fix the Podman issue, but since we don't care about RabbitMQ persisting its queues across startups for Augur, I think we could configure it to use the /tmp directory instead.

@MoralCode
Contributor Author

If we just want to merge the Postgres health check and gitconfig error fix, we should reopen #3559, since that's what that PR does.

I have not looked into this CI rabbit issue, so maybe the rabbit health check should be a separate PR.

@MoralCode MoralCode marked this pull request as draft January 23, 2026 17:57
@sgoggins sgoggins added the "waiting" label (This change is waiting for some other changes to land first) and removed the "ready" label (Items tested and seeking additional approvals or a merge. Usually for items under active development) Jan 26, 2026
@sgoggins sgoggins added the "add-feature" label (Adds new features) Feb 2, 2026
@MoralCode MoralCode added the "redundant" label (PR is submitted in parallel with another mutually exclusive PR) Feb 3, 2026
Labels

  • add-feature: Adds new features
  • redundant: PR is submitted in parallel with another mutually exclusive PR
  • waiting: This change is waiting for some other changes to land first

Development

Successfully merging this pull request may close these issues.

  • Workers randomly die for some reason
  • Augur db somehow gets patially intialized

2 participants