fix(deploy): rsync nsfw-scanner without owner/group preservation #6837
Merged
The first deploy after #6833 merged failed at the new `restart:nsfw-scanner` task because:

1. `/opt/nsfw-scanner/` is owned by `root:root` (legacy from manual bootstrap), so the `deploy` user could not write into it.
2. `rsync -a` implies `-o -g` (preserve owner/group), which fails for any user that does not have `CAP_CHOWN` (i.e. anyone non-root): `rsync: chgrp "/opt/nsfw-scanner/." failed: Operation not permitted` followed by `mkstemp ... Permission denied`.

Fix:

- Drop the `-a` flag in favour of `-rlt --delete`: the only attributes worth preserving across releases are recursion, symlinks, and mtimes. Owner/group/perms are managed by the destination directory's bootstrap state, not by rsync.
- Document the missing `chown -R deploy:deploy /opt/nsfw-scanner` bootstrap step in `NSFW-Content-Scanning.md` and in the task's leading comment so the next operator does not trip the same wire.

One-time fix on prod before the next deploy:

    chown -R deploy:deploy /opt/nsfw-scanner

The previous deploy went through `deploy:symlink`, `restart:nginx`, and `restart:php-fpm` before failing, so Catroweb itself is already running the new release; the old nsfw-scanner container is untouched and still serving on `127.0.0.1:5000` (rsync errored before it could write anything, and the container remove + compose-up steps never ran).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
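To make the two root causes and the fix concrete, here is a minimal shell sketch that checks the destination is writable before mirroring with `-rlt --delete`. The helper names `preflight_dest` and `sync_sources` and the throwaway demo directories are hypothetical, introduced only for illustration; the real task lives in `deploy.php` and may be structured differently.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical pre-flight check: fail early with a readable message when the
# destination is missing or not writable by the current user (the root:root
# situation described above).
preflight_dest() {
  local dest="$1"
  if [ ! -d "$dest" ]; then
    echo "missing: $dest" >&2
    return 1
  fi
  if [ ! -w "$dest" ]; then
    echo "not writable by $(id -un): $dest (run the one-time chown as root)" >&2
    return 1
  fi
}

# Hypothetical sync step: -r recurse, -l copy symlinks as symlinks,
# -t preserve mtimes, --delete remove files that are gone from the source.
# Because -o/-g are not passed (they are implied by -a), rsync is not asked
# to preserve owner or group on the destination.
sync_sources() {
  local src="$1" dest="$2"
  preflight_dest "$dest"
  rsync -rlt --delete "${src%/}/" "${dest%/}/"
}

# Demo against throwaway directories; the sync only runs if rsync is installed.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/src" "$demo_root/dest"
echo "print('scanner')" > "$demo_root/src/app.py"
echo "old" > "$demo_root/dest/stale.txt"
if command -v rsync >/dev/null 2>&1; then
  sync_sources "$demo_root/src" "$demo_root/dest"
fi
```

Running the check first turns the `chgrp` / `mkstemp` failures into a single actionable message, which may be easier to read in a deploy log than rsync's own errors.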
Summary

Fix-forward for the failed deploy run 25683832620 after #6833 merged.

The `restart:nsfw-scanner` task in `deploy.php` ran for the first time and failed at rsync. Two root causes:

1. `/opt/nsfw-scanner/` is owned by `root:root`, a leftover from the manual bootstrap that originally seeded the directory. The `deploy` user cannot write into it.
2. `rsync -a` implies `-o -g` (preserve owner/group), which fails for any user without `CAP_CHOWN`, exactly the situation under SSH-as-deploy.

Fix

- Switch the rsync flags from `-a --delete` to `-rlt --delete`, the `--no-owner --no-group` equivalent (`-rlt`). The only attributes worth carrying across releases are recursion, symlinks, and mtimes; ownership is managed by the bootstrap state of the destination directory, not by rsync.
- Add the `chown -R deploy:deploy /opt/nsfw-scanner` step to the one-time prerequisites in `docs/operations/NSFW-Content-Scanning.md` and the task's leading comment in `deploy.php`.

Required action on prod before the next deploy

    ssh root@95.216.224.116 'chown -R deploy:deploy /opt/nsfw-scanner'

After that, the next deploy will rsync the new sources in, force-remove the legacy `docker run` container (first-run migration code path), and run `docker compose up -d --build` for the first time. Expect ~5 min extra deploy time for the model download.

Current prod state (no user-visible breakage)

The failed deploy still completed `deploy:symlink`, `restart:nginx`, and `restart:php-fpm` before erroring, so Catroweb itself is serving the new release. The old nsfw-scanner container is untouched (rsync errored before it wrote anything, and the rm/compose-up steps never ran). Image uploads continue to hit the existing scanner on `127.0.0.1:5000` exactly as before.

Test plan

- Run the `chown` on prod
- Trigger the `Deployment` workflow → `workflow_dispatch`
- `restart:nsfw-scanner` task succeeds: expect rsync OK → "Migrating nsfw-scanner from raw docker-run to docker compose" log → `docker compose up -d --build` runs
- `ssh root@95.216.224.116 'docker inspect nsfw-scanner --format "{{index .Config.Labels \"com.docker.compose.project\"}}"'` returns `nsfw-scanner` (proof migration completed)
- `curl -sf http://127.0.0.1:5000/health` on prod returns `{"status":"ok"}`

🤖 Generated with Claude Code
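The last two test-plan checks could be scripted roughly as below. This is only a sketch under stated assumptions: the helper names `health_body_ok`, `is_compose_managed`, and `verify_scanner` are hypothetical, the script is meant to be run on the prod host itself, and it assumes the health endpoint returns the JSON body quoted in the test plan.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: succeeds when a /health response body reports
# "status":"ok". Whitespace is stripped first so formatting differences
# in the JSON do not matter.
health_body_ok() {
  local body
  body="$(printf '%s' "$1" | tr -d '[:space:]')"
  case "$body" in
    *'"status":"ok"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: succeeds when the compose project label read from the
# container matches the expected project name (defaults to nsfw-scanner).
is_compose_managed() {
  local label="$1" expected="${2:-nsfw-scanner}"
  [ "$label" = "$expected" ]
}

# Combines both checks; needs docker and curl, so it is defined here but not
# invoked by this sketch.
verify_scanner() {
  local label body
  label="$(docker inspect nsfw-scanner \
    --format '{{index .Config.Labels "com.docker.compose.project"}}')"
  if ! is_compose_managed "$label"; then
    echo "nsfw-scanner is not compose-managed (label: '$label')" >&2
    return 1
  fi
  body="$(curl -sf http://127.0.0.1:5000/health)"
  if ! health_body_ok "$body"; then
    echo "unexpected health body: $body" >&2
    return 1
  fi
  echo "nsfw-scanner OK"
}
```

Calling `verify_scanner` after the deploy finishes would exit non-zero on either failed check, which makes it easy to chain after the manual `workflow_dispatch` run.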