Skip to content

Conversation

@rsmarples
Copy link
Member

Adjust the code flow so that the same errors would be caught after the final recv.
This ensures we read what is really meant for us and not something silly.

Should fix #555.

Adjust the code flow so that the same errors would be caught
after the final recv.
This ensures we read what is really meant for us and not
something silly.

Should fix #555.
@coderabbitai
Copy link

coderabbitai bot commented Nov 17, 2025

Walkthrough

Reworked privsep message I/O to a two-phase recvmsg flow (peek sizing then mandatory consume), moved per-message buffers to optional heap allocation, removed per-process data cleanup hooks, and removed two fields from the public process struct.

Changes

Cohort / File(s) Summary
Privsep root I/O and buffer handling
src/privsep-root.c
Replaced single MSG_PEEK read with msghdr/iovec + recvmsg(MSG_PEEK
Process struct and cleanup behavior
src/privsep.h, src/privsep.c
Removed psp_data and psp_freedata fields from struct ps_process; removed runtime cleanup of psp_data in ps_freeprocess, so per-process data callback/path no longer invoked during process free.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review focus:
    • Correctness of new two-step recvmsg control flow and guarantee that peeked messages are always consumed (including on error paths).
    • Memory semantics around psr_mallocdata / psr_data allocation, ownership, and absence of the previous in-place mdata lifecycle.
    • Error mapping and logging for MSG_TRUNC, short reads, and total-length mismatches.
    • Effects of removing psp_data / psp_freedata and skipping per-process data cleanup on callers and lifecycle (ensure no leaked resources or dangling pointers).
    • Call sites to ps_root_writeerror and other modified APIs to confirm consistent zero-length payload handling.

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: ensuring a real recv occurs after MSG_PEEK, which aligns with the core code modifications switching from simple MSG_PEEK to a two-phase recv approach.
Description check ✅ Passed The description is related to the changeset, explaining that the code flow is adjusted to re-check error conditions after a real recv following MSG_PEEK, which matches the implementation details.
Linked Issues check ✅ Passed The PR implements a two-phase recv approach (MSG_PEEK followed by real recv) that addresses the buffer handling issue reported in #555 by ensuring proper error detection on the final recv path.
Out of Scope Changes check ✅ Passed All changes are directly related to the recv mechanism and buffer handling objectives: modifications to privsep-root.c for two-phase recv, removal of cleanup callbacks in privsep.c/h consistent with the new allocation strategy.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch privsep-recvmsg

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd75b08 and 16a0bfb.

📒 Files selected for processing (3)
  • src/privsep-root.c (6 hunks)
  • src/privsep.c (0 hunks)
  • src/privsep.h (0 hunks)
💤 Files with no reviewable changes (2)
  • src/privsep.h
  • src/privsep.c
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/privsep-root.c
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: netbsd (--disable-arp, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4ll)
  • GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
  • GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv6)
  • GitHub Check: netbsd (--disable-dhcp6)
  • GitHub Check: netbsd (--disable-ipv6, -DSMALL)
  • GitHub Check: netbsd (--disable-arp)
  • GitHub Check: netbsd (--disable-ipv4, -DSMALL)
  • GitHub Check: netbsd (-DSMALL)
  • GitHub Check: netbsd (--disable-ipv4)
  • GitHub Check: netbsd
  • GitHub Check: freebsd
  • GitHub Check: openbsd

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 385ff24 and f4446d5.

📒 Files selected for processing (1)
  • src/privsep-root.c (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: netbsd (--disable-ipv6)
  • GitHub Check: netbsd (--disable-ipv6, -DSMALL)
  • GitHub Check: netbsd (-DSMALL)
  • GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
  • GitHub Check: netbsd (--disable-dhcp6)
  • GitHub Check: netbsd (--disable-ipv4ll)
  • GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4)
  • GitHub Check: netbsd (--disable-arp, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4, -DSMALL)
  • GitHub Check: netbsd (--disable-arp)
  • GitHub Check: netbsd
  • GitHub Check: freebsd
  • GitHub Check: openbsd

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/privsep-root.c (1)

129-138: Critical: Fix MSG_WAITALL hang by using incoming message size.

Lines 134 and 137 set iov_len from the context buffer sizes (psr_ctx->psr_mdatalen, psr_ctx->psr_datalen), which can be larger than the actual incoming message size psr_error->psr_datalen. When MSG_WAITALL is used with an iov_len larger than available data, recvmsg blocks indefinitely waiting for bytes that will never arrive, wedging the privileged proxy and causing cascading failures like "No buffer space available."

This is the same critical issue identified in the past review and must be fixed.

Apply this diff to clamp iov_len to the actual incoming message size:

 	if (psr_error->psr_datalen != 0) {
 		if (psr_ctx->psr_usemdata) {
 			iov[1].iov_base = psr_ctx->psr_mdata;
-			/* psr_mdatalen could be smaller then psr_datalen
-			 * if the above malloc failed. */
-			iov[1].iov_len =
-			    MIN(psr_ctx->psr_mdatalen, psr_ctx->psr_datalen);
+			/* Clamp to actual incoming size to avoid MSG_WAITALL hang.
+			 * If buffer is too small, MSG_TRUNC will catch it. */
+			iov[1].iov_len = psr_error->psr_datalen;
+			if (iov[1].iov_len > psr_ctx->psr_mdatalen)
+				iov[1].iov_len = psr_ctx->psr_mdatalen;
 		} else {
 			iov[1].iov_base = psr_ctx->psr_data;
-			iov[1].iov_len = psr_ctx->psr_datalen;
+			iov[1].iov_len = psr_error->psr_datalen;
+			if (iov[1].iov_len > psr_ctx->psr_datalen)
+				iov[1].iov_len = psr_ctx->psr_datalen;
 		}
 	}
🧹 Nitpick comments (1)
src/privsep-root.c (1)

102-111: Consider removing MSG_WAITALL from the MSG_PEEK operation.

Using MSG_WAITALL with MSG_PEEK may cause unnecessary blocking when the message header hasn't fully arrived yet. Since this is just a peek operation to determine buffer size, and you're checking for short reads immediately after (line 110), MSG_PEEK alone should suffice.

Apply this diff:

-	len = recvmsg(fd, &msg, MSG_PEEK | MSG_WAITALL);
+	len = recvmsg(fd, &msg, MSG_PEEK);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4446d5 and 834a322.

📒 Files selected for processing (1)
  • src/privsep-root.c (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv6, -DSMALL)
  • GitHub Check: netbsd (--disable-dhcp6)
  • GitHub Check: netbsd (--disable-ipv6)
  • GitHub Check: netbsd (--disable-ipv4ll)
  • GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
  • GitHub Check: netbsd (--disable-arp)
  • GitHub Check: netbsd (-DSMALL)
  • GitHub Check: netbsd (--disable-ipv4)
  • GitHub Check: netbsd (--disable-arp, -DSMALL)
  • GitHub Check: netbsd
  • GitHub Check: netbsd (--disable-ipv4, -DSMALL)
  • GitHub Check: freebsd
  • GitHub Check: openbsd
🔇 Additional comments (2)
src/privsep-root.c (2)

113-127: LGTM! Proper handling of allocation failure.

The allocation failure handling is correct: when realloc fails, the code leaves the smaller buffer in place and relies on MSG_TRUNC detection in the subsequent recvmsg to signal ENOBUFS. This ensures the message is properly discarded from the queue.


141-151: LGTM! Comprehensive validation of received message.

The final receive path correctly:

  • Centralizes the actual message consumption for both early failures and normal flow
  • Validates minimum header size
  • Detects truncation via MSG_TRUNC and returns ENOBUFS
  • Ensures exact size match between received data and expected header + payload

Allows us to distinguish from a shortened header.
Log the result and expected sizes to try and diagnose.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/privsep-root.c (2)

118-140: Dynamic buffer growth and iov_len capping avoid hangs and handle ENOBUFS cleanly

The realloc of psr_ctx->psr_mdata and the subsequent setup:

  • Preserve the old buffer if realloc fails, so the pointer and size remain valid.
  • Use iov[1].iov_base from either psr_mdata or the caller’s buffer as before.
  • Critically, cap iov[1].iov_len to psr_error->psr_datalen, so the total requested length is never larger than the actual message size, eliminating the previous MSG_WAITALL hang scenario when the context buffer was larger than the incoming payload.
  • In the “realloc failed” case, psr_mdatalen < psr_error->psr_datalen, which will force a truncated read in the final recvmsg, correctly surfacing ENOBUFS while draining the queue.

Only tiny nit: the new comment still says “failed to malloc” even though realloc is used here.

-       /* If we failed to malloc then psr_mdatalen will be smaller
+       /* If we failed to realloc then psr_mdatalen will be smaller

142-157: Final recvmsg validation is robust; comment could be tightened to match new invariants

The final recvmsg(fd, &msg, MSG_WAITALL) plus:

  • Short‑header check (len < sizeof(*psr_error)),
  • Explicit MSG_TRUNC → ENOBUFS mapping,
  • Exact size check against sizeof(*psr_error) + psr_error->psr_datalen with EBADMSG and logging,

gives a solid guard against partial or malformed records and guarantees the caller sees consistent errors while the message is fully consumed from the queue. With the new iov_len capping above, the “receive buffers are bigger than the actual message” case cannot actually occur anymore, so the comment at Line 143 is now slightly misleading; consider updating it to say explicitly that the total requested length is bounded by the announced psr_datalen, which is what prevents stalling with MSG_WAITALL.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 595c305 and 36d3dad.

📒 Files selected for processing (1)
  • src/privsep-root.c (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
  • GitHub Check: netbsd (--disable-dhcp6)
  • GitHub Check: netbsd (--disable-ipv6, -DSMALL)
  • GitHub Check: netbsd (--disable-ipv4ll)
  • GitHub Check: netbsd (--disable-ipv6)
  • GitHub Check: netbsd (-DSMALL)
  • GitHub Check: netbsd
  • GitHub Check: netbsd (--disable-ipv4, -DSMALL)
  • GitHub Check: netbsd (--disable-arp, -DSMALL)
  • GitHub Check: netbsd (--disable-arp)
  • GitHub Check: netbsd (--disable-ipv4)
  • GitHub Check: freebsd
  • GitHub Check: openbsd
🔇 Additional comments (1)
src/privsep-root.c (1)

89-112: Two-phase recvmsg with MSG_PEEK looks correct and keeps the queue in sync

The introduction of struct msghdr msg and the initial recvmsg(fd, &msg, MSG_PEEK | MSG_WAITALL) followed by an unconditional non‑peek recvmsg (via goto recv when needed) ensures you (a) size buffers based on the header, and (b) always actually consume the message from the SOCK_SEQPACKET queue. This addresses the prior risk of acting only on peeked data without draining the record, without introducing obvious new failure modes.

@perkelix
Copy link
Contributor

perkelix commented Dec 3, 2025

Does this one seem ready to merge too?

@rsmarples
Copy link
Member Author

Does this one seem ready to merge too?

I would like some feedback on it first as I still don't know if it fixes people issues as I cannot replicate the problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SIOCSPFXFLUSH_IN6: No buffer space available after updating to 10.3.0 (FreeBSD 14.3)

3 participants