fix(evpn): surface cross-class mis-stamped managed netdevs; harden VRF/L3VXLAN validation#578

Merged
lance0 merged 1 commit into main from fix/managed-netdev-status-observability
Jun 19, 2026
@lance0 lance0 commented Jun 19, 2026

Summary

Follow-up to #577. Fixes the one real finding from review and adds validation hardening for the deferred lifecycle. Status/validation only: no change to compute_managed_netdev_ops or any reap gate (reap stays class-exact).

  • Observability regression fix (ADR-0091 Decision 6). The unconfigured/orphan managed-netdev status scan filtered each link to stamps of the class matching its kind. A link carrying a rustbgpd ownership stamp of the wrong class for its kind (e.g. a bridge-kind link with a rustbgpd:vxlan:… altname) was therefore silently dropped from status, whereas the prior code surfaced it as owned-unsafe. This change restores an all-class fallback so that any rustbgpd-stamped link still surfaces as owned-unsafe and is never hidden. The fallback never double-emits, and a vxlan-kind link still reports exactly one row.
  • Accurate owned-unsafe reasons. Reworded across all four classes to cover wrong-class / multiple-stamp / stamp-name-mismatch instead of implying simple non-ownership.
  • Validation hardening (matters when the deferred VRF/L3VXLAN lifecycle lands). Reject reserved VRF table_ids (252–255), a VRF table_id colliding with a [[fib_tables]] table_id, and an L3VXLAN VNI (L3VNI) colliding with a fixed-VNI VXLAN VNI (L2VNI). Operator-provisioned vrf=/bridge= references are intentionally NOT hard-validated (legitimate; fail-closed at runtime).
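
For reviewers, the all-class fallback described in the first bullet can be pictured roughly as follows. This is a minimal sketch and not the code in rustbgpd-evpn-linux: the `Link`, `Class` and `Status` types, the `scan_unconfigured` function, the four class names, the kind-to-class mapping, and the `vni100` stamp suffix are all assumptions made for illustration. Only the `rustbgpd:vxlan:` altname prefix and the owned-unsafe/orphan outcomes come from the description above. Stamp-name-mismatch checking is left out.

```rust
/// Hypothetical ownership classes; the PR mentions four but does not list them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Class {
    Vxlan,
    L3Vxlan,
    Bridge,
    Vrf,
}

/// Hypothetical status rows for a link that is not in the configuration.
#[derive(Debug, PartialEq, Eq)]
enum Status {
    Orphan,
    OwnedUnsafe,
}

/// Hypothetical, simplified view of a kernel link.
struct Link<'a> {
    kind: &'a str,
    altnames: Vec<&'a str>,
}

/// Parse the class out of an altname shaped like `rustbgpd:<class>:<rest>`.
fn stamp_class(altname: &str) -> Option<Class> {
    let rest = altname.strip_prefix("rustbgpd:")?;
    let (class, _) = rest.split_once(':')?;
    match class {
        "vxlan" => Some(Class::Vxlan),
        "l3vxlan" => Some(Class::L3Vxlan),
        "bridge" => Some(Class::Bridge),
        "vrf" => Some(Class::Vrf),
        _ => None,
    }
}

/// Classes that may legitimately stamp a link of the given kind (assumed mapping).
fn classes_for_kind(kind: &str) -> &'static [Class] {
    match kind {
        "vxlan" => &[Class::Vxlan, Class::L3Vxlan],
        "bridge" => &[Class::Bridge],
        "vrf" => &[Class::Vrf],
        _ => &[],
    }
}

/// Returns at most one row per link, so a link can never be emitted twice.
fn scan_unconfigured(link: &Link) -> Option<Status> {
    // Collect stamps of every class, not only the class matching the kind.
    let stamps: Vec<Class> = link
        .altnames
        .iter()
        .filter_map(|a| stamp_class(a))
        .collect();
    if stamps.is_empty() {
        return None; // not rustbgpd-stamped, so not reported
    }
    let allowed = classes_for_kind(link.kind);
    if stamps.len() == 1 && allowed.contains(&stamps[0]) {
        // A single stamp of a class that fits the kind: a plain orphan.
        return Some(Status::Orphan);
    }
    // Wrong-class or multiple stamps: surface the link, never hide it.
    Some(Status::OwnedUnsafe)
}
```

Returning an `Option` rather than pushing rows per class is one way to make "exactly one row per link" hold by construction. Under these assumptions, a bridge-kind link with a `rustbgpd:vxlan:vni100` altname yields `Some(Status::OwnedUnsafe)`.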

Testing

  • cargo test -p rustbgpd-evpn-linux managed_netdev_status (cross-class mis-stamp surfaced; vxlan-kind emits exactly one; correctly-stamped orphan still orphaned)
  • cargo test -p rustbgpd config::tests::managed_netdevs (reserved-table-id, vrf/fib_tables collision, l3vxlan/vxlan VNI collision, valid multi-class loads)
  • cargo clippy --workspace --all-targets -- -D warnings, cargo fmt --all -- --check
  • pre-push hook: fmt, clippy, test, doc

…F/L3VXLAN validation

- restore all-class visibility in the managed-netdev status scan: a link
  carrying a rustbgpd ownership stamp of a class that does not match its kind
  is now reported owned-unsafe instead of being silently dropped, satisfying
  ADR-0091 Decision 6 (fail-closed states must be observable); the fallback
  never double-emits and reap stays class-exact (status-only change)
- reword the owned-unsafe status reasons across all four classes to accurately
  cover wrong-class / multiple-stamp / stamp-name-mismatch
- reject reserved VRF table_ids (252-255), a VRF table_id colliding with a
  [[fib_tables]] table_id, and an L3VXLAN VNI (L3VNI) colliding with a
  fixed-VNI VXLAN VNI (L2VNI)
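
The three rejection rules in the last item can be sketched as below. This is an illustration under stated assumptions, not the validator in rustbgpd: the `ManagedNetdevs` struct, its field names, the `validate` function and the error strings are hypothetical, and the real configuration types are not shown in this PR. Only the rules themselves (reserved range 252-255, VRF vs [[fib_tables]] collision, L3VNI vs fixed L2VNI collision) come from the text above.

```rust
use std::collections::HashSet;

/// Hypothetical, flattened view of the relevant configuration values.
struct ManagedNetdevs {
    vrf_table_ids: Vec<u32>,
    fib_table_ids: Vec<u32>,
    l3vxlan_vnis: Vec<u32>,
    fixed_vxlan_vnis: Vec<u32>,
}

/// Returns the first violation found, or Ok(()) when the config is acceptable.
fn validate(cfg: &ManagedNetdevs) -> Result<(), String> {
    let fib: HashSet<u32> = cfg.fib_table_ids.iter().copied().collect();
    for &id in &cfg.vrf_table_ids {
        // 252-255 is the range the PR treats as reserved.
        if (252..=255).contains(&id) {
            return Err(format!("VRF table_id {id} is reserved (252-255)"));
        }
        if fib.contains(&id) {
            return Err(format!(
                "VRF table_id {id} collides with a [[fib_tables]] table_id"
            ));
        }
    }

    let l2: HashSet<u32> = cfg.fixed_vxlan_vnis.iter().copied().collect();
    for &vni in &cfg.l3vxlan_vnis {
        if l2.contains(&vni) {
            return Err(format!(
                "L3VXLAN VNI {vni} collides with a fixed-VNI VXLAN VNI"
            ));
        }
    }

    // vrf= / bridge= references to operator-provisioned devices are
    // deliberately not checked here; per the PR they fail closed at runtime.
    Ok(())
}
```

In this sketch a config with `vrf_table_ids = [254]` is rejected as reserved, while a config with distinct table ids and distinct VNIs passes.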
@lance0 lance0 merged commit 57b92d4 into main Jun 19, 2026
60 checks passed