Add reservation as source for kvm capacity metrics#652
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. This behavior is configurable. Use the following commands to manage reviews:
📝 Walkthrough
Consolidated four per-host KVM capacity metrics into a single unified usage metric distinguished by a `type` label.
Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Collector as Collect
    participant HypervisorAPI as Hypervisor API
    participant ReservationAPI as Reservation API
    participant Aggregator as Aggregator
    participant Emitter as Metric Emitter
    Collector->>HypervisorAPI: List Hypervisor objects
    HypervisorAPI-->>Collector: Hypervisors (capacity, allocations)
    Collector->>ReservationAPI: List Reservation objects
    ReservationAPI-->>Collector: Reservations
    Collector->>Aggregator: aggregateReservationsByHost(reservations)
    Aggregator->>Aggregator: Filter Ready & Nova, group by host, compute reserved & failover
    Aggregator-->>Collector: per-host aggregates (reserved, failover)
    Collector->>Emitter: Emit total capacity (cpu, ram)
    Collector->>Emitter: Emit usage metrics with type: utilized, reserved, failover, payg
    Emitter-->>Collector: Metrics emitted
```
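The aggregation step in the diagram can be sketched roughly as follows. The `Reservation` shape, field names, and domain/type strings here are simplified stand-ins for the real `v1alpha1` types, not the actual API:

```go
package main

import "fmt"

// Reservation is a simplified stand-in for the real v1alpha1 Reservation
// (assumed shape, for illustration only).
type Reservation struct {
	Host   string
	Ready  bool
	Domain string // scheduling domain, e.g. "nova"
	Type   string // "failover" or "committed-resource"
	CPU    int64  // millicores reserved
}

// HostAggregate holds the per-host totals the collector emits.
type HostAggregate struct {
	ReservedCPU int64
	FailoverCPU int64
}

// aggregateByHost mirrors the flow in the diagram: filter Ready & Nova,
// group by host, and accumulate reserved vs. failover capacity.
func aggregateByHost(reservations []Reservation) map[string]HostAggregate {
	out := map[string]HostAggregate{}
	for _, r := range reservations {
		if !r.Ready || r.Host == "" || r.Domain != "nova" {
			continue // skip non-ready, hostless, or non-Nova reservations
		}
		agg := out[r.Host]
		switch r.Type {
		case "failover":
			agg.FailoverCPU += r.CPU
		case "committed-resource":
			agg.ReservedCPU += r.CPU
		}
		out[r.Host] = agg
	}
	return out
}

func main() {
	res := []Reservation{
		{Host: "node1", Ready: true, Domain: "nova", Type: "failover", CPU: 2000},
		{Host: "node1", Ready: true, Domain: "nova", Type: "committed-resource", CPU: 4000},
		{Host: "node2", Ready: false, Domain: "nova", Type: "failover", CPU: 1000}, // dropped: not ready
	}
	fmt.Println(aggregateByHost(res))
}
```

The real collector then turns these per-host aggregates into the unified usage metric's `reserved` and `failover` series.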
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches: 🧪 Generate unit tests (beta)
Actionable comments posted: 2
🧹 Nitpick comments (1)
internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go (1)
596-653: These assertions no longer prove the metric consolidation.
`actualMetrics` keeps only labels and gauge values, and the loop only checks that expected pairs exist. The test can still pass if the old per-type metric families are emitted or if extra duplicate series appear. Please keep the metric family/descriptor in the assertion and fail on unexpected series in the cases that validate the new unified `type` label.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm.go`:
- Around line 108-119: The loop that groups Ready reservations by
reservation.Status.Host must skip reservations that are not in the KVM/Nova
scheduling domain to avoid cross-domain host-name collisions: inside the for _,
reservation := range reservations loop (before using reservation.Status.Host or
switch on reservation.Spec.Type), check reservation.Spec.SchedulingDomain and
continue unless it is empty (legacy) or equals the KVM/Nova domain constant(s)
(e.g. v1alpha1.SchedulingDomainKVM or v1alpha1.SchedulingDomainNova); treat
empty SchedulingDomain according to your backward-compat policy (keep or skip),
then proceed with the existing ready-condition and host checks and the switch on
reservation.Spec.Type.
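A minimal sketch of the suggested guard, assuming stand-in domain constants (the real code would check `reservation.Spec.SchedulingDomain` against the `v1alpha1` scheduling-domain constants):

```go
package main

import "fmt"

// Stand-ins for the real v1alpha1 scheduling-domain constants (assumed names).
const (
	DomainKVM  = "kvm"
	DomainNova = "nova"
)

// inKVMDomain reports whether a reservation belongs to the KVM/Nova
// scheduling domain. An empty domain is treated as legacy and kept,
// per the backward-compat policy mentioned in the review comment.
func inKVMDomain(domain string) bool {
	return domain == "" || domain == DomainKVM || domain == DomainNova
}

func main() {
	// Inside the aggregation loop, reservations from other domains would
	// be skipped with `continue` before any host grouping happens.
	for _, d := range []string{"", "nova", "kvm", "ironic"} {
		fmt.Printf("%q -> keep=%v\n", d, inKVMDomain(d))
	}
}
```

This keeps host-name collisions from other scheduling domains out of the per-host maps.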
- Around line 172-176: The Collect method currently aborts when k.Client.List
for reservations fails; instead, catch the error, log it (using slog.Error) and
continue with an empty v1alpha1.ReservationList so hypervisor-only metrics still
emit. Concretely, in resource_capacity_kvm.go inside the Collect function:
replace the early return after k.Client.List(context.Background(), reservations)
with code that logs the error and leaves `reservations` as an empty list (or
nil-safe) so subsequent loops/filters that reference `reservations.Items` handle
the empty set and the collector proceeds to emit hypervisor metrics. Ensure the
existing variable name `reservations` and the call `k.Client.List` remain
referenced so only the control flow changes.
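The suggested control-flow change can be sketched as below. The `lister` interface here is a hypothetical stand-in for the controller-runtime client the collector actually uses; only the error-handling shape matters:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log/slog"
)

// lister is a hypothetical stand-in for the k.Client used by the collector.
type lister interface {
	List(ctx context.Context, out *[]string) error
}

type failingLister struct{}

func (failingLister) List(ctx context.Context, out *[]string) error {
	return errors.New("apiserver unavailable")
}

// collectReservations logs a List failure and falls back to an empty set,
// so the caller can still emit hypervisor-only metrics instead of aborting.
func collectReservations(c lister) []string {
	var reservations []string
	if err := c.List(context.Background(), &reservations); err != nil {
		slog.Error("failed to list reservations, continuing without them", "err", err)
		return nil // nil slice: downstream range loops handle it safely
	}
	return reservations
}

func main() {
	// The collector would proceed to emit capacity metrics even though
	// the reservation list came back empty.
	fmt.Println(len(collectReservations(failingLister{})))
}
```

Ranging over a nil slice is a no-op in Go, so the subsequent aggregation loops need no extra nil checks.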
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c25fbd82-2ad3-48a2-acd9-2bd6a92eeb96
📒 Files selected for processing (2)
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm.go
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go (1)
27-43: ⚠️ Potential issue | 🟠 Major: Make the collector assertion exact.
The current matcher only checks that each expected label/value pair appears somewhere. The first two skip-path cases are effectively no-ops today, and the `kvmMetricLabels` projection also hides extra labels and omitted-vs-empty labels. At minimum, assert the emitted metric count; ideally compare the full label set instead of a fixed projection.

💡 Tighten the assertion:

```diff
     actualMetrics = append(actualMetrics, kvmExpectedMetric{
         Labels: labels,
         Value:  m.GetGauge().GetValue(),
     })
 }
+if len(actualMetrics) != len(tt.expectedMetrics) {
+    t.Fatalf("expected %d metrics, got %d", len(tt.expectedMetrics), len(actualMetrics))
+}
+
 // Verify each expected metric is present with the correct value
 for _, expected := range tt.expectedMetrics {
```

Once this is in place, the partial `expectedMetrics` tables should be filled out so the test pins the full emitted set.

Also applies to: 596-653
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go` around lines 27 - 43, The test currently uses a partial projection type (kvmMetricLabels) and loose matcher so it doesn't verify emitted metrics exactly; update the collector assertion to assert the total emitted metric count and compare each metric's full label set and value against a fully populated expectedMetrics slice (use kvmExpectedMetric for expected entries), replacing the projection-based comparison with an exact label map comparison (including distinguishing omitted vs empty labels and extra labels), and then fill out the expectedMetrics table entries accordingly so the test pins the complete emitted metrics.
🧹 Nitpick comments (1)
internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go (1)
711-738: Also pin the over-allocation boundary.
The nil-`CommittedResourceReservation` case is useful, but `aggregateReservationsByHost()` also clamps negative "not in use" values to zero. A table entry where allocations exceed the reservation would keep that guard covered.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go` around lines 711 - 738, Add another table test case in resource_capacity_kvm_test.go that exercises aggregateReservationsByHost() when allocations exceed committed reservation to verify clamping to zero; specifically create a v1alpha1.Reservation with Spec.Type set to v1alpha1.ReservationTypeCommittedResource, a non-nil Spec.CommittedResourceReservation with lower cpu/memory than the Resources map, and Status showing it as ready, then set expectedCommittedNotInUse for that host to zero (empty resource.Quantity) and expectedFailover as appropriate so the test checks the negative “not in use” is clamped to zero.
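The clamping behavior that the extra table row would pin can be illustrated with plain integers; the real code operates on `resource.Quantity` values, so this is only a sketch of the boundary:

```go
package main

import "fmt"

// committedNotInUse returns the committed capacity not yet consumed by
// allocations, clamped to zero when allocations exceed the reservation.
func committedNotInUse(committed, allocated int64) int64 {
	if d := committed - allocated; d > 0 {
		return d
	}
	return 0 // over-allocated: never report a negative "not in use" value
}

func main() {
	fmt.Println(committedNotInUse(4000, 1000)) // 3000 still unused
	fmt.Println(committedNotInUse(1000, 4000)) // over-allocated, clamped to 0
}
```

A table case with `allocated > committed` would assert exactly this zero result instead of a negative quantity.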
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go`:
- Around line 27-43: The test currently uses a partial projection type
(kvmMetricLabels) and loose matcher so it doesn't verify emitted metrics
exactly; update the collector assertion to assert the total emitted metric count
and compare each metric's full label set and value against a fully populated
expectedMetrics slice (use kvmExpectedMetric for expected entries), replacing
the projection-based comparison with an exact label map comparison (including
distinguishing omitted vs empty labels and extra labels), and then fill out the
expectedMetrics table entries accordingly so the test pins the complete emitted
metrics.
---
Nitpick comments:
In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go`:
- Around line 711-738: Add another table test case in
resource_capacity_kvm_test.go that exercises aggregateReservationsByHost() when
allocations exceed committed reservation to verify clamping to zero;
specifically create a v1alpha1.Reservation with Spec.Type set to
v1alpha1.ReservationTypeCommittedResource, a non-nil
Spec.CommittedResourceReservation with lower cpu/memory than the Resources map,
and Status showing it as ready, then set expectedCommittedNotInUse for that host
to zero (empty resource.Quantity) and expectedFailover as appropriate so the
test checks the negative “not in use” is clamped to zero.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c74e06dc-a2f3-4db8-ba2f-7542f815a0c1
📒 Files selected for processing (2)
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm.go
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm.go
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go (1)
624-679: ⚠️ Potential issue | 🟡 Minor: Make this an exact metric-schema assertion.
This is still a subset check: unknown labels are silently dropped, and there is no exact series-count assertion. As written, several table rows will still pass if `Collect()` emits extra/duplicate metrics or if the unified usage metric grows an unintended label.

Possible tightening:

```diff
     case "external_customer":
         labels.ExternalCustomer = label.GetValue()
     case "maintenance":
         labels.Maintenance = label.GetValue()
+    default:
+        t.Fatalf("unexpected label %q on metric %s", label.GetName(), metricName)
     }
 }
-// Verify each expected metric is present with the correct value and metric name.
+if len(actualMetrics) != len(tt.expectedMetrics) {
+    t.Fatalf("unexpected metric count: expected %d, got %d", len(tt.expectedMetrics), len(actualMetrics))
+}
+
+// Verify each expected metric is present with the correct value and metric name.
 for _, expected := range tt.expectedMetrics {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go` around lines 624 - 679, The test currently performs a subset check when comparing tt.expectedMetrics to actualMetrics (built in the metric parsing loop and represented by kvmExpectedMetric), which allows extra/duplicate metrics or unknown labels to pass; change the assertion to an exact schema and series-count check by (1) verifying len(actualMetrics) == len(tt.expectedMetrics) before per-metric comparisons, (2) ensuring each actual metric has exactly the expected label set (no extra/unknown labels) rather than relying on struct equality that can silently ignore extras—compare label keys and values explicitly for kvmMetricLabels, (3) fail on duplicate series by detecting multiple actual entries matching a single expected entry, and (4) keep the existing value equality check for Value; apply these checks around the existing loop that iterates tt.expectedMetrics and the metric collection from Collect().
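One way to make such a comparison exact is sketched below with plain label maps rather than the test's `kvmMetricLabels` projection, so extra labels, duplicate series, and count mismatches all fail:

```go
package main

import (
	"fmt"
	"maps"
)

// series is a simplified emitted-metric record: full label map plus value.
type series struct {
	Labels map[string]string
	Value  float64
}

// equalSeries requires the same count and, for each expected series, exactly
// one unused actual series with an identical label map and value.
func equalSeries(expected, actual []series) bool {
	if len(expected) != len(actual) {
		return false // catches extra, missing, and duplicate series
	}
	used := make([]bool, len(actual))
	for _, e := range expected {
		found := false
		for i, a := range actual {
			if !used[i] && a.Value == e.Value && maps.Equal(a.Labels, e.Labels) {
				used[i] = true // each actual series may match only once
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

func main() {
	exp := []series{{Labels: map[string]string{"type": "reserved"}, Value: 1}}
	fmt.Println(equalSeries(exp, exp))                   // true
	fmt.Println(equalSeries(exp, append(exp, series{}))) // false: extra series
}
```

Comparing full label maps (via `maps.Equal`) also distinguishes omitted labels from empty-string labels, which a fixed struct projection cannot.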
🧹 Nitpick comments (1)
internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go (1)
684-768: Add one non-Nova reservation case here.
Every reservation fixture in this table uses `SchedulingDomainNova`, so the new branch that skips other scheduling domains is still unpinned. A ready reservation on a host with any non-Nova `SchedulingDomain` should leave both maps empty.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go` around lines 684 - 768, Add a test case inside TestAggregateReservationsByHost that uses a reservation whose Spec.SchedulingDomain is a non-Nova value (e.g., string("other-domain")), with Status.Host set and a Ready condition true and some Resources, and assert that both expectedFailover and expectedCommittedNotInUse are empty; this pins the branch in AggregateReservationsByHost that should skip non-Nova scheduling domains — update the tests slice by adding a case similar to existing ones but with Spec.SchedulingDomain != string(v1alpha1.SchedulingDomainNova) and expected maps empty so the function (tested by TestAggregateReservationsByHost) is exercised for non-Nova reservations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go`:
- Around line 624-679: The test currently performs a subset check when comparing
tt.expectedMetrics to actualMetrics (built in the metric parsing loop and
represented by kvmExpectedMetric), which allows extra/duplicate metrics or
unknown labels to pass; change the assertion to an exact schema and series-count
check by (1) verifying len(actualMetrics) == len(tt.expectedMetrics) before
per-metric comparisons, (2) ensuring each actual metric has exactly the expected
label set (no extra/unknown labels) rather than relying on struct equality that
can silently ignore extras—compare label keys and values explicitly for
kvmMetricLabels, (3) fail on duplicate series by detecting multiple actual
entries matching a single expected entry, and (4) keep the existing value
equality check for Value; apply these checks around the existing loop that
iterates tt.expectedMetrics and the metric collection from Collect().
---
Nitpick comments:
In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go`:
- Around line 684-768: Add a test case inside TestAggregateReservationsByHost
that uses a reservation whose Spec.SchedulingDomain is a non-Nova value (e.g.,
string("other-domain")), with Status.Host set and a Ready condition true and
some Resources, and assert that both expectedFailover and
expectedCommittedNotInUse are empty; this pins the branch in
AggregateReservationsByHost that should skip non-Nova scheduling domains —
update the tests slice by adding a case similar to existing ones but with
Spec.SchedulingDomain != string(v1alpha1.SchedulingDomainNova) and expected maps
empty so the function (tested by TestAggregateReservationsByHost) is exercised
for non-Nova reservations.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6d777d07-c81f-4653-9fe8-ed3c2961d405
📒 Files selected for processing (2)
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm.go
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go
✅ Files skipped from review due to trivial changes (1)
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm.go
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go`:
- Around line 661-679: The test loop currently treats expectedMetrics as a
subset by only ensuring each expected entry exists in actualMetrics, which lets
extra or duplicate series slip through (breaking cases like single hypervisor
with nil/zero capacity and allowing unexpected cortex_kvm_host_capacity_usage
series); update the assertion to enforce exact equality between actualMetrics
and expectedMetrics (e.g., compare lengths and that every actual metric matches
an expected one) or, for the empty-capacity cases, add an explicit check that no
cortex_kvm_host_capacity_usage metrics are emitted (or that all emitted values
are zero) so the test fails on duplicates/unexpected series—make these changes
where expectedMetrics and actualMetrics are compared in the test loop.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b3d2a340-4fed-41ff-9654-ecc10ab6ae84
📒 Files selected for processing (2)
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm.go
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go
✅ Files skipped from review due to trivial changes (1)
- internal/knowledge/kpis/plugins/compute/resource_capacity_kvm.go
internal/knowledge/kpis/plugins/compute/resource_capacity_kvm_test.go
Test Coverage Report 📊: 68.3%
Changes
`reserved` and `failover` usage types