feat(rcv1p): unify cert bootstrap flow and add Windows CA refresh task #8096

Open
rchincha wants to merge 15 commits into main from origin/rchinchani/rcv1p-2

@rchincha rchincha commented Mar 16, 2026

What this PR does / why we need it:

https://eng.ms/docs/products/onecert-certificates-key-vault-and-dsms/onecert-customer-guide/autorotationandecr/rcv1ptsg

Which issue(s) this PR fixes:

Fixes #

  • Converge the parts/linux/cloud-init/artifacts/init-aks-custom-cloud*.sh scripts into a single file
  • Use the cloud location to choose between legacy and newer rcv1p behavior
  • Do the same for Windows, and add a periodic refresh job
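
A minimal Go sketch of the location-based selection described in the bullets above. The names `normalizeLocation`, `certEndpointMode`, and `legacyPrefixes` are hypothetical, the prefix value is a placeholder rather than a real region, and defaulting to "rcv1p" is an assumption. The actual location-to-mode mapping lives in the PR and is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeLocation lowercases the location and removes spaces.
// Assumption: the PR's actual normalization rules are not shown in this thread.
func normalizeLocation(location string) string {
	return strings.ToLower(strings.ReplaceAll(location, " ", ""))
}

// certEndpointMode picks "legacy" or "rcv1p" from the cloud location.
// legacyPrefixes is a hypothetical parameter standing in for whatever
// location list the PR actually uses.
func certEndpointMode(location string, legacyPrefixes []string) string {
	loc := normalizeLocation(location)
	for _, prefix := range legacyPrefixes {
		if strings.HasPrefix(loc, prefix) {
			return "legacy"
		}
	}
	// Assumption: any location not listed as legacy uses the newer flow.
	return "rcv1p"
}

func main() {
	legacyPrefixes := []string{"examplelegacy"} // placeholder, not a real region
	fmt.Println(certEndpointMode("ExampleLegacy East", legacyPrefixes))
	fmt.Println(certEndpointMode("exampleother", legacyPrefixes))
}
```

Keeping the mapping in one function would let the Linux and Windows paths share a single decision point, which seems to be the intent of the unification.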

Copilot AI review requested due to automatic review settings March 16, 2026 05:14
Contributor

Copilot AI left a comment

Pull request overview

This PR aims to unify the custom-cloud CA certificate bootstrap path (removing the separate “operation-requests” init scripts) and adds a Windows scheduled task to periodically refresh custom-cloud CA certificates.

Changes:

  • Windows: add a scheduled task to refresh custom-cloud CA certificates; update Get-CACertificates to support legacy vs “rcv1p” modes keyed off location.
  • Linux: consolidate custom-cloud init to a single init script and update CSE command generation to set a cert-endpoint mode variable.
  • Regenerate multiple custom data / generated command snapshots to reflect the new templates.

Reviewed changes

Copilot reviewed 74 out of 176 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| staging/cse/windows/kubernetesfunc.ps1 | Adds CA refresh scheduled task + updates CA retrieval logic and error behavior |
| parts/windows/kuberneteswindowssetup.ps1 | Wires Get-CACertificates -Location and registers refresh task for custom clouds |
| pkg/agent/variables.go | Always injects initAKSCustomCloud payload into cloud-init data |
| pkg/agent/const.go | Removes separate custom-cloud init script constants; keeps single init script |
| pkg/agent/baker.go | Simplifies GetTargetEnvironment; notes IsAKSCustomCloud as deprecated |
| parts/linux/cloud-init/artifacts/cse_cmd.sh | Updates CSE command to set cert endpoint mode + run custom-cloud init script |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests.sh | Deleted (custom-cloud init consolidation) |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests-mariner.sh | Deleted (custom-cloud init consolidation) |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-mariner.sh | Deleted (custom-cloud init consolidation) |
| aks-node-controller/parser/templates/cse_cmd.sh.gtpl | Mirrors CSE command template updates for aks-node-controller parser |
| aks-node-controller/parser/testdata/Compatibility+EmptyConfig/generatedCSECommand | Regenerated snapshot for new CSE cmd template |
| aks-node-controller/parser/testdata/AzureLinuxv2+Kata+DisableUnattendedUpgrades=false/generatedCSECommand | Regenerated snapshot for new CSE cmd template |
| aks-node-controller/parser/testdata/AKSUbuntu2204+SSHStatusOn/generatedCSECommand | Regenerated snapshot for new CSE cmd template |
| aks-node-controller/parser/testdata/AKSUbuntu2204+EnablePubkeyAuth/generatedCSECommand | New snapshot for new template output |
| aks-node-controller/parser/testdata/AKSUbuntu2204+DisablePubkeyAuth/generatedCSECommand | New snapshot for new template output |
| aks-node-controller/parser/testdata/AKSUbuntu2204+DefaultPubkeyAuth/generatedCSECommand | New snapshot for new template output |
| aks-node-controller/parser/testdata/AKSUbuntu2204+CustomOSConfig/generatedCSECommand | Regenerated snapshot for new CSE cmd template |
| aks-node-controller/parser/testdata/AKSUbuntu2204+CustomCloud/generatedCSECommand | Regenerated snapshot for new CSE cmd template |
| aks-node-controller/parser/testdata/AKSUbuntu2204+Containerd+MIG/generatedCSECommand | Regenerated snapshot for new CSE cmd template |
| aks-node-controller/parser/testdata/AKSUbuntu2204+CloudProviderOverrides/generatedCSECommand | New snapshot for new template output |
| aks-node-controller/parser/testdata/AKSUbuntu2204+China/generatedCSECommand | Regenerated snapshot for new CSE cmd template |
| pkg/agent/testdata/MarinerV2+Kata/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AzureLinuxV2+Kata/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AzureLinuxV3+Kata/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AKSUbuntu2204+China/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AKSUbuntu2204+cgroupv2/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AKSUbuntu2204+ootcredentialprovider/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AKSUbuntu2204+SecurityProfile/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AKSUbuntu2204+SSHStatusOn/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AKSUbuntu2204+SSHStatusOff/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AKSUbuntu2404+NetworkPolicy/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/AKSUbuntu2404+Teleport/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/CustomizedImage/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/CustomizedImageKata/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/CustomizedImageLinuxGuard/CustomData | Regenerated snapshot (custom data gzip payload changed) |
| pkg/agent/testdata/Flatcar/CustomData.inner | Regenerated snapshot (embedded gzip payload changed) |
| pkg/agent/testdata/ACL/CustomData.inner | Regenerated snapshot (embedded gzip payload changed) |

Comment on lines +266 to +337
-    try {
+    if ($certEndpointMode -eq "legacy") {
+        $uri = 'http://168.63.129.16/machine?comp=acmspackage&type=cacertificates&ext=json'
         $rawData = Retry-Command -Command 'Invoke-WebRequest' -Args @{Uri=$uri; UseBasicParsing=$true} -Retries 5 -RetryDelaySeconds 10
-    } catch {
-        Set-ExitCode -ExitCode $global:WINDOWS_CSE_ERROR_DOWNLOAD_CA_CERTIFICATES -ErrorMessage "Failed to download CA certificates rawdata. Error: $_"
+        $caCerts = ($rawData.Content) | ConvertFrom-Json
+        if ($null -eq $caCerts -or $null -eq $caCerts.Certificates -or $caCerts.Certificates.Length -eq 0) {
+            Write-Log "Warning: CA certificates rawdata is empty for legacy endpoint"
+            return $false
+        }
+
+        foreach ($certificate in $caCerts.Certificates) {
+            $name = $certificate.Name
+            $certFilePath = Join-Path $caFolder $name
+            Write-Log "Write certificate $name to $certFilePath"
+            $certificate.CertBody > $certFilePath
+        }
+
+        return $true
     }

-    Write-Log "Convert CA certificates rawdata"
-    $caCerts=($rawData.Content) | ConvertFrom-Json
-    if ([string]::IsNullOrEmpty($caCerts)) {
-        Set-ExitCode -ExitCode $global:WINDOWS_CSE_ERROR_EMPTY_CA_CERTIFICATES -ErrorMessage "CA certificates rawdata is empty"
+    $optInUri = 'http://168.63.129.16/acms/isOptedInForRootCerts'
+    $optInResponse = Retry-Command -Command 'Invoke-WebRequest' -Args @{Uri=$optInUri; UseBasicParsing=$true} -Retries 5 -RetryDelaySeconds 10
+    if (($optInResponse.Content -notmatch 'IsOptedInForRootCerts=true')) {
+        Write-Log "Skipping custom cloud root cert installation because IsOptedInForRootCerts is not true"
+        return $false
     }

-    $certificates = $caCerts.Certificates
-    for ($index = 0; $index -lt $certificates.Length ; $index++) {
-        $name=$certificates[$index].Name
-        $certFilePath = Join-Path $caFolder $name
-        Write-Log "Write certificate $name to $certFilePath"
-        $certificates[$index].CertBody > $certFilePath
+    $operationRequestTypes = @("operationrequestsroot", "operationrequestsintermediate")
+    $downloadedAny = $false
+
+    foreach ($requestType in $operationRequestTypes) {
+        $operationRequestUri = "http://168.63.129.16/machine?comp=acmspackage&type=$requestType&ext=json"
+        $operationResponse = Retry-Command -Command 'Invoke-WebRequest' -Args @{Uri=$operationRequestUri; UseBasicParsing=$true} -Retries 5 -RetryDelaySeconds 10
+        $operationJson = ($operationResponse.Content) | ConvertFrom-Json
+
+        if ($null -eq $operationJson -or $null -eq $operationJson.OperationRequests) {
+            Write-Log "Warning: no operation requests found for $requestType"
+            continue
+        }
+
+        foreach ($operation in $operationJson.OperationRequests) {
+            $resourceFileName = $operation.ResouceFileName
+            if ([string]::IsNullOrEmpty($resourceFileName)) {
+                continue
+            }
+
+            $resourceType = [IO.Path]::GetFileNameWithoutExtension($resourceFileName)
+            $resourceExt = [IO.Path]::GetExtension($resourceFileName).TrimStart('.')
+            $resourceUri = "http://168.63.129.16/machine?comp=acmspackage&type=$resourceType&ext=$resourceExt"
+
+            $certContentResponse = Retry-Command -Command 'Invoke-WebRequest' -Args @{Uri=$resourceUri; UseBasicParsing=$true} -Retries 5 -RetryDelaySeconds 10
+            if ([string]::IsNullOrEmpty($certContentResponse.Content)) {
+                Write-Log "Warning: empty certificate content for $resourceFileName"
+                continue
+            }
+
+            $certFilePath = Join-Path $caFolder $resourceFileName
+            Write-Log "Write certificate $resourceFileName to $certFilePath"
+            $certContentResponse.Content > $certFilePath
+            $downloadedAny = $true
+        }
+    }
+
+    if (-not $downloadedAny) {
+        Write-Log "Warning: no CA certificates were downloaded in rcv1p mode"
+    }
+
+    return $downloadedAny
 }
 catch {
     # Catch all exceptions in this function. NOTE: exit cannot be caught.
-    Set-ExitCode -ExitCode $global:WINDOWS_CSE_ERROR_GET_CA_CERTIFICATES -ErrorMessage $_
+    Write-Log "Warning: failed to retrieve CA certificates. Error: $_"
+    return $false
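
The rcv1p branch in the snippet above derives each certificate download URL from an operation request's file name. A rough Go sketch of that one step follows. `resourceURI` is a hypothetical helper name, the sample file name is made up, and the query escaping is an addition that the PowerShell code does not perform.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// Base endpoint taken from the snippet under review.
const wireServerPackageURL = "http://168.63.129.16/machine?comp=acmspackage"

// resourceURI splits a resource file name into its name and extension and
// builds the download URL, mirroring GetFileNameWithoutExtension and
// GetExtension(...).TrimStart('.') in the PowerShell code.
// It returns false for an empty file name, which the reviewed code skips.
func resourceURI(resourceFileName string) (string, bool) {
	if resourceFileName == "" {
		return "", false
	}
	base := path.Base(resourceFileName)
	ext := path.Ext(base) // includes the leading dot, e.g. ".cer"
	resourceType := strings.TrimSuffix(base, ext)
	resourceExt := strings.TrimPrefix(ext, ".")
	// Escaping is an assumption added for safety; it is not in the original.
	return fmt.Sprintf("%s&type=%s&ext=%s", wireServerPackageURL,
		url.QueryEscape(resourceType), url.QueryEscape(resourceExt)), true
}

func main() {
	// "example-root.cer" is an illustrative file name, not real endpoint output.
	if uri, ok := resourceURI("example-root.cer"); ok {
		fmt.Println(uri)
	}
}
```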
@rchincha rchincha force-pushed the origin/rchinchani/rcv1p-2 branch from 44ff9ee to a0a1307 Compare March 18, 2026 21:11
Copilot AI review requested due to automatic review settings March 18, 2026 22:12
Contributor

Copilot AI left a comment

Pull request overview

This PR aims to unify AKS custom-cloud CA certificate bootstrap behavior (legacy vs “rcv1p/operation-requests” style flows) and adds a Windows scheduled task to periodically refresh custom-cloud CA certificates.

Changes:

  • Adds Windows CA refresh scheduled task registration and introduces location-based endpoint-mode selection (legacy vs rcv1p).
  • Refactors Windows CA certificate retrieval to support both endpoint modes and opt-in gating for rcv1p.
  • Simplifies Linux custom-cloud init script selection by consolidating onto init-aks-custom-cloud.sh and removing older variants; updates generated testdata accordingly.
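
The opt-in gating mentioned above amounts to a substring check on the opt-in endpoint's response. A small Go sketch, assuming the response body is plain text containing `IsOptedInForRootCerts=true` when opted in. `isOptedInForRootCerts` is a hypothetical name and the sample bodies are illustrative, not captured output.

```go
package main

import (
	"fmt"
	"strings"
)

// isOptedInForRootCerts reports whether the response body contains the
// opt-in marker. The comparison is case-insensitive to match PowerShell's
// -notmatch operator, which ignores case by default.
func isOptedInForRootCerts(body string) bool {
	return strings.Contains(strings.ToLower(body), "isoptedinforrootcerts=true")
}

func main() {
	// Sample bodies are assumptions about the response shape.
	fmt.Println(isOptedInForRootCerts("IsOptedInForRootCerts=true"))
	fmt.Println(isOptedInForRootCerts("IsOptedInForRootCerts=false"))
}
```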

Reviewed changes

Copilot reviewed 93 out of 99 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| staging/cse/windows/kubernetesfunc.ps1 | Adds CA refresh scheduled task and endpoint-mode-aware Get-CACertificates implementation. |
| pkg/agent/variables.go | Simplifies how initAKSCustomCloud is added to Linux cloud-init variables. |
| pkg/agent/testdata/MarinerV2+Kata/CustomData | Updates expected CustomData snapshot (generated content changed). |
| pkg/agent/testdata/Flatcar/CustomData.inner | Updates expected Flatcar CustomData snapshot (generated content changed). |
| pkg/agent/testdata/CustomizedImageLinuxGuard/CustomData | Updates expected CustomData snapshot (generated content changed). |
| pkg/agent/testdata/CustomizedImageKata/CustomData | Updates expected CustomData snapshot (generated content changed). |
| pkg/agent/testdata/CustomizedImage/CustomData | Updates expected CustomData snapshot (generated content changed). |
| pkg/agent/testdata/AzureLinuxV3+Kata/CustomData | Updates expected CustomData snapshot (generated content changed). |
| pkg/agent/testdata/AzureLinuxV2+Kata/CustomData | Updates expected CustomData snapshot (generated content changed). |
| pkg/agent/testdata/AKSWindows23H2Gen2+NextGenNetworkingNoConfig/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows23H2Gen2+NextGenNetworkingDisabled/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows23H2Gen2+NextGenNetworking/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+ootcredentialprovider/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+SecurityProfile/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+ManagedIdentity/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+KubeletServingCertificateRotation/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+KubeletClientTLSBootstrapping/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+K8S119/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+K8S119+FIPS/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+K8S119+CSI/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+K8S118/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+K8S117/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+K8S116/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+EnablePrivateClusterHostsConfigAgent/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+CustomVnet/CustomData | Updates expected Windows CustomData snapshot (calls/refresh task additions). |
| pkg/agent/testdata/AKSWindows2019+CustomCloud/CustomData | Updates expected Windows CustomData snapshot (new Get-CACertificates call form + refresh task). |
| pkg/agent/testdata/AKSWindows2019+CustomCloud+ootcredentialprovider/CustomData | Updates expected Windows CustomData snapshot (new Get-CACertificates call form + refresh task). |
| pkg/agent/testdata/AKSUbuntu2404+Teleport/CustomData | Updates expected Ubuntu CustomData snapshot (generated content changed). |
| pkg/agent/testdata/AKSUbuntu2404+NetworkPolicy/CustomData | Updates expected Ubuntu CustomData snapshot (generated content changed). |
| pkg/agent/testdata/AKSUbuntu2204+ootcredentialprovider/CustomData | Updates expected Ubuntu CustomData snapshot (generated content changed). |
| pkg/agent/testdata/AKSUbuntu2204+cgroupv2/CustomData | Updates expected Ubuntu CustomData snapshot (generated content changed). |
| pkg/agent/testdata/AKSUbuntu2204+SecurityProfile/CustomData | Updates expected Ubuntu CustomData snapshot (generated content changed). |
| pkg/agent/testdata/AKSUbuntu2204+SSHStatusOn/CustomData | Updates expected Ubuntu CustomData snapshot (generated content changed). |
| pkg/agent/testdata/AKSUbuntu2204+China/CustomData | Updates expected Ubuntu CustomData snapshot (generated content changed). |
| pkg/agent/testdata/ACL/CustomData.inner | Updates expected ACL CustomData snapshot (generated content changed). |
| pkg/agent/const.go | Consolidates custom-cloud init script constants to a single script. |
| parts/windows/kuberneteswindowssetup.ps1 | Updates Windows setup flow to call Get-CACertificates with location and registers CA refresh scheduled task. |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests.sh | Removes operation-requests-specific Linux init script (consolidation). |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests-mariner.sh | Removes Mariner/AzureLinux operation-requests init script (consolidation). |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-mariner.sh | Removes Mariner/AzureLinux legacy init script variant (consolidation). |
| aks-node-controller/parser/templates/cse_cmd.sh.gtpl | Adds a LOCATION shell variable in the generated CSE command template. |
| aks-node-controller/parser/helper.go | Factors out a shared getCloudLocation helper and reuses it in getCloudTargetEnv. |

Get-CACertificates
{{end}}

Get-CACertificates -Location $Location
Comment on lines 344 to +346
 catch {
     # Catch all exceptions in this function. NOTE: exit cannot be caught.
-    Set-ExitCode -ExitCode $global:WINDOWS_CSE_ERROR_GET_CA_CERTIFICATES -ErrorMessage $_
+    Write-Log "Warning: failed to retrieve CA certificates. Error: $_"
+    return $false
@rchincha rchincha changed the title from "feat(custom-cloud): unify cert bootstrap flow and add Windows CA refresh task" to "feat(rcv1p): unify cert bootstrap flow and add Windows CA refresh task" Mar 19, 2026
@rchincha rchincha force-pushed the origin/rchinchani/rcv1p-2 branch from 2b3c1d6 to e19a19b Compare March 19, 2026 00:51
Copilot AI review requested due to automatic review settings March 19, 2026 01:00
@rchincha rchincha force-pushed the origin/rchinchani/rcv1p-2 branch from e19a19b to d41856f Compare March 19, 2026 01:00
Contributor

Copilot AI left a comment

Pull request overview

This PR unifies the AKS custom cloud CA certificate bootstrap logic to a single flow and adds a Windows scheduled task to periodically refresh custom cloud CA certificates. It also updates Linux/customdata generation and test snapshots to reflect the new wiring.

Changes:

  • Add Windows scheduled task registration for daily CA certificate refresh and introduce a location-based cert endpoint mode selector.
  • Simplify Linux custom cloud init script selection by standardizing on init-aks-custom-cloud.sh, plus add wiring/tests for refresh-mode arguments.
  • Update aks-node-controller template to export LOCATION, and regenerate CustomData snapshot test artifacts.

Reviewed changes

Copilot reviewed 95 out of 101 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| staging/cse/windows/kubernetesfunc.tests.ps1 | Adds Pester coverage for cert endpoint mode selection, scheduled task registration, and CA retrieval behavior. |
| staging/cse/windows/kubernetesfunc.ps1 | Implements unified Windows CA retrieval logic with legacy/rcv1p modes and registers a daily refresh scheduled task. |
| spec/parts/linux/cloud-init/artifacts/init_aks_custom_cloud_spec.sh | Adds ShellSpec assertions to validate refresh-mode argument parsing/wiring in the Linux init script. |
| pkg/agent/variables.go | Changes how initAKSCustomCloud is injected into Linux cloud-init data. |
| pkg/agent/const.go | Removes per-cloud custom init script constants and standardizes on init-aks-custom-cloud.sh. |
| parts/windows/kuberneteswindowssetup.ps1 | Wires CA retrieval call and registers the Windows CA refresh scheduled task during BasePrep. |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests.sh | Removed (operation-requests variant no longer used). |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests-mariner.sh | Removed (operation-requests Mariner/AzureLinux variant no longer used). |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-mariner.sh | Removed (Mariner/AzureLinux legacy variant no longer used). |
| aks-node-controller/parser/templates/cse_cmd.sh.gtpl | Exports LOCATION into the CSE environment for downstream scripts. |
| aks-node-controller/parser/helper.go | Adds a helper to normalize location and reuses it in cloud target env detection. |
| pkg/agent/testdata/MarinerV2+Kata/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/Flatcar/CustomData.inner | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/CustomizedImageLinuxGuard/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/CustomizedImageKata/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/CustomizedImage/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/AzureLinuxV3+Kata/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/AzureLinuxV2+Kata/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/AKSWindows23H2Gen2+NextGenNetworkingNoConfig/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows23H2Gen2+NextGenNetworkingDisabled/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows23H2Gen2+NextGenNetworking/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+ootcredentialprovider/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+SecurityProfile/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+ManagedIdentity/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+KubeletServingCertificateRotation/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+KubeletClientTLSBootstrapping/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+K8S119/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+K8S119+FIPS/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+K8S119+CSI/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+K8S118/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+K8S117/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+K8S116/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+EnablePrivateClusterHostsConfigAgent/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+CustomVnet/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+CustomCloud/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSWindows2019+CustomCloud+ootcredentialprovider/CustomData | Regenerated CustomData snapshot due to Windows CA refresh task wiring. |
| pkg/agent/testdata/AKSUbuntu2404+Teleport/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/AKSUbuntu2404+NetworkPolicy/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/AKSUbuntu2204+ootcredentialprovider/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/AKSUbuntu2204+cgroupv2/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/AKSUbuntu2204+SecurityProfile/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/AKSUbuntu2204+SSHStatusOn/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/AKSUbuntu2204+China/CustomData | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |
| pkg/agent/testdata/ACL/CustomData.inner | Regenerated CustomData snapshot due to init/custom cloud wiring changes. |

@rchincha rchincha force-pushed the origin/rchinchani/rcv1p-2 branch from d41856f to 18ba549 Compare March 19, 2026 20:13
Copilot AI review requested due to automatic review settings March 19, 2026 22:07
@rchincha rchincha force-pushed the origin/rchinchani/rcv1p-2 branch from 18ba549 to e94c465 Compare March 19, 2026 22:07
Contributor

Copilot AI left a comment

Pull request overview

This PR updates AKS custom cloud certificate bootstrapping to use a single unified flow and adds a Windows scheduled task for periodic custom cloud CA refresh.

Changes:

  • Added Windows CA refresh task registration plus new logic to select cert retrieval mode and opt-in gating.
  • Simplified Linux custom cloud init script wiring by removing legacy “operation-requests” variants and normalizing location for refresh mode.
  • Added/updated tests and refreshed golden testdata outputs to reflect new custom data content.

Reviewed changes

Copilot reviewed 95 out of 101 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| staging/cse/windows/kubernetesfunc.tests.ps1 | Adds Pester coverage for endpoint-mode selection, task registration behavior, and CA retrieval failure handling. |
| staging/cse/windows/kubernetesfunc.ps1 | Implements endpoint-mode derivation, opt-in gating, CA retrieval paths, and a Windows scheduled task for refresh. |
| spec/parts/linux/cloud-init/artifacts/init_aks_custom_cloud_spec.sh | Adds ShellSpec checks to ensure init script wiring for ca-refresh mode and LOCATION usage. |
| pkg/agent/variables.go | Simplifies init script selection and updates how custom cloud init script is injected into cloud-init data. |
| pkg/agent/const.go | Removes now-unused custom-cloud init script constants; keeps unified init script constant. |
| parts/windows/kuberneteswindowssetup.ps1 | Updates Windows setup to call Get-CACertificates with Location and conditionally register refresh task. |
| aks-node-controller/parser/templates/cse_cmd.sh.gtpl | Adds LOCATION variable for downstream scripts during custom cloud provisioning. |
| aks-node-controller/parser/helper.go | Adds getCloudLocation helper and reuses it for cloud target env detection. |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests.sh | Removes legacy operation-requests init script (superseded by unified script). |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests-mariner.sh | Removes legacy Mariner operation-requests init script (superseded by unified script). |
| parts/linux/cloud-init/artifacts/init-aks-custom-cloud-mariner.sh | Removes legacy Mariner init script variant (superseded by unified script). |
| pkg/agent/testdata/MarinerV2+Kata/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/Flatcar/CustomData.inner | Updates golden ignition/customData payload for unified custom cloud init content. |
| pkg/agent/testdata/CustomizedImageLinuxGuard/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/CustomizedImageKata/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/CustomizedImage/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AzureLinuxV3+Kata/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AzureLinuxV2+Kata/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AKSWindows23H2Gen2+NextGenNetworkingNoConfig/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows23H2Gen2+NextGenNetworkingDisabled/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows23H2Gen2+NextGenNetworking/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+ootcredentialprovider/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+SecurityProfile/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+ManagedIdentity/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+KubeletServingCertificateRotation/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+KubeletClientTLSBootstrapping/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+K8S119/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+K8S119+FIPS/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+K8S119+CSI/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+K8S118/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+K8S117/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+K8S116/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+EnablePrivateClusterHostsConfigAgent/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+CustomVnet/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+CustomCloud/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSWindows2019+CustomCloud+ootcredentialprovider/CustomData | Updates golden Windows customData to pass Location to CA cert retrieval + refresh task gating. |
| pkg/agent/testdata/AKSUbuntu2404+Teleport/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AKSUbuntu2404+NetworkPolicy/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AKSUbuntu2204+ootcredentialprovider/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AKSUbuntu2204+cgroupv2/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AKSUbuntu2204+SecurityProfile/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AKSUbuntu2204+SSHStatusOn/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AKSUbuntu2204+SSHStatusOff/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/AKSUbuntu2204+China/CustomData | Updates golden customData to match unified custom cloud init content. |
| pkg/agent/testdata/ACL/CustomData.inner | Updates golden ignition/customData payload for unified custom cloud init content. |

Comments suppressed due to low confidence (7)

staging/cse/windows/kubernetesfunc.ps1:1

  • Get-CACertificates used to fail fast via Set-ExitCode on retrieval/parse errors, but now returns $false (and logs warnings) for a wide range of failure cases. Because call sites in the generated setup scripts invoke Get-CACertificates -Location $Location without checking the return value, this can silently proceed without required CA material and lead to harder-to-diagnose TLS failures later in provisioning. Consider restoring fatal behavior for “expected-to-install” scenarios (e.g., legacy mode, or rcv1p when opted-in), or have callers check the return value and invoke Set-ExitCode when it’s $false in those modes.
    pkg/agent/variables.go:1
  • This change removes the previous cs.IsAKSCustomCloud() guard and injects the custom cloud init script into cloudInitData unconditionally. That can increase customData size for all clusters (risking platform limits) and may introduce unintended side effects if any downstream template writes/executes this script outside custom cloud. Recommend reinstating the custom cloud guard (and only setting initAKSCustomCloud when IsAKSCustomCloud() is true), while still using the unified initAKSCustomCloudScript for all custom clouds.
    staging/cse/windows/kubernetesfunc.ps1:1
  • $resourceFileName is used directly to build a path under C:\ca. If the upstream response ever contains path separators (e.g., ..\foo or nested paths), this can write outside the intended directory. Prefer sanitizing to a basename (e.g., using Split-Path -Leaf or [IO.Path]::GetFileName($resourceFileName)) before Join-Path, and consider rejecting names containing directory traversal characters.
    staging/cse/windows/kubernetesfunc.ps1:1
  • The new rcv1p operation-requests flow is non-trivial (multiple requests, JSON shape assumptions, per-item content downloads, and $downloadedAny aggregation), but the added Pester tests only cover legacy mode and the “throws returns false” path. Add tests that (1) exercise the rcv1p path end-to-end with mocked Retry-Command returning operation requests and cert bodies, and (2) verify behavior when operation requests are empty/invalid (ensuring the function returns $false and logs expected warnings).
    pkg/agent/variables.go:1
  • The PR description still contains placeholder text (Fixes # with no linked issue and no explanation of “what/why”). Please update the PR description to summarize the behavior change (unified bootstrap + Windows refresh task) and link the relevant issue or remove the placeholder.
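The path-traversal concern raised above for $resourceFileName could apply to any script that builds a file path from an upstream-supplied name. The following is a minimal shell sketch of the "sanitize to a basename and reject odd names" idea, not the PR's implementation. The function name safe_cert_filename and the allowed character set are assumptions made for illustration.

```sh
# Hypothetical helper: reduce an upstream-supplied name to a safe leaf name.
# Prints the sanitized name and returns 0, or returns 1 if the name is unusable.
safe_cert_filename() {
  local name="$1"

  # Treat Windows-style separators like POSIX ones.
  name=$(printf '%s' "$name" | tr '\\' '/')

  # Keep only the last path component (everything after the final slash).
  name=${name##*/}

  case "$name" in
    ''|.|..) return 1 ;;                 # empty or pure traversal component
    *[!A-Za-z0-9._-]*) return 1 ;;       # assumed allow-list of characters
  esac

  printf '%s\n' "$name"
}
```

A caller would then join the sanitized name onto the target directory and skip the item when the helper returns non-zero, rather than writing with the raw name.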

@rchincha rchincha force-pushed the origin/rchinchani/rcv1p-2 branch from e94c465 to f20d5b8 Compare March 19, 2026 23:28
Copilot AI review requested due to automatic review settings March 20, 2026 06:42
@rchincha rchincha force-pushed the origin/rchinchani/rcv1p-2 branch from f20d5b8 to b53f240 Compare March 20, 2026 06:42
@rchincha rchincha force-pushed the origin/rchinchani/rcv1p-2 branch from 49987e4 to 6c51cf2 Compare March 26, 2026 17:55
@rchincha rchincha requested a review from SriHarsha001 as a code owner March 26, 2026 17:55
@rchincha rchincha force-pushed the origin/rchinchani/rcv1p-2 branch from 6c51cf2 to 2bc437b Compare March 26, 2026 19:55
Copilot AI review requested due to automatic review settings March 26, 2026 19:55

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

@rchincha rchincha requested a review from Copilot March 26, 2026 22:43

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comment on lines +159 to +162

echo "$cert_content" > "/root/AzureCACertificates/$cert_filename"
echo "Successfully saved certificate: $cert_filename"
done
Copilot AI Mar 26, 2026

process_cert_operations writes downloaded cert content to /root/AzureCACertificates/$cert_filename, but /root/AzureCACertificates is not guaranteed to exist at this point (it’s only created later in install_certs_to_trust_store). This can silently drop certificates while still logging “Successfully saved certificate”. Ensure the directory exists before writing (or create it once near the top before both legacy/rcv1p retrieval paths).
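One way to address this review comment is to create the directory before the write and to log success only when the write succeeds. The sketch below is illustrative only: save_cert and the CERT_DIR override are hypothetical names, and the real process_cert_operations loop is not shown in this conversation.

```sh
# Hypothetical helper mirroring the write quoted in the review comment.
# CERT_DIR defaults to the directory named there; the override exists for testing.
save_cert() {
  local cert_dir="${CERT_DIR:-/root/AzureCACertificates}"
  local cert_filename="$1"
  local cert_content="$2"

  # Create the target directory up front; mkdir -p is a no-op if it already exists.
  mkdir -p "$cert_dir" || return 1

  # Report success only if the redirect and write actually succeeded.
  if printf '%s\n' "$cert_content" > "$cert_dir/$cert_filename"; then
    echo "Successfully saved certificate: $cert_filename"
  else
    echo "Failed to save certificate: $cert_filename" >&2
    return 1
  fi
}
```

Creating the directory once near the top of the script, before both retrieval paths, would work equally well; doing it inside the helper simply keeps the write self-contained.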


Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Comment on lines +158 to +161
fi

echo "$cert_content" > "/root/AzureCACertificates/$cert_filename"
echo "Successfully saved certificate: $cert_filename"
Copilot AI Mar 27, 2026

process_cert_operations writes cert content to /root/AzureCACertificates/$cert_filename, but the directory may not exist yet (it’s only created inside install_certs_to_trust_store, which runs after retrieval). Create the directory before this write to avoid failing the refresh path.

@rchincha
Author

Linux testing:

https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=158446017&view=artifacts&pathAsName=false&type=publishedArtifacts

Looks like e2e tests use "westus3" (public cloud)

1. Detects the OS correctly (Ubuntu, ACL, AzureLinux).
2. Uses rcv1p cert mode (new endpoint) for all.
3. Skips cert install on public cloud (correct: the IsOptedInForRootCerts JSON doesn't match the grep).
4. Guards RepoDepot: Ubuntu logs "REPO_DEPOT_ENDPOINT empty, skipping"; ACL/AzureLinux skip it entirely (not Ubuntu-based).
5. Configures chrony per OS (chrony for Ubuntu, chronyd for AzureLinux, skipped for ACL).
6. Completes successfully: "Custom script finished." appears in all 113 tests that reached CSE.
7. Zero apt corruption (0 instances of "Malformed entry" or "exit 99" across all logs).
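Counts like those in items 6 and 7 can be reproduced by scanning the downloaded artifacts for fixed strings. This is a rough sketch, not the method actually used here: the function names are made up, and the search strings are taken from the list above and may differ from the exact log wording.

```sh
# Count the files under a directory that contain a fixed string.
count_logs_with() {
  local log_dir="$1"
  local needle="$2"
  # -r recurse, -l list matching files only, -F treat the needle as a literal string.
  grep -rlF -- "$needle" "$log_dir" 2>/dev/null | wc -l | tr -d ' '
}

# Summarize one artifacts directory (layout and names are assumptions).
summarize_provision_logs() {
  local log_dir="$1"
  echo "finished=$(count_logs_with "$log_dir" 'Custom script finished')"
  echo "malformed=$(count_logs_with "$log_dir" 'Malformed entry')"
  echo "exit99=$(count_logs_with "$log_dir" 'exit 99')"
}
```

Note that this counts files, not occurrences, so a log that repeats a message is counted once.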

C:\repos\tmp\a\artifacts3\sl1\Test_Ubuntu2404Gen2_McrChinaCloud\cluster-provision.log
C:\repos\tmp\a\artifacts3\sl2\Test_Ubuntu2404Gen2_McrChinaCloud_Scriptless\cluster-provision.log
C:\repos\tmp\a\artifacts3\sl1\Test_ACL_AzureCNI\cluster-provision.log
C:\repos\tmp\a\artifacts3\sl2\Test_AzureLinuxV3_KubeletCustomConfig_Scriptless\cluster-provision.log

Ubuntu (Test_Ubuntu2404Gen2_McrChinaCloud)
31: Running on Ubuntu
37: cert_endpoint_mode=rcv1p
69: Skipping custom cloud root cert installation because IsOptedInForRootCerts is not true
85: REPO_DEPOT_ENDPOINT empty, skipping Ubuntu RepoDepot initialization
98: systemctl restart chrony

ACL (Test_ACL_AzureCNI)
38: Running on Microsoft Azure Container Linux
44: cert_endpoint_mode=rcv1p
76: Skipping custom cloud root cert installation because IsOptedInForRootCerts is not true
91: Skipping chrony configuration for ACL (PTP clock baked into chronyd...)
2829: Custom script finished

AzureLinux Scriptless (Test_AzureLinuxV3_KubeletCustomConfig_Scriptless)
34: Running on Microsoft Azure Linux
40: cert_endpoint_mode=rcv1p
72: Skipping custom cloud root cert installation because IsOptedInForRootCerts is not true
100: systemctl restart chronyd
2773: Custom script finished.

Ubuntu Scriptless (Test_Ubuntu2404Gen2_McrChinaCloud_Scriptless)
32: Running on Ubuntu
38: cert_endpoint_mode=rcv1p
70: Skipping custom cloud root cert installation because IsOptedInForRootCerts is not true
86: REPO_DEPOT_ENDPOINT empty, skipping Ubuntu RepoDepot initialization
99: systemctl restart chrony
2865: Custom script finished

@rchincha
Author

Windows testing:

Also westus3?

 https://msazure.visualstudio.com/CloudNativeCompute/_build/results?buildId=158446024&view=artifacts&pathAsName=false&type=publishedArtifacts

 

1. kuberneteswindowssetup.ps1:436 - Get-CACertificates -Location $Location is now outside the {{if IsAKSCustomCloud}} block (which closes at line 434), so it runs for all clouds.
2. kuberneteswindowssetup.ps1:481-487 - Register-CACertificatesRefreshTask is also outside the guard, with a backward-compat check for Should-InstallCACertificatesRefreshTask.
3. kubernetesfunc.ps1:308-309 - On public cloud with Location=westus3, Get-CustomCloudCertEndpointModeFromLocation returns rcv1p.
4. kubernetesfunc.ps1:334 - rcv1p mode checks IsOptedInForRootCerts=true (same grep mismatch as Linux; benign on public cloud, skips cert install).
5. kubernetesfunc.ps1:286 - Should-InstallCACertificatesRefreshTask also checks opt-in and returns $false on public cloud, so the refresh task is not registered (correct).
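The location-to-mode decision described in item 3 can be pictured as a small lookup. The shell sketch below is an analogue for illustration only. It assumes an explicit list of legacy locations with rcv1p as the default, which is consistent with westus3 mapping to rcv1p but is not confirmed by this conversation. LEGACY_LOCATIONS holds placeholder values, not real region names, and cert_endpoint_mode_for_location is a hypothetical name.

```sh
# Placeholder list; the real set of legacy locations is not shown in this PR conversation.
LEGACY_LOCATIONS="${LEGACY_LOCATIONS:-example-legacy-1 example-legacy-2}"

# Print "legacy" for listed locations and "rcv1p" for everything else (assumed default).
cert_endpoint_mode_for_location() {
  local location legacy

  # Normalize case so "WestUS3" and "westus3" are treated alike.
  location=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  for legacy in $LEGACY_LOCATIONS; do
    if [ "$location" = "$legacy" ]; then
      echo legacy
      return 0
    fi
  done

  echo rcv1p
}
```

With this shape, a public-cloud location such as westus3 falls through to rcv1p, and the opt-in check then decides whether any certificates are installed.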

 C:\repos\tmp\a\artifacts_win2\cse-extracted\kubernetesfunc.ps1
C:\repos\tmp\a\artifacts_win2\log_104.txt
C:\repos\tmp\a\artifacts_win2\log_117.txt
C:\repos\tmp\a\artifacts_win2\log_138.txt
C:\repos\Agentbaker\parts\windows\kuberneteswindowssetup.ps


Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.


Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.

aks-node-assistant bot and others added 15 commits April 6, 2026 11:07
Co-authored-by: Jane Jung <janejung@microsoft.com>
Co-authored-by: janenotjung-hue <107402425+janenotjung-hue@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: aks-node-assistant[bot] <190555641+aks-node-assistant[bot]@users.noreply.github.com>
https://eng.ms/docs/products/onecert-certificates-key-vault-and-dsms/onecert-customer-guide/autorotationandecr/overviewrcv
https://eng.ms/docs/products/onecert-certificates-key-vault-and-dsms/onecert-customer-guide/autorotationandecr/rcv1ptsg

cse_cmd.sh.gtpl: derive cert endpoint mode from target cloud and always run custom-cloud init script.
cse_cmd.sh: same mode logic as template; remove LOCATION export.
init-aks-custom-cloud.sh: merged legacy + operation-requests logic into one script with distro-aware cert install paths.
parts/linux/cloud-init/artifacts/init-aks-custom-cloud-mariner.sh: removed (merged into unified script).
parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests.sh: removed (merged into unified script).
parts/linux/cloud-init/artifacts/init-aks-custom-cloud-operation-requests-mariner.sh: removed (merged into unified script).
const.go: keep only unified custom-cloud init script constant.
variables.go: simplify script selection to always use unified init script.
kubernetesfunc.ps1: add location-aware CA retrieval (legacy/rcv1p) and scheduled refresh task registration helper.
kuberneteswindowssetup.ps1: pass location to CA retrieval and register refresh task for custom cloud.
…ndpoint mode handling for legacy and rcv1p regions
The unified init-aks-custom-cloud.sh script (~22KB) pushed Flatcar and ACL
VM customData over Azure's 87,380 character limit, causing 16 E2E failures.

Split the script into two files:
- init-aks-custom-cloud.sh: cert refresh + scheduling (included for all clouds)
- init-aks-custom-cloud-repos.sh: repo depot + chrony (custom cloud only)

The main script sources the repos script at runtime if present. For
non-custom-cloud VMs, only the smaller main script is embedded, reducing
base64(gzip) size from 8,736 to 4,424 chars (-4,312 chars).
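The base64(gzip) figures above can be approximated locally, and the "source at runtime if present" behavior is a short guard. This sketch uses hypothetical helper names. It measures a single file, whereas the 87,380-character limit applies to the whole customData payload, so treat the number as a contribution to the total rather than a pass/fail check.

```sh
# Azure customData limit quoted in the commit message.
CUSTOM_DATA_LIMIT=87380

# Print the length in characters of base64(gzip(file)).
encoded_size() {
  # gzip -c writes to stdout; tr strips base64 line wrapping; wc -c counts the bytes.
  gzip -c "$1" | base64 | tr -d '\n' | wc -c | tr -d ' '
}

# Source an optional companion script only if it exists; a missing file is not an error.
source_if_present() {
  if [ -f "$1" ]; then
    . "$1"
  fi
}
```

For example, running encoded_size on init-aks-custom-cloud.sh before and after the split should show a reduction on the order of the 4,312 characters reported (8,736 - 4,424), though exact values depend on the gzip settings used by the build.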

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
