[kubernetes] Auto-propagate infra StorageClasses to tenant clusters #2095
mattia-eleuteri wants to merge 1 commit into cozystack:main from
Conversation
Summary of Changes

Hello @mattia-eleuteri, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly improves the management of StorageClasses within tenant Kubernetes clusters by automating their discovery and propagation from the underlying infrastructure. It transitions from a rigid, single static StorageClass configuration to a dynamic system that identifies and makes available all relevant infrastructure StorageClasses. This change not only streamlines deployment but also enhances security by enforcing that tenants can only utilize these propagated classes, ensuring consistency and preventing unauthorized storage provisioning.
Activity
📝 Walkthrough

This PR implements auto-propagation of infrastructure StorageClasses to tenant Kubernetes clusters using Helm `lookup`-based discovery.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant MC as Management<br/>Cluster
    participant HR as HelmRelease<br/>Template
    participant TC as Tenant<br/>Cluster
    participant CSI as kubevirt-csi<br/>driver
    participant PVC as Persistent<br/>Volume
    MC->>HR: Reconcile CSI HelmRelease
    HR->>MC: lookup all StorageClasses
    MC-->>HR: Return discovered SC list
    HR->>HR: Build storageClasses dict<br/>(name, infraSC, default)
    HR->>HR: Generate infraStorageClassEnforcement<br/>allowList
    HR->>TC: Apply StorageClass resources<br/>(per infra SC)
    HR->>TC: Apply driver-config ConfigMap<br/>with enforcement allowList
    TC->>CSI: Load driver config
    CSI->>CSI: Validate PVC requests<br/>against allowList
    Note over CSI: Only propagated SCs allowed
    TC->>PVC: Create/bind with<br/>storageClassName
    CSI->>PVC: Provision volume<br/>via kubevirt provisioner
```
```mermaid
sequenceDiagram
    participant Admin as Admin
    participant MigJob as Migration<br/>Job v35→36
    participant HR as HelmRelease<br/>Values Secret
    participant TC as Tenant<br/>Cluster
    participant PV as Persistent<br/>Volume
    participant PVC as Persistent<br/>Volume Claim
    Admin->>MigJob: Trigger migration v35→36
    MigJob->>HR: Find all HR values secrets
    MigJob->>HR: Read & remove storageClass key
    MigJob->>HR: Patch secret with updated YAML
    Note over HR: idempotent removal
    MigJob->>TC: Extract tenant kubeconfig
    MigJob->>TC: List PVCs (storageClassName=kubevirt)
    TC-->>MigJob: Return PVC list
    loop Per PVC
        MigJob->>PV: Get bound PV & extract CSI attrs
        MigJob->>PV: Set Retain reclaim policy
        MigJob->>PVC: Delete PVC
        MigJob->>PV: Patch storageClassName &<br/>remove claimRef
        MigJob->>PVC: Recreate PVC with<br/>discovered SC name
        PVC->>PV: Rebind to PV
        MigJob->>PV: Restore reclaim policy
    end
    MigJob->>MigJob: Update cozystack-version<br/>configmap to v36
    MigJob-->>Admin: Migration complete
```
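The PV-rebinding steps in the diagram above reduce to two patches against the bound PersistentVolume. A hedged sketch using standard PV API fields (the actual migration job may express this differently):

```yaml
# Step 1 (before deleting the PVC): protect the volume so it is not
# garbage-collected while unbound, e.g. via
#   kubectl patch pv <pv-name> --type merge -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
spec:
  persistentVolumeReclaimPolicy: Retain
---
# Step 2 (after deleting the PVC): point the PV at the discovered infra
# class and clear the stale binding so a recreated PVC can bind again.
# <discovered-infra-class> is a placeholder for the propagated SC name.
spec:
  storageClassName: <discovered-infra-class>
  claimRef: null
```

Per the diagram, the original reclaim policy is restored once the recreated PVC rebinds.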
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Code Review
This pull request automates the discovery and propagation of infrastructure StorageClasses to tenant Kubernetes clusters, replacing the previous static storageClass parameter. The changes involve modifications to the kubevirt-csi-node and kubernetes charts, along with the introduction of a new driver-config.yaml ConfigMap for enforcement. The review focuses on ensuring the correctness and maintainability of the implemented changes, particularly around the handling of StorageClasses and the enforcement mechanism.
```yaml
allowAll: false
allowDefault: false
allowList:
{{- $storageClasses := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
```
Not an issue. FluxCD's helm-controller runs with broad cluster-scoped permissions and can already list non-namespaced resources like StorageClasses. This is how all lookup-based templates work in this project.
```yaml
{{- range $name, $mapping := .Values.storageClasses }}
- {{ $mapping.infraStorageClass | quote }}
{{- end }}
```
Not an issue. The storageClasses values are populated by lookup in the parent template (csi.yaml), not from user input. The names come directly from the Kubernetes API — there is no injection vector here.
```yaml
{{- $storageClasses := dict }}
{{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
{{- $isDefault := false }}
{{- if $sc.metadata.annotations }}
{{- $isDefault = eq (index $sc.metadata.annotations "storageclass.kubernetes.io/is-default-class" | default "false") "true" }}
{{- end }}
{{- $_ := set $storageClasses $sc.metadata.name (dict "infraStorageClass" $sc.metadata.name "default" $isDefault) }}
{{- end }}
{{- if not $storageClasses }}
{{- $_ := set $storageClasses "replicated" (dict "infraStorageClass" "replicated" "default" true) }}
{{- end }}
{{- $storageClasses | toYaml | nindent 6 }}
```
Style-only suggestion. The template logic is straightforward and follows the same patterns used elsewhere in the codebase. Declining to add extra comments here.
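For illustration, on an infra cluster with two hypothetical StorageClasses `local` and `replicated` (the latter marked default), the loop above would populate values shaped like this (the class names are assumptions, not taken from the PR):

```yaml
storageClasses:
  local:
    default: false
    infraStorageClass: local
  replicated:
    default: true
    infraStorageClass: replicated
```

Since `toYaml` emits map keys in sorted order, the rendered values stay stable across reconciles even though `lookup` runs on every render.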
```yaml
name: {{ $name }}
{{- if $mapping.default }}
annotations:
  storageclass.kubernetes.io/is-default-class: "true"
{{- end }}
```
Consider adding a default value for the storageclass.kubernetes.io/is-default-class annotation in case it's not present. This can prevent unexpected behavior if the annotation is missing.
```yaml
name: {{ $name }}
{{- if eq $mapping.default true }}
annotations:
  storageclass.kubernetes.io/is-default-class: "true"
{{- end }}
```
The current code already handles this correctly. The {{- if eq $mapping.default true }} conditional only renders the annotation block when .default is explicitly true. When the annotation is absent from the source StorageClass, $mapping.default will be false and the annotation block is simply not rendered — which is the desired behavior.
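To make that concrete: assuming a mapping named `replicated` with `default: true`, the snippet renders the metadata as

```yaml
name: replicated
annotations:
  storageclass.kubernetes.io/is-default-class: "true"
```

while for `default: false` the `annotations:` key is omitted entirely, matching upstream Kubernetes semantics where a StorageClass is non-default simply by lacking the annotation.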
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/apps/kubernetes/templates/csi/driver-config.yaml`:
- Around line 6-17: The template directly uses (lookup ...).items which can be
nil and prevents the fallback allowList entry "replicated" from rendering;
update the infraStorageClassEnforcement block to first assign the lookup result
to a variable (e.g. $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "")
and then safely test $scList and $scList.items (or check len $scList.items)
before ranging into $storageClasses, so that when the lookup is nil the template
falls back to adding - "replicated" under allowList; adjust variable names shown
($storageClasses, $scList, allowList) accordingly.
In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml`:
- Around line 30-42: The template directly accesses (lookup "storage.k8s.io/v1"
"StorageClass" "" "").items which will null-deref during helm template or when
the cluster is unavailable; update the logic around the lookup call (the
$storageClasses population loop) to guard the result by using default/get (or
assign the lookup result to a variable, then use default to an empty dict/list
and use get to read .items) so the loop over storage classes only runs when
items exist and the fallback that sets "replicated" as default still executes;
adjust references to the lookup result, .items, and the $storageClasses
population block accordingly and keep the final $storageClasses | toYaml |
nindent 6 unchanged.
In `@packages/system/kubevirt-csi-node/templates/deploy.yaml`:
- Line 267: The ConfigMap renders a bare key "infraClusterNamespace:" which
produces a null value; update the template in deploy.yaml so the key either
emits an explicit empty string or is omitted when unset—e.g. change the
ConfigMap data entry for infraClusterNamespace to render infraClusterNamespace:
"" or wrap it in a conditional (if .Values.infraClusterNamespace) to only
include the key when a non-empty value is provided; locate the
infraClusterNamespace entry in the ConfigMap block of
packages/system/kubevirt-csi-node/templates/deploy.yaml and apply one of these
fixes.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- packages/apps/kubernetes/templates/csi/driver-config.yaml
- packages/apps/kubernetes/templates/helmreleases/csi.yaml
- packages/apps/kubernetes/values.schema.json
- packages/apps/kubernetes/values.yaml
- packages/system/kubevirt-csi-node/templates/deploy.yaml
- packages/system/kubevirt-csi-node/values.yaml
💤 Files with no reviewable changes (2)
- packages/apps/kubernetes/values.schema.json
- packages/apps/kubernetes/values.yaml
```yaml
infraStorageClassEnforcement: |
  allowAll: false
  allowDefault: false
  allowList:
  {{- $storageClasses := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
  {{- if $storageClasses }}
  {{- range $storageClasses }}
  - {{ .metadata.name | quote }}
  {{- end }}
  {{- else }}
  - "replicated"
  {{- end }}
```
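For reference, with two hypothetical infra classes `local` and `replicated` discovered by the lookup, the block above renders the enforcement value roughly as (class names are illustrative):

```yaml
infraStorageClassEnforcement: |
  allowAll: false
  allowDefault: false
  allowList:
  - "local"
  - "replicated"
```

On a cluster where the lookup returns no StorageClasses, the allowList instead contains only `"replicated"`, per the else branch.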
🧩 Analysis chain
🏁 Script executed:
```shell
# First, let's find and examine the file
find . -path "*/packages/apps/kubernetes/templates/csi/driver-config.yaml" -type f
```

Repository: cozystack/cozystack
🏁 Script executed:
```shell
# Check the repository structure
ls -la packages/apps/ 2>/dev/null || echo "packages/apps/ not found"
```
🏁 Script executed:
```shell
# If the file exists, let's see its full content with line numbers
cat -n packages/apps/kubernetes/templates/csi/driver-config.yaml 2>/dev/null | head -30
```
🏁 Script executed:
```shell
# Check if charts/ directory exists with vendored charts
find packages/apps/kubernetes -name "Chart.yaml" -o -name "charts" -type d 2>/dev/null | head -10
```
🏁 Script executed:
```shell
# Check the Chart.yaml to understand ownership
cat -n packages/apps/kubernetes/Chart.yaml
```
🏁 Script executed:
```shell
# Check what's in the charts/ directory (vendored charts)
ls -la packages/apps/kubernetes/charts/
```
🏁 Script executed:
```shell
# Check if there are other driver-config.yaml files in vendored charts
find packages/apps/kubernetes/charts -name "driver-config.yaml" 2>/dev/null
```
🌐 Web query:
Helm lookup function nil return behavior .items access error
💡 Result:
In Helm, lookup can legitimately give you a value where .items isn’t safe to access, for two common reasons:
1. You asked for a single object, not a list. If you pass a non-empty `name`, `lookup` returns a single object (dict/map), not a list, so there is no `.items` field to range over. To get a list (with `.items`), pass an empty name `""`. The Helm docs explicitly show that only "list" lookups have an `.items` field. [1]
2. `lookup` returned an "empty value" (often seen as nil) because Helm didn't or couldn't talk to the cluster. Helm is not supposed to contact the API server during `helm template` or most `--dry-run` modes; in those cases `lookup` returns an empty value (often observed as nil/empty), and `(...).items` then triggers a nil-pointer/field-access error. Use `--dry-run=server` when you need live lookups, or ensure you're running against a real cluster connection. [1][2][4]
Safe patterns
Range safely over a list lookup:
```yaml
{{- $list := (lookup "v1" "ConfigMap" .Release.Namespace "" | default (dict "items" (list))) -}}
{{- range $cm := $list.items -}}
{{/* ... */}}
{{- end -}}
```

`default` treats nil as "empty", so it will substitute your fallback dict. [3]
Or guard with if:
```yaml
{{- $list := lookup "v1" "ConfigMap" .Release.Namespace "" -}}
{{- if $list -}}
{{- range $cm := $list.items -}} ... {{- end -}}
{{- end -}}
```
If you share your exact lookup call (especially whether name is "" or not) and the command/environment (Helm vs ArgoCD/Harness/CI), I can point to the specific cause in your case.
🌐 Web query:
Helm template default filter dict list functions syntax
💡 Result:
`default` (aka "fallback") syntax
- Direct call: `{{ default "foo" .Values.bar }}`
- Pipe form (same thing): `{{ .Values.bar | default "foo" }}`

`default` returns the second value unless it's "empty" (e.g., `0`, `""`, `[]`, `{}`, `false`, or `nil`). [1]

`dict` syntax
- Create a map from key/value pairs: `{{- $m := dict "name" "alice" "enabled" true -}}` [1]
- Empty map: `{{- $empty := dict -}}` (often written as `(dict)` when used as an argument) [1][2]

`list` syntax
- Create a list: `{{- $l := list 1 2 3 -}}` → `[1 2 3]` [1]
- When passing a list as a single argument, wrap it: `{{ someFunc (list "a" "b") }}` [1]

Common combined pattern (avoids nil errors):

```yaml
{{- if index (default (dict) .Values.outer.inner) "someKey" -}}
...
{{- end -}}
```

Meaning: if `.Values.outer.inner` is nil/empty, replace it with an empty dict so `index ... "someKey"` won't blow up. [2]

Sources: Helm template function list [1]; Stack Overflow explanation of `default (dict)` usage [2].
Protect .items access on lookup result to ensure fallback list renders.
Directly accessing .items on the lookup call fails when the API lookup returns nil (common in helm template without --dry-run=server), preventing the fallback to "replicated" from rendering.
🔧 Proposed fix
```diff
- {{- $storageClasses := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
+ {{- $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "" | default dict }}
+ {{- $storageClasses := (get $scList "items" | default list) }}
```

🤖 Prompt for AI Agents
+ {{- $storageClasses := (get $scList "items" | default list) }}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/apps/kubernetes/templates/csi/driver-config.yaml` around lines 6 -
17, The template directly uses (lookup ...).items which can be nil and prevents
the fallback allowList entry "replicated" from rendering; update the
infraStorageClassEnforcement block to first assign the lookup result to a
variable (e.g. $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "") and
then safely test $scList and $scList.items (or check len $scList.items) before
ranging into $storageClasses, so that when the lookup is nil the template falls
back to adding - "replicated" under allowList; adjust variable names shown
($storageClasses, $scList, allowList) accordingly.
Not applicable. These templates are rendered exclusively by FluxCD's helm-controller against a live cluster, never via standalone helm template. The (lookup ...).items pattern is used throughout this codebase:
- `packages/apps/kubernetes/templates/cluster.yaml:376` — `(lookup "cluster.x-k8s.io/v1beta1" "MachineSet" ...).items`
- `packages/apps/kubernetes/templates/cluster.yaml:378` — `(lookup "infrastructure.cluster.x-k8s.io/v1alpha1" "KubevirtMachineTemplate" ...).items`
- `packages/system/linstor-scheduler/` — 4 occurrences of `(lookup ...).items`
Adding default dict guards would be inconsistent with the rest of the project and unnecessary for the actual deployment path.
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Verify the claimed usages of (lookup ...).items pattern in the codebase
echo "=== Checking packages/apps/kubernetes/templates/cluster.yaml around lines 376-378 ==="
sed -n '370,385p' packages/apps/kubernetes/templates/cluster.yaml | grep -n "lookup"
echo -e "\n=== Checking packages/system/linstor-scheduler/ for lookup patterns ==="
rg -n '\.items\s*}}' packages/system/linstor-scheduler/ -A 1 -B 1
echo -e "\n=== Finding all occurrences of (lookup ...).items pattern across the repository ==="
rg -nP '\(lookup\s+[^)]+\)\.items' packages/
```
@mattia-eleuteri, thank you for the clarification! I've verified your claims and confirmed that the (lookup ...).items pattern is indeed used consistently throughout the codebase:
- ✅ `packages/apps/kubernetes/templates/cluster.yaml:376` — MachineSet lookup
- ✅ `packages/apps/kubernetes/templates/cluster.yaml:378` — KubevirtMachineTemplate lookup
- ✅ `packages/system/linstor-scheduler/` — 4 occurrences in `_helpers.tpl`
You're absolutely right that adding default dict guards would be inconsistent with the established patterns in this repository. Since FluxCD's helm-controller always renders against a live cluster, the concern about nil lookup results doesn't apply to your deployment model. I appreciate the context!
ʕ •ᴥ•ʔ 📚✨
✏️ Learnings added
Learnt from: mattia-eleuteri
Repo: cozystack/cozystack PR: 2095
File: packages/apps/kubernetes/templates/csi/driver-config.yaml:6-17
Timestamp: 2026-02-24T18:01:06.090Z
Learning: In the cozystack/cozystack repository, Helm templates are rendered exclusively by FluxCD's helm-controller against a live cluster, never via standalone `helm template`. Therefore, the `(lookup ...).items` pattern is safe and used consistently throughout the codebase (e.g., in packages/apps/kubernetes/templates/cluster.yaml and packages/system/linstor-scheduler/). Adding `default dict` guards to protect against nil lookup results is unnecessary and would be inconsistent with project conventions.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: lexfrei
Repo: cozystack/cozystack PR: 2010
File: packages/core/installer/values.yaml:10-26
Timestamp: 2026-02-10T10:44:34.963Z
Learning: In the cozystack/cozystack repository, the Makefile uses `helm template -s` to selectively render templates for different modes (talos, generic, hosted). Because Helm evaluates all templates regardless of `-s` filtering, using `required` checks in mode-specific templates (e.g., generic-mode templates) would break the build for other modes. Therefore, avoid suggesting `required` validation for mode-specific values in Helm templates in this repository.
Learnt from: kvaps
Repo: cozystack/cozystack PR: 1698
File: packages/core/flux-aio/templates/_helpers.tpl:3-3
Timestamp: 2025-12-10T11:40:12.476Z
Learning: Helpers taken from flux-operator (such as kubernetes_envs helpers) should be kept as-is and not modified, even if they appear to have unconventional patterns like using `dict` as a default for array fields. These have been battle-tested upstream.
```yaml
storageClasses:
{{- $storageClasses := dict }}
{{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
{{- $isDefault := false }}
{{- if $sc.metadata.annotations }}
{{- $isDefault = eq (index $sc.metadata.annotations "storageclass.kubernetes.io/is-default-class" | default "false") "true" }}
{{- end }}
{{- $_ := set $storageClasses $sc.metadata.name (dict "infraStorageClass" $sc.metadata.name "default" $isDefault) }}
{{- end }}
{{- if not $storageClasses }}
{{- $_ := set $storageClasses "replicated" (dict "infraStorageClass" "replicated" "default" true) }}
{{- end }}
{{- $storageClasses | toYaml | nindent 6 }}
```
🧩 Analysis chain
🏁 Script executed:
```shell
# Read the actual file to see full context
cat -n packages/apps/kubernetes/templates/helmreleases/csi.yaml | head -50
```
🏁 Script executed:
```shell
# Check if file exists and get basic info
ls -la packages/apps/kubernetes/templates/helmreleases/csi.yaml
wc -l packages/apps/kubernetes/templates/helmreleases/csi.yaml
```
🏁 Script executed:
```shell
# Search for other usages of lookup in the codebase to understand patterns
rg 'lookup\s*"' packages/apps/kubernetes/templates/ -A 2 -B 1
```
Guard lookup before accessing .items to preserve fallback behavior.
When lookup returns nil (e.g., during helm template or cluster unavailability), the direct access to (lookup ...).items will error before the replicated fallback (lines 39-41) can execute. Use default and get to safely access items.
🔧 Proposed fix
```diff
- {{- $storageClasses := dict }}
- {{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
+ {{- $storageClasses := dict }}
+ {{- $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "" | default dict }}
+ {{- range $sc := (get $scList "items" | default list) }}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml` around lines 30 -
42, The template directly accesses (lookup "storage.k8s.io/v1" "StorageClass" ""
"").items which will null-deref during helm template or when the cluster is
unavailable; update the logic around the lookup call (the $storageClasses
population loop) to guard the result by using default/get (or assign the lookup
result to a variable, then use default to an empty dict/list and use get to read
.items) so the loop over storage classes only runs when items exist and the
fallback that sets "replicated" as default still executes; adjust references to
the lookup result, .items, and the $storageClasses population block accordingly
and keep the final $storageClasses | toYaml | nindent 6 unchanged.
Same as above — not applicable. These templates are only rendered by FluxCD against a live cluster, not via helm template. The (lookup ...).items pattern is the established convention across the codebase. Adding nil guards here would be inconsistent with the rest of the project.
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
# Search for lookup patterns that directly access .items to verify the established convention
rg -n '\.items' packages/ -A 1 -B 2 | grep -E '(lookup|\.items)' | head -40
```
🏁 Script executed:
```shell
#!/bin/bash
# Search specifically for the pattern of lookup followed by .items access
rg -nP 'lookup\s+[^)]+\)\.items' packages/ -C 2
```
@mattia-eleuteri, you're absolutely right—thank you for the clarification! I can see from the search results that the (lookup ...).items pattern is consistently used throughout the codebase (in linstor-scheduler templates, cluster.yaml, driver-config.yaml, etc.), and since these templates are exclusively rendered by FluxCD against live clusters, the nil-guard concern doesn't apply. I appreciate you pointing out the project's conventions.
✏️ Learnings added
Learnt from: mattia-eleuteri
Repo: cozystack/cozystack PR: 2095
File: packages/apps/kubernetes/templates/helmreleases/csi.yaml:30-42
Timestamp: 2026-02-24T18:01:02.525Z
Learning: In the cozystack/cozystack repository, Helm templates under `packages/apps/kubernetes/templates/` are only rendered by FluxCD against live clusters (not via `helm template`). The established convention is to directly access `(lookup "..." "" "" "").items` without nil guards. This pattern is used consistently across multiple templates (e.g., linstor-scheduler/_helpers.tpl, cluster.yaml, driver-config.yaml, helmreleases/csi.yaml), so adding defensive checks would be inconsistent with the codebase style.
🧠 Learnings used
Learnt from: sircthulhu
Repo: cozystack/cozystack PR: 2068
File: packages/system/cozystack-basics/templates/tenant-root.yaml:7-11
Timestamp: 2026-02-18T15:45:24.589Z
Learning: In the cozystack/cozystack repository, for the tenant-root HelmRelease in `packages/system/cozystack-basics/templates/tenant-root.yaml`, users are only supposed to modify the `values` field, not `valuesFrom`. Therefore, preserving only `spec.values` during reconciliation is sufficient and intentional.
Learnt from: mattia-eleuteri
Repo: cozystack/cozystack PR: 1988
File: packages/extra/external-dns/templates/external-dns.yaml:22-31
Timestamp: 2026-02-12T11:03:15.183Z
Learning: In the cozystack/cozystack repository, `valuesFrom` is not correctly supported with Cozystack's HelmRelease implementation. Inline credential values in HelmRelease specs are the expected pattern, and suggestions to move credentials to `valuesFrom` with Secret references should be avoided.
Learnt from: lexfrei
Repo: cozystack/cozystack PR: 2010
File: packages/core/installer/values.yaml:10-26
Timestamp: 2026-02-10T10:44:34.963Z
Learning: In the cozystack/cozystack repository, the Makefile uses `helm template -s` to selectively render templates for different modes (talos, generic, hosted). Because Helm evaluates all templates regardless of `-s` filtering, using `required` checks in mode-specific templates (e.g., generic-mode templates) would break the build for other modes. Therefore, avoid suggesting `required` validation for mode-specific values in Helm templates in this repository.
Learnt from: kvaps
Repo: cozystack/cozystack PR: 1698
File: packages/core/flux-aio/templates/_helpers.tpl:3-3
Timestamp: 2025-12-10T11:40:12.476Z
Learning: Helpers taken from flux-operator (such as kubernetes_envs helpers) should be kept as-is and not modified, even if they appear to have unconventional patterns like using `dict` as a default for array fields. These have been battle-tested upstream.
Learnt from: lexfrei
Repo: cozystack/cozystack PR: 2058
File: packages/apps/harbor/templates/harbor.yaml:53-67
Timestamp: 2026-02-16T18:39:00.608Z
Learning: In the cozystack/cozystack repository, standard Flux HelmRelease v2 resources correctly support `valuesFrom` with `targetPath` for injecting Secret values. This is processed by Flux's helm-controller. The retry mechanism (`retries: -1`) handles cases where secrets don't exist yet on first install (e.g., COSI-provisioned bucket secrets).
2f3d3cf to 78891d1 (Compare)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
hack/e2e-apps/run-kubernetes.sh (1)
271-285: ⚠️ Potential issue | 🟡 Minor

Avoid hard-coding `replicated` in the RWX PVC.

Auto-propagation doesn't guarantee a `replicated` class; environments with different defaults will fail this test. Prefer the default propagated class (or first available) and fall back to `replicated`.

🔧 Proposed fix

```diff
@@
-  # Test RWX NFS mount in tenant cluster (uses kubevirt CSI driver with RWX support)
+  # Test RWX NFS mount in tenant cluster (uses kubevirt CSI driver with RWX support)
+  default_sc=$(kubectl --kubeconfig "tenantkubeconfig-${test_name}" get sc \
+    -o jsonpath='{range .items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")]}{.metadata.name}{"\n"}{end}' | head -n1)
+  if [ -z "$default_sc" ]; then
+    default_sc=$(echo "$sc_names" | awk '{print $1}')
+  fi
+  default_sc=${default_sc:-replicated}
@@
-      storageClassName: replicated
+      storageClassName: ${default_sc}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hack/e2e-apps/run-kubernetes.sh` around lines 271 - 285, The PVC creation hard-codes storageClassName: replicated which can fail if that class doesn't exist; instead, before creating nfs-test-pvc (namespace tenant-test, name nfs-test-pvc), detect a valid storage class by querying Kubernetes for the default propagated class (storageclass.kubernetes.io/is-default-class=true) or the first available storageclass and set a variable (e.g. STORAGE_CLASS) and then use that variable in the PVC manifest; if no storageclass is found, fall back to "replicated". Update the script around the PVC creation to compute STORAGE_CLASS using kubectl (using the test_name kubeconfig) and substitute that variable into the here-doc storageClassName field.
♻️ Duplicate comments (3)
packages/system/kubevirt-csi-node/templates/deploy.yaml (1)
267-267: ⚠️ Potential issue | 🟠 Major

ConfigMap data value renders null; use an explicit empty string.

`infraClusterNamespace:` with no value becomes `null`, which is invalid for ConfigMap data (string values are expected).

🔧 Proposed fix

```diff
- infraClusterNamespace:
+ infraClusterNamespace: ""
```

Verification (ensure the rendered manifest has no null value):

```shell
#!/bin/bash
set -euo pipefail
helm template kubevirt-csi-node packages/system/kubevirt-csi-node > /tmp/kubevirt-csi-node.yaml
if rg -n 'infraClusterNamespace:\s*$' /tmp/kubevirt-csi-node.yaml; then
  echo "infraClusterNamespace rendered as null"
  exit 1
fi
```
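As an alternative to the explicit empty string, the review earlier also suggested emitting the key only when a value is set. A sketch of that variant (assuming the chart exposes `.Values.infraClusterNamespace`; the quoting choice is mine):

```yaml
data:
  {{- with .Values.infraClusterNamespace }}
  infraClusterNamespace: {{ . | quote }}
  {{- end }}
```

`with` skips its block when the value is empty or nil, so the key is only rendered with a non-empty string and never as null.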
Verify each finding against the current code and only fix it if needed. In `@packages/system/kubevirt-csi-node/templates/deploy.yaml` at line 267: the ConfigMap key `infraClusterNamespace` renders as null because it has no explicit value; update the template so the ConfigMap data emits an explicit empty string (e.g., render the value from `.Values.infraClusterNamespace` with a default of `""`, or otherwise ensure it writes `""`), so the rendered manifest contains `infraClusterNamespace: ""` rather than a bare `infraClusterNamespace:` and thus avoids a null value in the ConfigMap data.

packages/apps/kubernetes/templates/helmreleases/csi.yaml (1)
30-42: ⚠️ Potential issue | 🟠 Major

Guard `lookup` before accessing `.items` to keep the fallback working.

When `lookup` returns nil (e.g., `helm template`/dry-run), the `.items` deref errors and the `replicated` fallback never renders.

🔧 Proposed fix

```diff
  storageClasses:
  {{- $storageClasses := dict }}
- {{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
+ {{- $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "" | default dict }}
+ {{- range $sc := (get $scList "items" | default list) }}
```

Verification (render without a live cluster and confirm no template error):

```shell
#!/bin/bash
set -euo pipefail
helm template test packages/apps/kubernetes --dry-run >/tmp/render.yaml
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml` around lines 30-42: the template dereferences `.items` on the result of `lookup`, which can be nil during dry-run; assign the lookup result to a variable (e.g. `$scList := lookup "storage.k8s.io/v1" "StorageClass" "" ""`) and guard access with a conditional (e.g. `if $scList` and/or `if $scList.items`) before iterating `range $sc := $scList.items`, so that when lookup returns nil the code falls through to the existing `replicated` fallback; keep the rest of the logic using `$storageClasses`, `$isDefault`, and `set` unchanged.

packages/apps/kubernetes/templates/csi/driver-config.yaml (1)
6-17: ⚠️ Potential issue | 🟠 Major
Protect `.items` access on the `lookup` result.
`lookup` can be nil in `helm template`/dry-run, causing a template error and skipping the fallback allowList.
🔧 Proposed fix
```diff
-  {{- $storageClasses := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
+  {{- $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "" | default dict }}
+  {{- $storageClasses := (get $scList "items" | default list) }}
   {{- if $storageClasses }}
```
Verification (render without a live cluster and confirm no template error):
```bash
#!/bin/bash
set -euo pipefail
helm template test packages/apps/kubernetes --dry-run >/tmp/render.yaml
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/apps/kubernetes/templates/csi/driver-config.yaml` around lines 6 - 17, The template accesses .items on the lookup result without guarding for a nil return, which fails during helm template/dry-run; update infraStorageClassEnforcement to first capture the lookup result (the value bound to $storageClasses or similar) and check that it is non-nil (e.g., use an if/and or separate if) before iterating over or referencing .items so the else fallback ("replicated") is used when lookup returns nil; ensure you reference the same variable name ($storageClasses) and protect all uses of .items from a nil lookup.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@hack/e2e-apps/run-kubernetes.sh`:
- Around line 271-285: The PVC creation hard-codes storageClassName: replicated
which can fail if that class doesn't exist; instead, before creating
nfs-test-pvc (namespace tenant-test, name nfs-test-pvc), detect a valid storage
class by querying Kubernetes for the default propagated class
(storageclass.kubernetes.io/is-default-class=true) or the first available
storageclass and set a variable (e.g. STORAGE_CLASS) and then use that variable
in the PVC manifest; if no storageclass is found, fall back to "replicated".
Update the script around the PVC creation to compute STORAGE_CLASS using kubectl
(using the test_name kubeconfig) and substitute that variable into the here-doc
storageClassName field.
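The detect-then-fall-back order the prompt describes can be sketched offline like this (a hedged sketch, not the script's actual code: `kubectl` is stubbed so the fallback path can be exercised without a cluster, and the variable names are illustrative):

```shell
kubectl() { echo ""; }   # stub: simulates a cluster where no StorageClass is found

# 1) Prefer the annotated default StorageClass
STORAGE_CLASS=$(kubectl get sc \
  -o jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}' 2>/dev/null | head -n1)
# 2) Otherwise take the first available StorageClass
[ -n "$STORAGE_CLASS" ] || STORAGE_CLASS=$(kubectl get sc -o jsonpath='{.items[0].metadata.name}' 2>/dev/null)
# 3) Last resort: the historical hard-coded name
[ -n "$STORAGE_CLASS" ] || STORAGE_CLASS="replicated"
echo "$STORAGE_CLASS"
```

With the stub in place, both lookups come back empty and the script falls through to `replicated`; in the real script, `$STORAGE_CLASS` would then be substituted into the here-doc's `storageClassName` field.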
---
Duplicate comments:
In `@packages/apps/kubernetes/templates/csi/driver-config.yaml`:
- Around line 6-17: The template accesses .items on the lookup result without
guarding for a nil return, which fails during helm template/dry-run; update
infraStorageClassEnforcement to first capture the lookup result (the value bound
to $storageClasses or similar) and check that it is non-nil (e.g., use an if/and
or separate if) before iterating over or referencing .items so the else fallback
("replicated") is used when lookup returns nil; ensure you reference the same
variable name ($storageClasses) and protect all uses of .items from a nil
lookup.
In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml`:
- Around line 30-42: The template dereferences `.items` on the result of lookup
which can be nil during dry-run; update the block using lookup by assigning its
result to a variable (e.g., $scList := lookup "storage.k8s.io/v1" "StorageClass"
"" "") and guard access with a conditional (e.g., `if $scList` and/or `if
$scList.items`) before iterating `range $sc := $scList.items`, so that when
lookup returns nil the code falls through to the existing `replicated` fallback;
keep the rest of the logic using $storageClasses, $isDefault and set unchanged.
In `@packages/system/kubevirt-csi-node/templates/deploy.yaml`:
- Line 267: The ConfigMap key infraClusterNamespace in the deploy.yaml template
is rendering as null because it has no explicit value; update the template so
the ConfigMap data emits an explicit empty string for infraClusterNamespace
(e.g., render the value from .Values.infraClusterNamespace with a default of ""
or otherwise ensure it writes ""), ensuring the rendered manifest contains
infraClusterNamespace: "" rather than infraClusterNamespace: and thus avoids a
null value in the ConfigMap data.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- hack/e2e-apps/run-kubernetes.sh
- packages/apps/kubernetes/README.md
- packages/apps/kubernetes/templates/csi/driver-config.yaml
- packages/apps/kubernetes/templates/helmreleases/csi.yaml
- packages/apps/kubernetes/values.schema.json
- packages/apps/kubernetes/values.yaml
- packages/system/kubernetes-rd/cozyrds/kubernetes.yaml
- packages/system/kubevirt-csi-node/templates/deploy.yaml
- packages/system/kubevirt-csi-node/values.yaml
💤 Files with no reviewable changes (3)
- packages/apps/kubernetes/README.md
- packages/apps/kubernetes/values.schema.json
- packages/apps/kubernetes/values.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/system/kubevirt-csi-node/values.yaml
Force-pushed from b955a35 to 9883e5c (Compare)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@hack/e2e-apps/run-kubernetes.sh`:
- Around line 157-163: The script currently verifies there is exactly one
default StorageClass using the jsonpath kubectl call (setting default_count) but
elsewhere hardcodes the name "replicated"; change the code to capture the
default StorageClass name from the same jsonpath query (e.g., into a variable
like default_sc_name) after asserting default_count == 1, and replace any
hardcoded "replicated" usages with that default_sc_name (apply the same fix for
the later block around the other StorageClass check). Ensure you keep the
existing exact-one-default assertion (default_count) and use the discovered
default_sc_name for subsequent operations.
In `@packages/core/platform/images/migrations/migrations/33`:
- Around line 59-65: The kubectl jsonpath used to read the secret value (the
command that sets VALUES_YAML using VALUES_KEY) fails when VALUES_KEY contains
dots because jsonpath treats dots as separators; update the two kubectl jsonpath
expressions that reference "{.data.${VALUES_KEY}}" and
"{.stringData.${VALUES_KEY}}" to use bracket notation that escapes the key
(e.g., {.data[\"<key>\"]} and {.stringData[\"<key>\"]}) so keys with dots are
read correctly; locate the lines that build VALUES_YAML (the kubectl -o jsonpath
calls) and replace their jsonpath patterns to use the quoted bracket form
consistent with the jq patching later.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- hack/e2e-apps/run-kubernetes.sh
- packages/apps/kubernetes/README.md
- packages/apps/kubernetes/templates/csi/driver-config.yaml
- packages/apps/kubernetes/templates/helmreleases/csi.yaml
- packages/apps/kubernetes/values.schema.json
- packages/apps/kubernetes/values.yaml
- packages/core/platform/images/migrations/migrations/33
- packages/core/platform/values.yaml
- packages/system/kubernetes-rd/cozyrds/kubernetes.yaml
- packages/system/kubevirt-csi-node/templates/deploy.yaml
- packages/system/kubevirt-csi-node/values.yaml
💤 Files with no reviewable changes (3)
- packages/apps/kubernetes/values.schema.json
- packages/apps/kubernetes/README.md
- packages/apps/kubernetes/values.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/system/kubernetes-rd/cozyrds/kubernetes.yaml
```bash
# Verify exactly one default StorageClass is set
default_count=$(kubectl --kubeconfig "tenantkubeconfig-${test_name}" get sc \
  -o jsonpath='{range .items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")]}{.metadata.name}{"\n"}{end}' | grep -c .)
if [ "$default_count" -ne 1 ]; then
  echo "Expected exactly 1 default StorageClass, found $default_count" >&2
  exit 1
fi
```
Avoid hardcoding replicated; use the discovered default StorageClass.
With dynamic propagation, the default StorageClass name may differ, so hardcoding replicated makes the test brittle. Use the annotated default class name (and keep the “exactly one default” assertion if that’s still desired).
🛠️ Proposed fix
```diff
   # Verify exactly one default StorageClass is set
-  default_count=$(kubectl --kubeconfig "tenantkubeconfig-${test_name}" get sc \
-    -o jsonpath='{range .items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")]}{.metadata.name}{"\n"}{end}' | grep -c .)
+  default_scs=$(kubectl --kubeconfig "tenantkubeconfig-${test_name}" get sc \
+    -o jsonpath='{range .items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")]}{.metadata.name}{"\n"}{end}')
+  default_count=$(echo "$default_scs" | grep -c .)
   if [ "$default_count" -ne 1 ]; then
     echo "Expected exactly 1 default StorageClass, found $default_count" >&2
     exit 1
   fi
   echo "Default StorageClass is correctly set"
+  default_sc=$(echo "$default_scs" | head -n1)
```
And in the PVC manifest:
```diff
-    storageClassName: replicated
+    storageClassName: ${default_sc}
```
Also applies to: 271-281
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@hack/e2e-apps/run-kubernetes.sh` around lines 157 - 163, The script currently
verifies there is exactly one default StorageClass using the jsonpath kubectl
call (setting default_count) but elsewhere hardcodes the name "replicated";
change the code to capture the default StorageClass name from the same jsonpath
query (e.g., into a variable like default_sc_name) after asserting default_count
== 1, and replace any hardcoded "replicated" usages with that default_sc_name
(apply the same fix for the later block around the other StorageClass check).
Ensure you keep the existing exact-one-default assertion (default_count) and use
the discovered default_sc_name for subsequent operations.
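The "exactly one default" assertion above hinges on `grep -c .`, which counts non-empty lines, so a cluster that (incorrectly) annotates two default classes is caught. A minimal offline illustration with hypothetical class names:

```shell
# Simulated jsonpath output: one class name per line (names are illustrative)
default_scs=$(printf 'replicated\nlocal-path\n')

# grep -c . counts non-empty lines -> here, 2 "default" classes
default_count=$(echo "$default_scs" | grep -c .)
echo "$default_count"
```

With two names in the input the count is 2, so the `-ne 1` check would fail the test run, which is exactly the misconfiguration the assertion exists to surface.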
```bash
# --- Decode current values ---
VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
  -o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
if [ -z "$VALUES_YAML" ]; then
  VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
    -o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
fi
```
🧩 Analysis chain
🏁 Script executed:
```bash
# First, find the migration file
find . -path "*/packages/core/platform/images/migrations/migrations/33" -type f
```
🏁 Script executed:
```bash
# Also search more broadly for the migrations directory
fd -t f "33" packages/core/platform/images/migrations/ 2>/dev/null | head -20
```
🏁 Script executed:
```bash
# List what's in the migrations directory
ls -la packages/core/platform/images/migrations/ 2>/dev/null || echo "Path not found"
```
🏁 Script executed:
```bash
# Read the migration script file
cat -n packages/core/platform/images/migrations/migrations/33
```
🏁 Script executed:
```bash
# Also check run-migrations.sh to understand the context
cat -n packages/core/platform/images/migrations/run-migrations.sh
```
Fix kubectl jsonpath bracket notation for keys with dots.
`kubectl -o jsonpath="{.data.${VALUES_KEY}}"` fails when the key contains dots (the default key is `values.yaml`), because jsonpath treats dots as path separators. This causes VALUES_YAML to remain empty and the migration to skip valid instances. The `.stringData` fallback won't help either, since live Secrets only expose their contents under `.data`. The script already uses jq with bracket notation correctly at lines 89-91 for patching; apply the same pattern here for reading.
🛠️ Proposed fix
```diff
-VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-  -o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
-if [ -z "$VALUES_YAML" ]; then
-  VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-    -o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
-fi
+SECRET_JSON=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" -o json 2>/dev/null)
+VALUES_B64=$(echo "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.data[$key] // empty')
+VALUES_YAML=""
+if [ -n "$VALUES_B64" ]; then
+  VALUES_YAML=$(echo "$VALUES_B64" | base64 -d 2>/dev/null || true)
+fi
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/core/platform/images/migrations/migrations/33` around lines 59 - 65,
The kubectl jsonpath used to read the secret value (the command that sets
VALUES_YAML using VALUES_KEY) fails when VALUES_KEY contains dots because
jsonpath treats dots as separators; update the two kubectl jsonpath expressions
that reference "{.data.${VALUES_KEY}}" and "{.stringData.${VALUES_KEY}}" to use
bracket notation that escapes the key (e.g., {.data[\"<key>\"]} and
{.stringData[\"<key>\"]}) so keys with dots are read correctly; locate the lines
that build VALUES_YAML (the kubectl -o jsonpath calls) and replace their
jsonpath patterns to use the quoted bracket form consistent with the jq patching
later.
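The core of the proposed fix — indexing a JSON object with `jq --arg` so dotted keys survive — can be demonstrated offline. This is a sketch assuming `jq` and `base64` are available; the Secret-shaped JSON document and its contents are fabricated for illustration:

```shell
# A Secret-shaped JSON document; "Zm9vOiBiYXI=" is base64 for "foo: bar"
SECRET_JSON='{"data":{"values.yaml":"Zm9vOiBiYXI="}}'
VALUES_KEY="values.yaml"

# Bracket indexing via --arg treats the dotted key as one literal key
VALUES_B64=$(echo "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.data[$key] // empty')
VALUES_YAML=$(echo "$VALUES_B64" | base64 -d)
echo "$VALUES_YAML"
```

A plain jq path like `.data.values.yaml` would instead look for `.data.values` and then `.yaml` beneath it and come back empty, which is the same failure mode the jsonpath version hits.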
kvaps left a comment
There was a problem hiding this comment.
Thank you for this great contribution! Auto-propagating infrastructure StorageClasses to tenant clusters is a really nice improvement.
However, I see a breaking change that needs to be addressed before merging. Currently, the kubevirt-csi-node chart creates a single StorageClass named kubevirt inside tenant clusters. After this PR, it will create StorageClasses named after infrastructure classes (e.g., replicated), and the kubevirt StorageClass will disappear since it's no longer generated by the template.
This will break existing workloads in tenant clusters:
- StatefulSets that reference `storageClassName: kubevirt` in their `volumeClaimTemplates` won't be able to scale up or create new replicas
- Any manifests with hardcoded `storageClassName: kubevirt` will fail to provision new volumes
- Already bound PVCs will continue to work, but the dangling reference is still undesirable
The migration script (33) only removes the storageClass key from HelmRelease values Secrets on the management cluster side. We need an additional migration step that updates the storageClassName for existing PVCs inside each tenant Kubernetes cluster.
I'd love to include this in the 1.0 release since it changes existing behavior, but we've already cut the release candidate and there isn't enough time to properly test the migration path for tenant-side PVCs. Let's target this for 1.1 instead, so we can give it the testing it deserves.
Thanks for the review and the thoughtful feedback! You're absolutely right that this is a breaking change — the `kubevirt` StorageClass disappearing from tenant clusters would break existing workloads.
I looked into the tenant-side PVC migration you mentioned, and there's a blocker: `PVC.spec.storageClassName` is immutable in Kubernetes (as are StatefulSet `volumeClaimTemplates`). So we can't patch existing PVCs to reference the new StorageClass names.
The approach I had in mind to address this is a backwards-compatibility shim in the Helm template: after auto-discovering infra StorageClasses, the template would always inject a `kubevirt` entry (unless an infra SC already has that name) that maps to the default infra StorageClass. This way:
- Existing workloads with `storageClassName: kubevirt` keep working
- New workloads can use the real infra SC names
- The `kubevirt` SC is non-default, so new PVCs naturally use the propagated names
That said, I completely agree this deserves proper testing before merging. Happy to target 1.1 as you suggested — that gives us time to validate the full migration path (compat shim + migration 33 for management-side cleanup) thoroughly. Let me know if the compat shim approach sounds reasonable to you, and I'll keep the PR updated for when 1.1 development starts.
It is possible to recreate a PVC with the correct StorageClass even with no downtime. An example project using this approach: https://github.com/piraeusdatastore/linstor-affinity-controller
The only problem is that this kind of migration is quite difficult and should be properly tested on a live environment.
Best Regards,
Andrei Kvapil
Force-pushed from 4865cff to 7e19133 (Compare)
Hey @kvaps, we could organize testing on our lab infra, BTW.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml`:
- Around line 31-41: The template currently builds $storageClasses from
discovered StorageClass names but doesn't add a compatibility alias for
"kubevirt", which can break provisioning when existing
volumeClaimTemplates.spec.storageClassName is set to kubevirt; update the
generation logic around $storageClasses (where lookup "storage.k8s.io/v1"
"StorageClass" is iterated and $_ := set $storageClasses is called) to ensure
you always set a "kubevirt" key mapping to an appropriate infraStorageClass
(e.g., point "kubevirt" to an existing infraStorageClass like "replicated" or
the discovered default) so lookups for kubevirt succeed during
scale-out/recreate flows.
In `@packages/core/platform/images/migrations/migrations/35`:
- Around line 42-45: The jsonpath used later to read secret data fails for keys
that contain dots; update any kubectl jsonpath expressions that reference secret
data using the VALUES_KEY variable (looks for patterns like
{.data.${VALUES_KEY}}) to use bracket notation instead (e.g.
{.data['${VALUES_KEY}']}) so dotted keys (for example "values.yaml" or other
dotted secret keys) are accessed correctly; change the jsonpath usages that
consume VALUES_KEY and ensure the VALUES_SECRET/VALUES_KEY variable names remain
unchanged.
- Around line 60-64: The JSONPath reads using kubectl with
"{.data.${VALUES_KEY}}" are unsafe for dotted secret keys (VALUES_KEY like
"values.yaml"); replace those jsonpath reads with a safe JSON read via kubectl
-o json piped into jq using an index lookup with --arg: call kubectl -n
"$NAMESPACE" get secret "$VALUES_SECRET" -o json | jq -r --arg key "$VALUES_KEY"
'.data[$key] // .stringData[$key]' and then base64-decode the result into
VALUES_YAML (same approach used for writes), updating the code paths that set
VALUES_YAML (the kubectl/jsonpath lines that reference VALUES_KEY,
VALUES_SECRET, NAMESPACE) so dotted keys are handled correctly; apply the same
fix to migrations 29 and 30.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- hack/e2e-apps/run-kubernetes.sh
- packages/apps/kubernetes/README.md
- packages/apps/kubernetes/templates/csi/driver-config.yaml
- packages/apps/kubernetes/templates/helmreleases/csi.yaml
- packages/apps/kubernetes/values.schema.json
- packages/apps/kubernetes/values.yaml
- packages/core/platform/images/migrations/migrations/35
- packages/core/platform/values.yaml
- packages/system/kubernetes-rd/cozyrds/kubernetes.yaml
- packages/system/kubevirt-csi-node/templates/deploy.yaml
- packages/system/kubevirt-csi-node/values.yaml
💤 Files with no reviewable changes (3)
- packages/apps/kubernetes/README.md
- packages/apps/kubernetes/values.yaml
- packages/apps/kubernetes/values.schema.json
🚧 Files skipped from review as they are similar to previous changes (3)
- packages/apps/kubernetes/templates/csi/driver-config.yaml
- packages/system/kubevirt-csi-node/values.yaml
- hack/e2e-apps/run-kubernetes.sh
```yaml
{{- $storageClasses := dict }}
{{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
{{- $isDefault := false }}
{{- if $sc.metadata.annotations }}
{{- $isDefault = eq (index $sc.metadata.annotations "storageclass.kubernetes.io/is-default-class" | default "false") "true" }}
{{- end }}
{{- $_ := set $storageClasses $sc.metadata.name (dict "infraStorageClass" $sc.metadata.name "default" $isDefault) }}
{{- end }}
{{- if not $storageClasses }}
{{- $_ := set $storageClasses "replicated" (dict "infraStorageClass" "replicated" "default" true) }}
{{- end }}
```
Add a kubevirt compatibility alias to avoid post-upgrade PVC creation failures.
Line 31–Line 41 only propagates discovered infra names. Existing immutable volumeClaimTemplates.spec.storageClassName: kubevirt can still exist; once kubevirt disappears, later scale-out/recreate flows can fail to provision PVCs.
🔧 Proposed compatibility shim
- {{- $storageClasses := dict }}
+ {{- $storageClasses := dict }}
+ {{- $defaultInfra := "" }}
{{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
{{- $isDefault := false }}
{{- if $sc.metadata.annotations }}
{{- $isDefault = eq (index $sc.metadata.annotations "storageclass.kubernetes.io/is-default-class" | default "false") "true" }}
{{- end }}
+ {{- if and $isDefault (eq $defaultInfra "") }}
+ {{- $defaultInfra = $sc.metadata.name }}
+ {{- end }}
{{- $_ := set $storageClasses $sc.metadata.name (dict "infraStorageClass" $sc.metadata.name "default" $isDefault) }}
{{- end }}
{{- if not $storageClasses }}
{{- $_ := set $storageClasses "replicated" (dict "infraStorageClass" "replicated" "default" true) }}
{{- end }}
+ {{- if not (hasKey $storageClasses "kubevirt") }}
+ {{- $_ := set $storageClasses "kubevirt" (dict "infraStorageClass" ($defaultInfra | default "replicated") "default" false) }}
+ {{- end }}
  {{- $storageClasses | toYaml | nindent 6 }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml` around lines 31 -
41, The template currently builds $storageClasses from discovered StorageClass
names but doesn't add a compatibility alias for "kubevirt", which can break
provisioning when existing volumeClaimTemplates.spec.storageClassName is set to
kubevirt; update the generation logic around $storageClasses (where lookup
"storage.k8s.io/v1" "StorageClass" is iterated and $_ := set $storageClasses is
called) to ensure you always set a "kubevirt" key mapping to an appropriate
infraStorageClass (e.g., point "kubevirt" to an existing infraStorageClass like
"replicated" or the discovered default) so lookups for kubevirt succeed during
scale-out/recreate flows.
```bash
VALUES_SECRET=$(kubectl -n "$NAMESPACE" get hr "$HR_NAME" -o json | \
  jq -r '.spec.valuesFrom // [] | map(select(.kind == "Secret" and (.name | test("cozystack-values") | not))) | .[0].name // ""')
VALUES_KEY=$(kubectl -n "$NAMESPACE" get hr "$HR_NAME" -o json | \
  jq -r '.spec.valuesFrom // [] | map(select(.kind == "Secret" and (.name | test("cozystack-values") | not))) | .[0].valuesKey // "values.yaml"')
```
🧩 Analysis chain
🏁 Script executed:
```bash
# Locate and examine the migration script
find . -path "*/packages/core/platform/images/migrations/migrations/35" -type f
```
🏁 Script executed:
```bash
# Check if the file exists and examine its content
if [ -f "packages/core/platform/images/migrations/migrations/35" ]; then
  wc -l "packages/core/platform/images/migrations/migrations/35"
  echo "---"
  cat -n "packages/core/platform/images/migrations/migrations/35"
else
  echo "File not found at expected path"
  # Try to find it with a broader search
  find . -name "35" -type f 2>/dev/null | head -20
fi
```
🏁 Script executed:
```bash
# Search for any references to valuesFrom with cozystack-values in YAML templates
rg "valuesFrom" --type yaml -B 3 -A 3 | head -100
```
🏁 Script executed:
```bash
# Check for patterns of secret naming in HelmRelease specs
rg "cozystack-values" --type yaml -B 2 -A 2
```
Fix jsonpath bracket notation for dotted secret keys.
The selector logic (lines 43, 45) is correct—it intentionally excludes cozystack-values to find custom values secrets. However, lines 61 and 64 fail to read from the default key values.yaml because the jsonpath syntax breaks for dotted keys. Use bracket notation {.data['${VALUES_KEY}']} instead of {.data.${VALUES_KEY}} to correctly access keys with dots. This prevents the migration from silently skipping all Kubernetes instances.
🔧 Fix: Use bracket notation for dotted keys
# --- Decode current values ---
VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
- -o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
+ -o jsonpath="{.data['${VALUES_KEY}']}" 2>/dev/null | base64 -d 2>/dev/null || true)
if [ -z "$VALUES_YAML" ]; then
VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
- -o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
+ -o jsonpath="{.stringData['${VALUES_KEY}']}" 2>/dev/null || true)
fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/core/platform/images/migrations/migrations/35` around lines 42 - 45,
The jsonpath used later to read secret data fails for keys that contain dots;
update any kubectl jsonpath expressions that reference secret data using the
VALUES_KEY variable (looks for patterns like {.data.${VALUES_KEY}}) to use
bracket notation instead (e.g. {.data['${VALUES_KEY}']}) so dotted keys (for
example "values.yaml" or other dotted secret keys) are accessed correctly;
change the jsonpath usages that consume VALUES_KEY and ensure the
VALUES_SECRET/VALUES_KEY variable names remain unchanged.
```bash
VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
  -o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
if [ -z "$VALUES_YAML" ]; then
  VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
    -o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
fi
```
🧩 Analysis chain
🏁 Script executed:
# First, let's check if the file exists and examine the actual code
find . -type f -path "*packages/core/platform/images/migrations/migrations/35*"Repository: cozystack/cozystack
Length of output: 119
🏁 Script executed:
# Also search for migration files in that general area
fd -t f "35" packages/core/platform/images/migrations/ 2>/dev/null | head -20Repository: cozystack/cozystack
Length of output: 117
🏁 Script executed:
# Broader search for migration-related files
find packages/core -type f -name "35*" 2>/dev/nullRepository: cozystack/cozystack
Length of output: 117
🏁 Script executed:
# Check the structure of the migrations directory
ls -la packages/core/platform/images/migrations/ 2>/dev/null || echo "Directory not found at expected path"Repository: cozystack/cozystack
Length of output: 360
🏁 Script executed:
# Look for any migration files with the specific code pattern
rg -l "VALUES_YAML.*\-o jsonpath" packages/core/ 2>/dev/nullRepository: cozystack/cozystack
Length of output: 117
🏁 Script executed:
# Read the migration file 35
cat -n packages/core/platform/images/migrations/migrations/35Repository: cozystack/cozystack
Length of output: 10579
🏁 Script executed:
# Also check what VALUES_KEY is set to (if at all)
grep -n "VALUES_KEY" packages/core/platform/images/migrations/migrations/35Repository: cozystack/cozystack
Length of output: 438
🏁 Script executed:
```bash
# Test JSONPath with dotted keys - verify the issue exists
# Using kubectl dry-run to avoid a cluster write
echo "=== Test 1: Unescaped dotted key (incorrect) ==="
kubectl create secret generic test-secret --from-literal='values.yaml=testdata123' --dry-run=client -o json | \
  jq -r '.data | keys'
echo ""
echo "=== Test 2: JSONPath with unescaped dot ==="
kubectl create secret generic test-secret --from-literal='values.yaml=testdata123' --dry-run=client \
  -o jsonpath='{.data.values.yaml}' 2>/dev/null || echo "(returns empty or error)"
echo ""
echo "=== Test 3: JSONPath with escaped dot ==="
kubectl create secret generic test-secret --from-literal='values.yaml=testdata123' --dry-run=client \
  -o jsonpath='{.data.values\.yaml}' 2>/dev/null
echo ""
echo "=== Test 4: Safe method using jq ==="
kubectl create secret generic test-secret --from-literal='values.yaml=testdata123' --dry-run=client -o json | \
  jq -r --arg key 'values.yaml' '.data[$key]'
```
🏁 Script executed:
# Let me verify the JSONPath behavior claim by checking Kubernetes documentation
# and examining if there are any tests in the repo related to secret key handling
# Search for other uses of VALUES_KEY or dotted keys in secrets in the codebase
rg -n "values\.yaml|VALUES_KEY" packages/core/platform/images/migrations/ -A 2 -B 2Repository: cozystack/cozystack
Length of output: 8835
🏁 Script executed:
# Check if there are other migration scripts that handle similar secret key reads
fd -t f "migrations" packages/core/platform/images/migrations/ -exec wc -l {} \; | sort -nRepository: cozystack/cozystack
Length of output: 208
🏁 Script executed:
# Look for any tests or documentation about secret key escaping
fd -t f "\\.md|\\.sh" packages/core/ -exec grep -l "values\.yaml\|dotted.*key\|jsonpath.*escape" {} \;Repository: cozystack/cozystack
Length of output: 300
🏁 Script executed:
# Check if there are other JSONPath patterns with dots in the codebase that might have similar issues
rg "jsonpath.*\$\{.*\}" packages/core/ -n | head -20Repository: cozystack/cozystack
Length of output: 45
🏁 Script executed:
# Examine migration 30 to see if it has the same issue pattern
cat -n packages/core/platform/images/migrations/migrations/30 | sed -n '60,65p'
Length of output: 467
🏁 Script executed:
# Let me check if there's any documentation or comments in the migrations about this pattern
grep -B5 -A5 "jsonpath" packages/core/platform/images/migrations/migrations/35 | head -30
Length of output: 1172
🏁 Script executed:
# Check what kubernetes documentation says about JSONPath escaping by examining any test or validation logic
rg "escape|backslash|\\\\." packages/core/platform/images/migrations/ -n
Length of output: 7541
🏁 Script executed:
# Key finding: check if dotted keys are escaped elsewhere in the same script
grep -n "jsonpath.*\\\\\." packages/core/platform/images/migrations/migrations/35
Length of output: 308
🏁 Script executed:
# Check all the valuesKey defaults across migrations to confirm they use dotted keys
rg "valuesKey.*default|valuesKey.*values\\.yaml" packages/core/platform/images/migrations/ -n
Length of output: 857
Fix unsafe JSONPath key access for dotted secret keys.
Lines 60-61 and 63-64 use JSONPath {.data.${VALUES_KEY}} with VALUES_KEY defaulting to values.yaml. In JSONPath, dotted keys must be escaped; unescaped dots are interpreted as nested object access. This causes the reads to fail silently, returning empty and triggering [SKIP] Could not read values from secret, skipping the migration entirely.
The script already demonstrates awareness of this at line 120 with escaped dots (super-admin\.svc), and already uses the safe pattern for writes (lines 88-91) with jq --arg key "$VALUES_KEY". Apply the same safe approach to reads:
🔧 Proposed fix
- VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
- -o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
+ SECRET_JSON=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" -o json)
+ VALUES_B64=$(echo "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.data[$key] // ""')
+ VALUES_YAML=$(printf '%s' "$VALUES_B64" | base64 -d 2>/dev/null || true)
if [ -z "$VALUES_YAML" ]; then
- VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
- -o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
+ VALUES_YAML=$(echo "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.stringData[$key] // ""')
fi
Note: Migrations 29 and 30 contain the same vulnerable pattern.
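The jq-based read can be sanity-checked without a cluster. In this sketch, SECRET_JSON is a hypothetical stand-in for the `kubectl get secret -o json` output, with a made-up payload:

```shell
# Hypothetical secret JSON as kubectl -o json would return it
SECRET_JSON='{"data":{"values.yaml":"Zm9vOiBiYXI="}}'
VALUES_KEY='values.yaml'
# .data[$key] treats the key as an opaque string, so the dot is not
# interpreted as nested object access (unlike JSONPath {.data.values.yaml})
VALUES_B64=$(printf '%s' "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.data[$key] // ""')
VALUES_YAML=$(printf '%s' "$VALUES_B64" | base64 -d)
echo "$VALUES_YAML"   # foo: bar
```

The same index-lookup form works for any key containing dots, slashes, or other JSONPath metacharacters, which is why the write path in the script already uses it.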
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
SECRET_JSON=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" -o json)
VALUES_B64=$(echo "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.data[$key] // ""')
VALUES_YAML=$(printf '%s' "$VALUES_B64" | base64 -d 2>/dev/null || true)
if [ -z "$VALUES_YAML" ]; then
  VALUES_YAML=$(echo "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.stringData[$key] // ""')
fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/core/platform/images/migrations/migrations/35` around lines 60 - 64,
The JSONPath reads using kubectl with "{.data.${VALUES_KEY}}" are unsafe for
dotted secret keys (VALUES_KEY like "values.yaml"); replace those jsonpath reads
with a safe JSON read via kubectl -o json piped into jq using an index lookup
with --arg: call kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" -o json |
jq -r --arg key "$VALUES_KEY" '.data[$key] // .stringData[$key]' and then
base64-decode the result into VALUES_YAML (same approach used for writes),
updating the code paths that set VALUES_YAML (the kubectl/jsonpath lines that
reference VALUES_KEY, VALUES_SECRET, NAMESPACE) so dotted keys are handled
correctly; apply the same fix to migrations 29 and 30.
Replace the hardcoded 'kubevirt' StorageClass in tenant clusters with auto-discovered StorageClasses from the infrastructure cluster. Migration 35 handles the upgrade path:
- STEP 1: Discover Kubernetes HelmReleases
- STEP 2: Remove deprecated 'storageClass' from values Secrets
- STEP 3: PVC rebinding in tenants (kubevirt → infra SC name)
Co-Authored-By: Claude Opus 4.6 <[email protected]>
Signed-off-by: mattia-eleuteri <[email protected]>
7e19133 to 416742f
What this PR does
Replaces the single static storageClass parameter with automatic discovery and propagation of all infrastructure StorageClasses to tenant Kubernetes clusters.
Changes:
- kubevirt-csi-node chart: the storageClass value is replaced by a storageClasses map. The deploy template now loops over entries to create one StorageClass per infra class, and adds infraStorageClassEnforcement to the driver-config ConfigMap
- kubernetes chart: removes the storageClass parameter from values and schema. The helmreleases/csi.yaml template uses Helm lookup to discover all infra StorageClasses at reconciliation time and passes them to the child chart
- driver-config.yaml: ConfigMap on the management cluster side providing the enforcement allowList to the CSI controller via the existing configMapKeyRef (optional: true)
- Migration: removes the deprecated storageClass key from existing Kubernetes app HelmRelease values Secrets. Bumps platform targetVersion from 33 to 34.
How it works:
- lookup enumerates all infra StorageClasses, including their default annotation
- The infraStorageClassName parameter falls back to replicated when lookup returns empty (dry-run / helm template)
No Go code changes required: kubevirt-csi-driver already supports multiple StorageClasses natively.
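A minimal sketch of what such lookup-based discovery in the helmreleases/csi.yaml template could look like (the value keys here are illustrative, not the chart's actual schema):

```yaml
{{- /* Sketch: enumerate infra StorageClasses at reconcile time.
       lookup returns an empty result under `helm template` / dry-run,
       in which case we fall back to a single hypothetical default. */ -}}
{{- $classes := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
storageClasses:
{{- if $classes }}
{{- range $classes }}
  {{ .metadata.name }}:
    isDefault: {{ eq (get (default dict .metadata.annotations) "storageclass.kubernetes.io/is-default-class") "true" }}
{{- end }}
{{- else }}
  replicated: {}   # fallback when lookup returns nothing
{{- end }}
```

Because lookup runs against the live API server at reconciliation time, newly added infra StorageClasses are picked up on the next Flux reconcile without changing any values.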
Migration (existing deployments):
Migration script 33 runs automatically on upgrade. It discovers all Kubernetes HelmReleases, decodes their values Secret, removes the storageClass key if present, and patches the Secret back. The migration is idempotent: it skips instances where the key is already absent.
Closes: #2094
Related: #1424
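The idempotent remove-and-patch step can be sketched in isolation. This is an illustrative stand-in, not the migration script itself: it assumes a flat top-level storageClass: line in the decoded values, whereas the real script operates on a base64-decoded HelmRelease values Secret:

```shell
# Remove a top-level 'storageClass:' line if present; pass through unchanged otherwise
strip_storage_class() {
  VALUES="$1"
  if printf '%s\n' "$VALUES" | grep -q '^storageClass:'; then
    printf '%s\n' "$VALUES" | grep -v '^storageClass:'
  else
    printf '%s\n' "$VALUES"   # key already absent: nothing to patch (idempotent path)
  fi
}

BEFORE='storageClass: kubevirt
controlPlane:
  replicas: 2'
AFTER=$(strip_storage_class "$BEFORE")
printf '%s\n' "$AFTER"
# Running it a second time is a no-op, mirroring the migration's skip behavior
AGAIN=$(strip_storage_class "$AFTER")
[ "$AFTER" = "$AGAIN" ] && echo "idempotent"
```

The real migration then re-encodes the result and patches the Secret back, so re-running the migration after a partial upgrade leaves already-cleaned instances untouched.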
Release note
Summary by CodeRabbit
Release Notes
Refactor
The storageClass parameter has been removed.
Documentation