[kubernetes] Auto-propagate infra StorageClasses to tenant clusters#2095

Open
mattia-eleuteri wants to merge 1 commit into cozystack:main from mattia-eleuteri:feat/auto-propagate-storageclasses

Conversation

@mattia-eleuteri
Contributor

@mattia-eleuteri mattia-eleuteri commented Feb 24, 2026

What this PR does

Replaces the single static storageClass parameter with automatic discovery and propagation of all infrastructure StorageClasses to tenant Kubernetes clusters.

Changes:

  • kubevirt-csi-node chart: replaces the storageClass value with a storageClasses map. The deploy template now loops over entries to create one StorageClass per infra class, and adds infraStorageClassEnforcement to the driver-config ConfigMap.
  • kubernetes chart: removes the storageClass parameter from values and schema. The helmreleases/csi.yaml template uses Helm lookup to discover all infra StorageClasses at reconciliation time and passes them to the child chart.
  • New driver-config.yaml: a ConfigMap on the management-cluster side providing the enforcement allowList to the CSI controller via the existing configMapKeyRef (optional: true).
  • Migration 33: automatically removes the deprecated storageClass key from existing Kubernetes app HelmRelease values Secrets. Bumps platform targetVersion from 33 to 34.
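For orientation, the enforcement document carried by that ConfigMap can be printed as a quick sketch. The allowAll/allowDefault/allowList keys follow the infraStorageClassEnforcement format quoted later in the review; the two class names are invented examples, not values from this PR.

```shell
# Sketch of the driver-config enforcement payload; class names are
# hypothetical examples, the field names follow the chart's format.
payload=$(cat <<'EOF'
infraStorageClassEnforcement: |
  allowAll: false
  allowDefault: false
  allowList:
    - "replicated"
    - "fast"
EOF
)
printf '%s\n' "$payload"
```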

How it works:

  1. Flux reconciles the HelmRelease every 5m
  2. lookup enumerates all infra StorageClasses, including their default annotation
  3. Each class is created in the tenant with the same name, with the infraStorageClassName parameter pointing back at the infra class
  4. The allowList restricts tenants to using only the propagated classes
  5. Falls back to replicated when lookup returns empty (dry-run / helm template)

No Go code changes required — kubevirt-csi-driver already supports multiple StorageClasses natively.
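The discovery and fallback steps above can be traced with a plain-shell stand-in for the template's lookup loop. The class names, the "name is-default" row format, and the rendered map layout are assumptions for illustration, not output of the real chart.

```shell
# Hypothetical stand-in for the Helm lookup loop: each input row is
# "name is-default", as read from the StorageClass default annotation.
discovered="fast true
replicated false"

storage_classes=""
while read -r name is_default; do
  [ -n "$name" ] || continue
  storage_classes="${storage_classes}${name}: {infraStorageClass: ${name}, default: ${is_default}}
"
done <<EOF
$discovered
EOF

# Step 5: fall back to a single replicated class when nothing was discovered.
if [ -z "$storage_classes" ]; then
  storage_classes='replicated: {infraStorageClass: replicated, default: true}
'
fi
printf '%s' "$storage_classes"
```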

Migration (existing deployments):
Migration script 33 runs automatically on upgrade. It discovers all Kubernetes HelmReleases, decodes their values Secret, removes the storageClass key if present, and patches the Secret back. The migration is idempotent — it skips instances where the key is already absent.
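The decode/remove/re-encode step can be mimicked with basic shell tools. The Secret payload below is fabricated for the example, and the grep-based removal only handles a top-level scalar key; the real migration script is not shown here.

```shell
# Simulate the values-Secret cleanup: decode the base64 payload, drop the
# deprecated top-level storageClass key, re-encode/patch. The YAML content
# is an assumed example, not a real tenant's values.
values='storageClass: replicated
host: tenant.example.com'
encoded=$(printf '%s\n' "$values" | base64)

# Idempotent: grep -v passes the data through unchanged when the key is absent.
cleaned=$(printf '%s' "$encoded" | base64 -d | grep -v '^storageClass:')
printf '%s\n' "$cleaned"
```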

Closes: #2094
Related: #1424

Release note

[kubernetes] Auto-propagate all infrastructure StorageClasses to tenant clusters instead of a single hardcoded `kubevirt` class. The `storageClass` parameter is removed from the kubernetes chart values — discovery is now automatic via Helm `lookup`. An enforcement allowList is generated to restrict tenants to only use propagated StorageClasses. Includes migration 33 to automatically clean up existing deployments.

Summary by CodeRabbit

Release Notes

  • Refactor

    • Storage class configuration is now automatically discovered from your cluster instead of requiring manual specification. The static storageClass parameter has been removed.
    • A migration to version 36 will automatically update your configurations and rebind persistent volumes to the appropriate storage classes.
  • Documentation

    • Removed deprecated storage class parameter from configuration documentation.

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. enhancement New feature or request labels Feb 24, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello @mattia-eleuteri, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the management of StorageClasses within tenant Kubernetes clusters by automating their discovery and propagation from the underlying infrastructure. It transitions from a rigid, single static StorageClass configuration to a dynamic system that identifies and makes available all relevant infrastructure StorageClasses. This change not only streamlines deployment but also enhances security by enforcing that tenants can only utilize these propagated classes, ensuring consistency and preventing unauthorized storage provisioning.

Highlights

  • Automatic StorageClass Propagation: Implemented automatic discovery and propagation of all infrastructure StorageClasses to tenant Kubernetes clusters, replacing the previous static storageClass parameter.
  • Dynamic StorageClass Discovery: Utilized Helm's lookup function to dynamically discover all infrastructure StorageClasses, including their default annotations, at reconciliation time.
  • StorageClass Enforcement: Introduced an infraStorageClassEnforcement allowList to restrict tenant clusters to only use the automatically propagated StorageClasses, enhancing security and control.
  • Chart Parameterization Update: The storageClass parameter was removed from the kubernetes chart values and schema, and the kubevirt-csi-node chart now accepts a storageClasses map for dynamic configuration.


Changelog
  • packages/apps/kubernetes/templates/csi/driver-config.yaml
    • Added a new ConfigMap driver-config to define infraStorageClassEnforcement.
    • Populated the allowList within infraStorageClassEnforcement dynamically using discovered StorageClasses or a 'replicated' fallback.
  • packages/apps/kubernetes/templates/helmreleases/csi.yaml
    • Removed the storageClass parameter from the values section.
    • Introduced a storageClasses map, populated by dynamically looking up infrastructure StorageClasses and their default status.
    • Included a fallback to a 'replicated' StorageClass if no infrastructure StorageClasses are found.
  • packages/apps/kubernetes/values.schema.json
    • Removed the storageClass definition from the Kubernetes chart's values schema.
  • packages/apps/kubernetes/values.yaml
    • Deleted the storageClass parameter and its default value from the Kubernetes chart's default values.
  • packages/system/kubevirt-csi-node/templates/deploy.yaml
    • Modified the template to iterate over the new storageClasses map, allowing for the creation of multiple StorageClass resources.
    • Dynamically set the StorageClass name and infraStorageClassName based on the propagated values.
    • Conditionally added the storageclass.kubernetes.io/is-default-class annotation based on the propagated default status.
    • Added infraStorageClassEnforcement to the driver-config ConfigMap, listing the allowed infraStorageClass names.
  • packages/system/kubevirt-csi-node/values.yaml
    • Replaced the single storageClass parameter with a storageClasses map, initialized with a default 'replicated' entry.

@coderabbitai
Contributor

coderabbitai bot commented Feb 24, 2026


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 128c6498-b0b1-4763-929b-bbe9b4448835

📥 Commits

Reviewing files that changed from the base of the PR and between 7e19133 and 416742f.

📒 Files selected for processing (11)
  • hack/e2e-apps/run-kubernetes.sh
  • packages/apps/kubernetes/README.md
  • packages/apps/kubernetes/templates/csi/driver-config.yaml
  • packages/apps/kubernetes/templates/helmreleases/csi.yaml
  • packages/apps/kubernetes/values.schema.json
  • packages/apps/kubernetes/values.yaml
  • packages/core/platform/images/migrations/migrations/35
  • packages/core/platform/values.yaml
  • packages/system/kubernetes-rd/cozyrds/kubernetes.yaml
  • packages/system/kubevirt-csi-node/templates/deploy.yaml
  • packages/system/kubevirt-csi-node/values.yaml
📝 Walkthrough

This PR implements auto-propagation of infrastructure StorageClasses to tenant Kubernetes clusters using Helm lookup for dynamic discovery. Changes replace a single hardcoded storageClass parameter with automatic enumeration of all infra StorageClasses, add enforcement policies via ConfigMap, update deployment configurations, and introduce a migration script (version 35→36) to rebind existing PVCs and clean up deprecated values.

Changes

  • Storage Class Discovery & Enforcement (packages/apps/kubernetes/templates/csi/driver-config.yaml, packages/apps/kubernetes/templates/helmreleases/csi.yaml): New ConfigMap manifest with templated infraStorageClassEnforcement logic. Updated HelmRelease to replace the single storageClass with a dynamic lookup-based storageClasses dictionary keyed by name, with infraStorageClass and default annotation fields.
  • Chart Values & Schema (packages/apps/kubernetes/values.schema.json, packages/apps/kubernetes/values.yaml, packages/system/kubernetes-rd/cozyrds/kubernetes.yaml): Removed the top-level storageClass property from the JSON schema and YAML values files, and from the Kubernetes ApplicationDefinition openAPISchema and keysOrder.
  • CSI Node Deployment (packages/system/kubevirt-csi-node/templates/deploy.yaml, packages/system/kubevirt-csi-node/values.yaml): Updated the deployment template to generate StorageClass resources dynamically by looping over the storageClasses map. Added infraStorageClassEnforcement to driver-config data. Replaced the scalar storageClass with a nested storageClasses map in values.
  • Documentation & Definitions (packages/apps/kubernetes/README.md): Removed the Common Parameters subsection, including the storageClass documentation row.
  • Testing Infrastructure (hack/e2e-apps/run-kubernetes.sh): Added a StorageClass propagation verification block to validate that all tenant StorageClasses use the kubevirt CSI provisioner and that exactly one default exists. Updated the test PVC to reference the replicated StorageClass explicitly.
  • Migration & Versioning (packages/core/platform/values.yaml, packages/core/platform/images/migrations/migrations/35): Bumped migrations.targetVersion to 36. Added a comprehensive migration script (223 lines) that removes storageClass from HelmRelease values Secrets and rebinds tenant PVCs to auto-discovered storage classes, with robust error handling and per-phase logging.
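The e2e verification described for run-kubernetes.sh can be approximated in portable shell. The provisioner name csi.kubevirt.io and the sample rows are assumptions for the sketch, standing in for parsed kubectl output.

```shell
# Approximate the e2e check: every tenant StorageClass must use the kubevirt
# CSI provisioner, and exactly one must be the default. Rows stand in for
# parsed `kubectl get storageclass` output: "name provisioner is-default".
scs="replicated csi.kubevirt.io true
fast csi.kubevirt.io false"

defaults=0
bad=0
while read -r name provisioner is_default; do
  [ -n "$name" ] || continue
  if [ "$provisioner" != "csi.kubevirt.io" ]; then bad=$((bad + 1)); fi
  if [ "$is_default" = "true" ]; then defaults=$((defaults + 1)); fi
done <<EOF
$scs
EOF

if [ "$bad" -eq 0 ] && [ "$defaults" -eq 1 ]; then
  result="StorageClass propagation OK"
else
  result="FAIL: bad=$bad defaults=$defaults"
fi
echo "$result"
```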

Sequence Diagram(s)

sequenceDiagram
    participant MC as Management<br/>Cluster
    participant HR as HelmRelease<br/>Template
    participant TC as Tenant<br/>Cluster
    participant CSI as kubevirt-csi<br/>driver
    participant PVC as Persistent<br/>Volume

    MC->>HR: Reconcile CSI HelmRelease
    HR->>MC: lookup all StorageClasses
    MC-->>HR: Return discovered SC list
    HR->>HR: Build storageClasses dict<br/>(name, infraSC, default)
    HR->>HR: Generate infraStorageClassEnforcement<br/>allowList
    HR->>TC: Apply StorageClass resources<br/>(per infra SC)
    HR->>TC: Apply driver-config ConfigMap<br/>with enforcement allowList
    TC->>CSI: Load driver config
    CSI->>CSI: Validate PVC requests<br/>against allowList
    Note over CSI: Only propagated SCs allowed
    TC->>PVC: Create/bind with<br/>storageClassName
    CSI->>PVC: Provision volume<br/>via kubevirt provisioner
sequenceDiagram
    participant Admin as Admin
    participant MigJob as Migration<br/>Job v35→36
    participant HR as HelmRelease<br/>Values Secret
    participant TC as Tenant<br/>Cluster
    participant PV as Persistent<br/>Volume
    participant PVC as Persistent<br/>Volume Claim

    Admin->>MigJob: Trigger migration v35→36
    MigJob->>HR: Find all HR values secrets
    MigJob->>HR: Read & remove storageClass key
    MigJob->>HR: Patch secret with updated YAML
    Note over HR: idempotent removal
    MigJob->>TC: Extract tenant kubeconfig
    MigJob->>TC: List PVCs (storageClassName=kubevirt)
    TC-->>MigJob: Return PVC list
    loop Per PVC
        MigJob->>PV: Get bound PV & extract CSI attrs
        MigJob->>PV: Set Retain reclaim policy
        MigJob->>PVC: Delete PVC
        MigJob->>PV: Patch storageClassName &<br/>remove claimRef
        MigJob->>PVC: Recreate PVC with<br/>discovered SC name
        PVC->>PV: Rebind to PV
        MigJob->>PV: Restore reclaim policy
    end
    MigJob->>MigJob: Update cozystack-version<br/>configmap to v36
    MigJob-->>Admin: Migration complete
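The per-PVC loop in the second diagram can be traced as a variable-only dry run, where each assignment stands in for a kubectl patch/delete/create in the real migration; all names (app-data, replicated) are made up for the illustration.

```shell
# Variable-only walk-through of the PVC rebind sequence.
pv_reclaim="Delete"            # original reclaim policy on the PV
pv_sc="kubevirt"               # deprecated class the PV was bound under
pv_claim="tenant/app-data"     # current claimRef
target_sc="replicated"         # auto-discovered class (assumed name)

saved_reclaim=$pv_reclaim
pv_reclaim="Retain"            # 1. protect the volume before touching the claim
pvc_exists=false               # 2. delete the old PVC
pv_sc=$target_sc               # 3. patch storageClassName on the PV
pv_claim=""                    #    ...and clear claimRef so it can rebind
pvc_exists=true                # 4. recreate the PVC with the discovered class
pvc_sc=$target_sc
pv_claim="tenant/app-data"     # 5. the PV rebinds to the recreated claim
pv_reclaim=$saved_reclaim      # 6. restore the original reclaim policy

echo "$pv_sc $pvc_sc $pv_reclaim $pvc_exists"
```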

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 StorageClasses now dance and sway,
Auto-discovered come what may!
No hardcoding, lookups take the stage,
Tenant clusters turn a brand new page.
Multi-storage dreams are here to stay,
Hops of joy for this fine day! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and concisely summarizes the main objective: auto-propagating infrastructure StorageClasses to tenant clusters, which is the primary change across the entire changeset.
  • Linked Issues Check: ✅ Passed. The changes comprehensively implement all requirements from issue #2094: auto-discovery via Helm lookup, same-name propagation to tenants, default-annotation preservation, enforcement via the infraStorageClassEnforcement ConfigMap, fallback to replicated, and no Go code changes, as specified.
  • Out of Scope Changes Check: ✅ Passed. All changes are directly aligned with the linked issue objectives. The platform version bump (35→36) and migration script are necessary implementation details for backwards compatibility, not out-of-scope additions.



Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request automates the discovery and propagation of infrastructure StorageClasses to tenant Kubernetes clusters, replacing the previous static storageClass parameter. The changes involve modifications to the kubevirt-csi-node and kubernetes charts, along with the introduction of a new driver-config.yaml ConfigMap for enforcement. The review focuses on ensuring the correctness and maintainability of the implemented changes, particularly around the handling of StorageClasses and the enforcement mechanism.

allowAll: false
allowDefault: false
allowList:
{{- $storageClasses := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
Contributor


high

This line uses lookup to enumerate StorageClasses. Ensure that the service account used by the FluxCD operator has the necessary permissions to list StorageClasses in the cluster. Missing permissions will cause the lookup to fail, and the tenant cluster will not have the StorageClasses propagated.

Contributor Author


Not an issue. FluxCD's helm-controller runs with broad cluster-scoped permissions and can already list non-namespaced resources like StorageClasses. This is how all lookup-based templates work in this project.

Comment on lines +273 to +275
{{- range $name, $mapping := .Values.storageClasses }}
- {{ $mapping.infraStorageClass | quote }}
{{- end }}
Contributor


high

This loop iterates through the storageClasses to create an allow list. It's important to ensure that the values in .Values.storageClasses are properly validated to prevent injection of arbitrary storage class names, which could bypass the intended enforcement.

Contributor Author


Not an issue. The storageClasses values are populated by lookup in the parent template (csi.yaml), not from user input. The names come directly from the Kubernetes API — there is no injection vector here.

Comment on lines +31 to +42
{{- $storageClasses := dict }}
{{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
{{- $isDefault := false }}
{{- if $sc.metadata.annotations }}
{{- $isDefault = eq (index $sc.metadata.annotations "storageclass.kubernetes.io/is-default-class" | default "false") "true" }}
{{- end }}
{{- $_ := set $storageClasses $sc.metadata.name (dict "infraStorageClass" $sc.metadata.name "default" $isDefault) }}
{{- end }}
{{- if not $storageClasses }}
{{- $_ := set $storageClasses "replicated" (dict "infraStorageClass" "replicated" "default" true) }}
{{- end }}
{{- $storageClasses | toYaml | nindent 6 }}
Contributor


medium

This section dynamically generates the storageClasses map. Consider adding a comment explaining the purpose of each key-value pair within the generated map, especially infraStorageClass and default. This will improve the readability and maintainability of the template.

Contributor Author


Style-only suggestion. The template logic is straightforward and follows the same patterns used elsewhere in the codebase. Declining to add extra comments here.

Comment on lines +281 to +285
name: {{ $name }}
{{- if $mapping.default }}
annotations:
storageclass.kubernetes.io/is-default-class: "true"
{{- end }}
Contributor


medium

Consider adding a default value for the storageclass.kubernetes.io/is-default-class annotation in case it's not present. This can prevent unexpected behavior if the annotation is missing.

  name: {{ $name }}
  {{- if eq $mapping.default true }}
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  {{- end }}

Contributor Author


The current code already handles this correctly. The {{- if eq $mapping.default true }} conditional only renders the annotation block when .default is explicitly true. When the annotation is absent from the source StorageClass, $mapping.default will be false and the annotation block is simply not rendered — which is the desired behavior.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/apps/kubernetes/templates/csi/driver-config.yaml`:
- Around line 6-17: The template directly uses (lookup ...).items which can be
nil and prevents the fallback allowList entry "replicated" from rendering;
update the infraStorageClassEnforcement block to first assign the lookup result
to a variable (e.g. $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "")
and then safely test $scList and $scList.items (or check len $scList.items)
before ranging into $storageClasses, so that when the lookup is nil the template
falls back to adding - "replicated" under allowList; adjust variable names shown
($storageClasses, $scList, allowList) accordingly.

In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml`:
- Around line 30-42: The template directly accesses (lookup "storage.k8s.io/v1"
"StorageClass" "" "").items which will null-deref during helm template or when
the cluster is unavailable; update the logic around the lookup call (the
$storageClasses population loop) to guard the result by using default/get (or
assign the lookup result to a variable, then use default to an empty dict/list
and use get to read .items) so the loop over storage classes only runs when
items exist and the fallback that sets "replicated" as default still executes;
adjust references to the lookup result, .items, and the $storageClasses
population block accordingly and keep the final $storageClasses | toYaml |
nindent 6 unchanged.

In `@packages/system/kubevirt-csi-node/templates/deploy.yaml`:
- Line 267: The ConfigMap renders a bare key "infraClusterNamespace:" which
produces a null value; update the template in deploy.yaml so the key either
emits an explicit empty string or is omitted when unset—e.g. change the
ConfigMap data entry for infraClusterNamespace to render infraClusterNamespace:
"" or wrap it in a conditional (if .Values.infraClusterNamespace) to only
include the key when a non-empty value is provided; locate the
infraClusterNamespace entry in the ConfigMap block of
packages/system/kubevirt-csi-node/templates/deploy.yaml and apply one of these
fixes.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8f5083 and 2f3d3cf.

📒 Files selected for processing (6)
  • packages/apps/kubernetes/templates/csi/driver-config.yaml
  • packages/apps/kubernetes/templates/helmreleases/csi.yaml
  • packages/apps/kubernetes/values.schema.json
  • packages/apps/kubernetes/values.yaml
  • packages/system/kubevirt-csi-node/templates/deploy.yaml
  • packages/system/kubevirt-csi-node/values.yaml
💤 Files with no reviewable changes (2)
  • packages/apps/kubernetes/values.schema.json
  • packages/apps/kubernetes/values.yaml

Comment on lines +6 to +17
infraStorageClassEnforcement: |
allowAll: false
allowDefault: false
allowList:
{{- $storageClasses := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
{{- if $storageClasses }}
{{- range $storageClasses }}
- {{ .metadata.name | quote }}
{{- end }}
{{- else }}
- "replicated"
{{- end }}
Contributor

@coderabbitai coderabbitai bot Feb 24, 2026


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find and examine the file
find . -path "*/packages/apps/kubernetes/templates/csi/driver-config.yaml" -type f



🏁 Script executed:

# Check the repository structure
ls -la packages/apps/ 2>/dev/null || echo "packages/apps/ not found"



🏁 Script executed:

# If the file exists, let's see its full content with line numbers
cat -n packages/apps/kubernetes/templates/csi/driver-config.yaml 2>/dev/null | head -30



🏁 Script executed:

# Check if charts/ directory exists with vendored charts
find packages/apps/kubernetes -name "Chart.yaml" -o -name "charts" -type d 2>/dev/null | head -10



🏁 Script executed:

# Check the Chart.yaml to understand ownership
cat -n packages/apps/kubernetes/Chart.yaml



🏁 Script executed:

# Check what's in the charts/ directory (vendored charts)
ls -la packages/apps/kubernetes/charts/



🏁 Script executed:

# Check if there are other driver-config.yaml files in vendored charts
find packages/apps/kubernetes/charts -name "driver-config.yaml" 2>/dev/null



🌐 Web query:

Helm lookup function nil return behavior .items access error

💡 Result:

In Helm, lookup can legitimately give you a value where .items isn’t safe to access, for two common reasons:

  1. You asked for a single object, not a list
    If you pass a non-empty name, lookup returns a single object (dict/map), not a list—so there is no .items field to range over. To get a list (with .items), pass an empty name "". The Helm docs explicitly show that only “list” lookups have an .items field. [1]

  2. lookup returned an “empty value” (often seen as nil) because Helm didn't/couldn't talk to the cluster
    Helm is not supposed to contact the API server during helm template or most --dry-run modes; in those cases lookup returns an empty value (often observed as nil/empty), and then (...).items triggers a nil-pointer/field-access error. Use --dry-run=server when you need live lookups, or ensure you’re running against a real cluster connection. [1][2][4]

Safe patterns

Range safely over a list lookup:

{{- $list := (lookup "v1" "ConfigMap" .Release.Namespace "" | default (dict "items" (list))) -}}
{{- range $cm := $list.items -}}
  {{/* ... */}}
{{- end -}}

default treats nil as “empty”, so it will substitute your fallback dict. [3]

Or guard with if:

{{- $list := lookup "v1" "ConfigMap" .Release.Namespace "" -}}
{{- if $list -}}
  {{- range $cm := $list.items -}} ... {{- end -}}
{{- end -}}

If you share your exact lookup call (especially whether name is "" or not) and the command/environment (Helm vs ArgoCD/Harness/CI), I can point to the specific cause in your case.


🌐 Web query:

Helm template default filter dict list functions syntax

💡 Result:

default (aka “fallback”) syntax

  • Direct call: {{ default "foo" .Values.bar }}
  • Pipe form (same thing): {{ .Values.bar | default "foo" }}
    default returns the second value unless it’s “empty” (e.g., 0, "", [], {}, false, or nil). [1]

dict syntax

  • Create a map from key/value pairs: {{- $m := dict "name" "alice" "enabled" true -}} [1]
  • Empty map: {{- $empty := dict -}} (often written as (dict) when used as an argument) [1][2]

list syntax

  • Create a list: {{- $l := list 1 2 3 -}} → [1 2 3] [1]
  • When passing a list as a single argument, wrap it: {{ someFunc (list "a" "b") }} [1]

Common combined pattern (avoids nil errors)

{{- if index (default (dict) .Values.outer.inner) "someKey" -}}
...
{{- end -}}

Meaning: if .Values.outer.inner is nil/empty, replace it with an empty dict so index ... "someKey" won’t blow up. [2]

Sources: Helm template function list [1]; Stack Overflow explanation of default (dict) usage [2].


Protect .items access on lookup result to ensure fallback list renders.

Directly accessing .items on the lookup call fails when the API lookup returns nil (common in helm template without --dry-run=server), preventing the fallback to "replicated" from rendering.

🔧 Proposed fix
-    {{- $storageClasses := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
+    {{- $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "" | default dict }}
+    {{- $storageClasses := (get $scList "items" | default list) }}

Contributor Author


Not applicable. These templates are rendered exclusively by FluxCD's helm-controller against a live cluster, never via standalone helm template. The (lookup ...).items pattern is used throughout this codebase:

  • packages/apps/kubernetes/templates/cluster.yaml:376: (lookup "cluster.x-k8s.io/v1beta1" "MachineSet" ...).items
  • packages/apps/kubernetes/templates/cluster.yaml:378: (lookup "infrastructure.cluster.x-k8s.io/v1alpha1" "KubevirtMachineTemplate" ...).items
  • packages/system/linstor-scheduler/: 4 occurrences of (lookup ...).items

Adding default dict guards would be inconsistent with the rest of the project and unnecessary for the actual deployment path.

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the claimed usages of (lookup ...).items pattern in the codebase

echo "=== Checking packages/apps/kubernetes/templates/cluster.yaml around lines 376-378 ==="
sed -n '370,385p' packages/apps/kubernetes/templates/cluster.yaml | grep -n "lookup"

echo -e "\n=== Checking packages/system/linstor-scheduler/ for lookup patterns ==="
rg -n '\.items\s*}}' packages/system/linstor-scheduler/ -A 1 -B 1

echo -e "\n=== Finding all occurrences of (lookup ...).items pattern across the repository ==="
rg -nP '\(lookup\s+[^)]+\)\.items' packages/

Length of output: 3626


@mattia-eleuteri, thank you for the clarification! I've verified your claims and confirmed that the (lookup ...).items pattern is indeed used consistently throughout the codebase:

  • packages/apps/kubernetes/templates/cluster.yaml:376 — MachineSet lookup
  • packages/apps/kubernetes/templates/cluster.yaml:378 — KubevirtMachineTemplate lookup
  • packages/system/linstor-scheduler/ — 4 occurrences in _helpers.tpl

You're absolutely right that adding default dict guards would be inconsistent with the established patterns in this repository. Since FluxCD's helm-controller always renders against a live cluster, the concern about nil lookup results doesn't apply to your deployment model. I appreciate the context!

ʕ •ᴥ•ʔ 📚✨


✏️ Learnings added
Learnt from: mattia-eleuteri
Repo: cozystack/cozystack PR: 2095
File: packages/apps/kubernetes/templates/csi/driver-config.yaml:6-17
Timestamp: 2026-02-24T18:01:06.090Z
Learning: In the cozystack/cozystack repository, Helm templates are rendered exclusively by FluxCD's helm-controller against a live cluster, never via standalone `helm template`. Therefore, the `(lookup ...).items` pattern is safe and used consistently throughout the codebase (e.g., in packages/apps/kubernetes/templates/cluster.yaml and packages/system/linstor-scheduler/). Adding `default dict` guards to protect against nil lookup results is unnecessary and would be inconsistent with project conventions.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: lexfrei
Repo: cozystack/cozystack PR: 2010
File: packages/core/installer/values.yaml:10-26
Timestamp: 2026-02-10T10:44:34.963Z
Learning: In the cozystack/cozystack repository, the Makefile uses `helm template -s` to selectively render templates for different modes (talos, generic, hosted). Because Helm evaluates all templates regardless of `-s` filtering, using `required` checks in mode-specific templates (e.g., generic-mode templates) would break the build for other modes. Therefore, avoid suggesting `required` validation for mode-specific values in Helm templates in this repository.

Learnt from: kvaps
Repo: cozystack/cozystack PR: 1698
File: packages/core/flux-aio/templates/_helpers.tpl:3-3
Timestamp: 2025-12-10T11:40:12.476Z
Learning: Helpers taken from flux-operator (such as kubernetes_envs helpers) should be kept as-is and not modified, even if they appear to have unconventional patterns like using `dict` as a default for array fields. These have been battle-tested upstream.

Comment on lines +30 to +42
storageClasses:
{{- $storageClasses := dict }}
{{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
{{- $isDefault := false }}
{{- if $sc.metadata.annotations }}
{{- $isDefault = eq (index $sc.metadata.annotations "storageclass.kubernetes.io/is-default-class" | default "false") "true" }}
{{- end }}
{{- $_ := set $storageClasses $sc.metadata.name (dict "infraStorageClass" $sc.metadata.name "default" $isDefault) }}
{{- end }}
{{- if not $storageClasses }}
{{- $_ := set $storageClasses "replicated" (dict "infraStorageClass" "replicated" "default" true) }}
{{- end }}
{{- $storageClasses | toYaml | nindent 6 }}
Contributor

@coderabbitai coderabbitai bot Feb 24, 2026

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Read the actual file to see full context
cat -n packages/apps/kubernetes/templates/helmreleases/csi.yaml | head -50

Repository: cozystack/cozystack

Length of output: 2150


🏁 Script executed:

# Check if file exists and get basic info
ls -la packages/apps/kubernetes/templates/helmreleases/csi.yaml
wc -l packages/apps/kubernetes/templates/helmreleases/csi.yaml

Repository: cozystack/cozystack

Length of output: 228


🏁 Script executed:

# Search for other usages of lookup in the codebase to understand patterns
rg 'lookup\s*"' packages/apps/kubernetes/templates/ -A 2 -B 1

Repository: cozystack/cozystack

Length of output: 10922


Guard lookup before accessing .items to preserve fallback behavior.

When lookup returns nil (e.g., during helm template or when the cluster is unreachable), the direct access to (lookup ...).items errors before the replicated fallback (lines 39-41) can execute. Use default and get to safely access items.

🔧 Proposed fix
-    {{- $storageClasses := dict }}
-    {{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
+    {{- $storageClasses := dict }}
+    {{- $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "" | default dict }}
+    {{- range $sc := (get $scList "items" | default list) }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml` around lines 30 -
42, The template directly accesses (lookup "storage.k8s.io/v1" "StorageClass" ""
"").items which will null-deref during helm template or when the cluster is
unavailable; update the logic around the lookup call (the $storageClasses
population loop) to guard the result by using default/get (or assign the lookup
result to a variable, then use default to an empty dict/list and use get to read
.items) so the loop over storage classes only runs when items exist and the
fallback that sets "replicated" as default still executes; adjust references to
the lookup result, .items, and the $storageClasses population block accordingly
and keep the final $storageClasses | toYaml | nindent 6 unchanged.

Contributor Author

Same as above — not applicable. These templates are only rendered by FluxCD against a live cluster, not via helm template. The (lookup ...).items pattern is the established convention across the codebase. Adding nil guards here would be inconsistent with the rest of the project.

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for lookup patterns that directly access .items to verify the established convention
rg -n '\.items' packages/ -A 1 -B 2 | grep -E '(lookup|\.items)' | head -40

Length of output: 6624


🏁 Script executed:

#!/bin/bash
# Search specifically for the pattern of lookup followed by .items access
rg -nP 'lookup\s+[^)]+\)\.items' packages/ -C 2

Length of output: 4573


@mattia-eleuteri, you're absolutely right—thank you for the clarification! I can see from the search results that the (lookup ...).items pattern is consistently used throughout the codebase (in linstor-scheduler templates, cluster.yaml, driver-config.yaml, etc.), and since these templates are exclusively rendered by FluxCD against live clusters, the nil-guard concern doesn't apply. I appreciate you pointing out the project's conventions.


✏️ Learnings added
Learnt from: mattia-eleuteri
Repo: cozystack/cozystack PR: 2095
File: packages/apps/kubernetes/templates/helmreleases/csi.yaml:30-42
Timestamp: 2026-02-24T18:01:02.525Z
Learning: In the cozystack/cozystack repository, Helm templates under `packages/apps/kubernetes/templates/` are only rendered by FluxCD against live clusters (not via `helm template`). The established convention is to directly access `(lookup "..." "" "" "").items` without nil guards. This pattern is used consistently across multiple templates (e.g., linstor-scheduler/_helpers.tpl, cluster.yaml, driver-config.yaml, helmreleases/csi.yaml), so adding defensive checks would be inconsistent with the codebase style.


🧠 Learnings used
Learnt from: sircthulhu
Repo: cozystack/cozystack PR: 2068
File: packages/system/cozystack-basics/templates/tenant-root.yaml:7-11
Timestamp: 2026-02-18T15:45:24.589Z
Learning: In the cozystack/cozystack repository, for the tenant-root HelmRelease in `packages/system/cozystack-basics/templates/tenant-root.yaml`, users are only supposed to modify the `values` field, not `valuesFrom`. Therefore, preserving only `spec.values` during reconciliation is sufficient and intentional.

Learnt from: mattia-eleuteri
Repo: cozystack/cozystack PR: 1988
File: packages/extra/external-dns/templates/external-dns.yaml:22-31
Timestamp: 2026-02-12T11:03:15.183Z
Learning: In the cozystack/cozystack repository, `valuesFrom` is not correctly supported with Cozystack's HelmRelease implementation. Inline credential values in HelmRelease specs are the expected pattern, and suggestions to move credentials to `valuesFrom` with Secret references should be avoided.

Learnt from: lexfrei
Repo: cozystack/cozystack PR: 2010
File: packages/core/installer/values.yaml:10-26
Timestamp: 2026-02-10T10:44:34.963Z
Learning: In the cozystack/cozystack repository, the Makefile uses `helm template -s` to selectively render templates for different modes (talos, generic, hosted). Because Helm evaluates all templates regardless of `-s` filtering, using `required` checks in mode-specific templates (e.g., generic-mode templates) would break the build for other modes. Therefore, avoid suggesting `required` validation for mode-specific values in Helm templates in this repository.

Learnt from: kvaps
Repo: cozystack/cozystack PR: 1698
File: packages/core/flux-aio/templates/_helpers.tpl:3-3
Timestamp: 2025-12-10T11:40:12.476Z
Learning: Helpers taken from flux-operator (such as kubernetes_envs helpers) should be kept as-is and not modified, even if they appear to have unconventional patterns like using `dict` as a default for array fields. These have been battle-tested upstream.

Learnt from: lexfrei
Repo: cozystack/cozystack PR: 2058
File: packages/apps/harbor/templates/harbor.yaml:53-67
Timestamp: 2026-02-16T18:39:00.608Z
Learning: In the cozystack/cozystack repository, standard Flux HelmRelease v2 resources correctly support `valuesFrom` with `targetPath` for injecting Secret values. This is processed by Flux's helm-controller. The retry mechanism (`retries: -1`) handles cases where secrets don't exist yet on first install (e.g., COSI-provisioned bucket secrets).

@mattia-eleuteri mattia-eleuteri force-pushed the feat/auto-propagate-storageclasses branch from 2f3d3cf to 78891d1 on February 24, 2026 at 17:54
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Feb 24, 2026
Contributor

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
hack/e2e-apps/run-kubernetes.sh (1)

271-285: ⚠️ Potential issue | 🟡 Minor

Avoid hard-coding replicated in the RWX PVC.
Auto-propagation doesn’t guarantee a replicated class; environments with different defaults will fail this test. Prefer the default propagated class (or first available) and fall back to replicated.

🔧 Proposed fix
@@
   # Test RWX NFS mount in tenant cluster (uses kubevirt CSI driver with RWX support)
+  default_sc=$(kubectl --kubeconfig "tenantkubeconfig-${test_name}" get sc \
+    -o jsonpath='{range .items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")]}{.metadata.name}{"\n"}{end}' | head -n1)
+  if [ -z "$default_sc" ]; then
+    default_sc=$(echo "$sc_names" | awk '{print $1}')
+  fi
+  default_sc=${default_sc:-replicated}
@@
-  storageClassName: replicated
+  storageClassName: ${default_sc}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hack/e2e-apps/run-kubernetes.sh` around lines 271 - 285, The PVC creation
hard-codes storageClassName: replicated which can fail if that class doesn't
exist; instead, before creating nfs-test-pvc (namespace tenant-test, name
nfs-test-pvc), detect a valid storage class by querying Kubernetes for the
default propagated class (storageclass.kubernetes.io/is-default-class=true) or
the first available storageclass and set a variable (e.g. STORAGE_CLASS) and
then use that variable in the PVC manifest; if no storageclass is found, fall
back to "replicated". Update the script around the PVC creation to compute
STORAGE_CLASS using kubectl (using the test_name kubeconfig) and substitute that
variable into the here-doc storageClassName field.
♻️ Duplicate comments (3)
packages/system/kubevirt-csi-node/templates/deploy.yaml (1)

267-267: ⚠️ Potential issue | 🟠 Major

ConfigMap data value renders null; use an explicit empty string.
infraClusterNamespace: with no value becomes null, which is invalid for ConfigMap data (expects string values).

🔧 Proposed fix
-  infraClusterNamespace:
+  infraClusterNamespace: ""

Verification (ensure the rendered manifest has no null value):

#!/bin/bash
set -euo pipefail
helm template kubevirt-csi-node packages/system/kubevirt-csi-node > /tmp/kubevirt-csi-node.yaml
if rg -n 'infraClusterNamespace:\s*$' /tmp/kubevirt-csi-node.yaml; then
  echo "infraClusterNamespace rendered as null"
  exit 1
fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/system/kubevirt-csi-node/templates/deploy.yaml` at line 267, The
ConfigMap key infraClusterNamespace in the deploy.yaml template is rendering as
null because it has no explicit value; update the template so the ConfigMap data
emits an explicit empty string for infraClusterNamespace (e.g., render the value
from .Values.infraClusterNamespace with a default of "" or otherwise ensure it
writes ""), ensuring the rendered manifest contains infraClusterNamespace: ""
rather than infraClusterNamespace: and thus avoids a null value in the ConfigMap
data.
packages/apps/kubernetes/templates/helmreleases/csi.yaml (1)

30-42: ⚠️ Potential issue | 🟠 Major

Guard lookup before accessing .items to keep fallback working.
When lookup returns nil (e.g., helm template/dry-run), .items deref errors and the replicated fallback never renders.

🔧 Proposed fix
     storageClasses:
     {{- $storageClasses := dict }}
-    {{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
+    {{- $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "" | default dict }}
+    {{- range $sc := (get $scList "items" | default list) }}

Verification (render without a live cluster and confirm no template error):

#!/bin/bash
set -euo pipefail
helm template test packages/apps/kubernetes --dry-run >/tmp/render.yaml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml` around lines 30 -
42, The template dereferences `.items` on the result of lookup which can be nil
during dry-run; update the block using lookup by assigning its result to a
variable (e.g., $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "") and
guard access with a conditional (e.g., `if $scList` and/or `if $scList.items`)
before iterating `range $sc := $scList.items`, so that when lookup returns nil
the code falls through to the existing `replicated` fallback; keep the rest of
the logic using $storageClasses, $isDefault and set unchanged.
packages/apps/kubernetes/templates/csi/driver-config.yaml (1)

6-17: ⚠️ Potential issue | 🟠 Major

Protect .items access on the lookup result.
lookup can be nil in helm template/dry-run, causing a template error and skipping the fallback allowList.

🔧 Proposed fix
-    {{- $storageClasses := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
+    {{- $scList := lookup "storage.k8s.io/v1" "StorageClass" "" "" | default dict }}
+    {{- $storageClasses := (get $scList "items" | default list) }}
     {{- if $storageClasses }}

Verification (render without a live cluster and confirm no template error):

#!/bin/bash
set -euo pipefail
helm template test packages/apps/kubernetes --dry-run >/tmp/render.yaml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/apps/kubernetes/templates/csi/driver-config.yaml` around lines 6 -
17, The template accesses .items on the lookup result without guarding for a nil
return, which fails during helm template/dry-run; update
infraStorageClassEnforcement to first capture the lookup result (the value bound
to $storageClasses or similar) and check that it is non-nil (e.g., use an if/and
or separate if) before iterating over or referencing .items so the else fallback
("replicated") is used when lookup returns nil; ensure you reference the same
variable name ($storageClasses) and protect all uses of .items from a nil
lookup.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f3d3cf and 78891d1.

📒 Files selected for processing (9)
  • hack/e2e-apps/run-kubernetes.sh
  • packages/apps/kubernetes/README.md
  • packages/apps/kubernetes/templates/csi/driver-config.yaml
  • packages/apps/kubernetes/templates/helmreleases/csi.yaml
  • packages/apps/kubernetes/values.schema.json
  • packages/apps/kubernetes/values.yaml
  • packages/system/kubernetes-rd/cozyrds/kubernetes.yaml
  • packages/system/kubevirt-csi-node/templates/deploy.yaml
  • packages/system/kubevirt-csi-node/values.yaml
💤 Files with no reviewable changes (3)
  • packages/apps/kubernetes/README.md
  • packages/apps/kubernetes/values.schema.json
  • packages/apps/kubernetes/values.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/system/kubevirt-csi-node/values.yaml

@mattia-eleuteri mattia-eleuteri force-pushed the feat/auto-propagate-storageclasses branch from b955a35 to 9883e5c on February 24, 2026 at 18:29
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2



📥 Commits

Reviewing files that changed from the base of the PR and between 78891d1 and 9883e5c.

📒 Files selected for processing (11)
  • hack/e2e-apps/run-kubernetes.sh
  • packages/apps/kubernetes/README.md
  • packages/apps/kubernetes/templates/csi/driver-config.yaml
  • packages/apps/kubernetes/templates/helmreleases/csi.yaml
  • packages/apps/kubernetes/values.schema.json
  • packages/apps/kubernetes/values.yaml
  • packages/core/platform/images/migrations/migrations/33
  • packages/core/platform/values.yaml
  • packages/system/kubernetes-rd/cozyrds/kubernetes.yaml
  • packages/system/kubevirt-csi-node/templates/deploy.yaml
  • packages/system/kubevirt-csi-node/values.yaml
💤 Files with no reviewable changes (3)
  • packages/apps/kubernetes/values.schema.json
  • packages/apps/kubernetes/README.md
  • packages/apps/kubernetes/values.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/system/kubernetes-rd/cozyrds/kubernetes.yaml

Comment on lines +157 to +163
# Verify exactly one default StorageClass is set
default_count=$(kubectl --kubeconfig "tenantkubeconfig-${test_name}" get sc \
-o jsonpath='{range .items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")]}{.metadata.name}{"\n"}{end}' | grep -c .)
if [ "$default_count" -ne 1 ]; then
echo "Expected exactly 1 default StorageClass, found $default_count" >&2
exit 1
fi
Contributor

⚠️ Potential issue | 🟡 Minor

Avoid hardcoding replicated; use the discovered default StorageClass.

With dynamic propagation, the default StorageClass name may differ, so hardcoding replicated makes the test brittle. Use the annotated default class name (and keep the “exactly one default” assertion if that’s still desired).

🛠️ Proposed fix
   # Verify exactly one default StorageClass is set
-  default_count=$(kubectl --kubeconfig "tenantkubeconfig-${test_name}" get sc \
-    -o jsonpath='{range .items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")]}{.metadata.name}{"\n"}{end}' | grep -c .)
+  default_scs=$(kubectl --kubeconfig "tenantkubeconfig-${test_name}" get sc \
+    -o jsonpath='{range .items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")]}{.metadata.name}{"\n"}{end}')
+  default_count=$(echo "$default_scs" | grep -c .)
   if [ "$default_count" -ne 1 ]; then
     echo "Expected exactly 1 default StorageClass, found $default_count" >&2
     exit 1
   fi
   echo "Default StorageClass is correctly set"
+  default_sc=$(echo "$default_scs" | head -n1)
@@
-  storageClassName: replicated
+  storageClassName: ${default_sc}

Also applies to: 271-281

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hack/e2e-apps/run-kubernetes.sh` around lines 157 - 163, The script currently
verifies there is exactly one default StorageClass using the jsonpath kubectl
call (setting default_count) but elsewhere hardcodes the name "replicated";
change the code to capture the default StorageClass name from the same jsonpath
query (e.g., into a variable like default_sc_name) after asserting default_count
== 1, and replace any hardcoded "replicated" usages with that default_sc_name
(apply the same fix for the later block around the other StorageClass check).
Ensure you keep the existing exact-one-default assertion (default_count) and use
the discovered default_sc_name for subsequent operations.

Comment on lines +59 to +65
# --- Decode current values ---
VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
if [ -z "$VALUES_YAML" ]; then
VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
fi
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find the migration file
find . -path "*/packages/core/platform/images/migrations/migrations/33" -type f

Repository: cozystack/cozystack

Length of output: 119


🏁 Script executed:

# Also search more broadly for the migrations directory
fd -t f "33" packages/core/platform/images/migrations/ 2>/dev/null | head -20

Repository: cozystack/cozystack

Length of output: 117


🏁 Script executed:

# List what's in the migrations directory
ls -la packages/core/platform/images/migrations/ 2>/dev/null || echo "Path not found"

Repository: cozystack/cozystack

Length of output: 360


🏁 Script executed:

# Read the migration script file
cat -n packages/core/platform/images/migrations/migrations/33

Repository: cozystack/cozystack

Length of output: 4619


🏁 Script executed:

# Also check the run-migrations.sh to understand the context
cat -n packages/core/platform/images/migrations/run-migrations.sh

Repository: cozystack/cozystack

Length of output: 1713


Fix kubectl jsonpath bracket notation for keys with dots.

kubectl -o jsonpath="{.data.${VALUES_KEY}}" fails when the key contains dots (default values.yaml), as jsonpath treats dots as path separators. This causes VALUES_YAML to remain empty and the migration to skip valid instances. The .stringData fallback won't help on live Secrets. The script later uses jq correctly with bracket notation at line 89-91 for patching; apply the same pattern here for reading.

🛠️ Proposed fix
-  VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-    -o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
-  if [ -z "$VALUES_YAML" ]; then
-    VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-      -o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
-  fi
+  SECRET_JSON=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" -o json 2>/dev/null || echo '{}')
+  VALUES_B64=$(echo "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.data[$key] // empty')
+  VALUES_YAML=""
+  if [ -n "$VALUES_B64" ]; then
+    VALUES_YAML=$(echo "$VALUES_B64" | base64 -d 2>/dev/null || true)
+  fi
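An alternative sketch that keeps kubectl's jsonpath instead of switching to jq: escape the dots in the key before interpolating it. The variable names below mirror the migration script's; the kubectl call is shown only as a comment since it needs a live cluster.

```shell
# Dots in a Secret data key (e.g. "values.yaml") are treated as path
# separators by kubectl's jsonpath, so they must be backslash-escaped
# before being interpolated into the expression.
VALUES_KEY="values.yaml"
ESCAPED_KEY=${VALUES_KEY//./\\.}
echo "$ESCAPED_KEY"   # prints: values\.yaml

# The escaped key can then be used directly in the jsonpath expression:
#   kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
#     -o jsonpath="{.data.${ESCAPED_KEY}}" | base64 -d
```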
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/platform/images/migrations/migrations/33` around lines 59 - 65,
The kubectl jsonpath used to read the secret value (the command that sets
VALUES_YAML using VALUES_KEY) fails when VALUES_KEY contains dots because
jsonpath treats dots as separators; update the two kubectl jsonpath expressions
that reference "{.data.${VALUES_KEY}}" and "{.stringData.${VALUES_KEY}}" to use
bracket notation that escapes the key (e.g., {.data[\"<key>\"]} and
{.stringData[\"<key>\"]}) so keys with dots are read correctly; locate the lines
that build VALUES_YAML (the kubectl -o jsonpath calls) and replace their
jsonpath patterns to use the quoted bracket form consistent with the jq patching
later.

Member

@kvaps kvaps left a comment

Thank you for this great contribution! Auto-propagating infrastructure StorageClasses to tenant clusters is a really nice improvement.

However, I see a breaking change that needs to be addressed before merging. Currently, the kubevirt-csi-node chart creates a single StorageClass named kubevirt inside tenant clusters. After this PR, it will create StorageClasses named after infrastructure classes (e.g., replicated), and the kubevirt StorageClass will disappear since it's no longer generated by the template.

This will break existing workloads in tenant clusters:

  • StatefulSets that reference storageClassName: kubevirt in their volumeClaimTemplates won't be able to scale up or create new replicas
  • Any manifests with hardcoded storageClassName: kubevirt will fail to provision new volumes
  • Already bound PVCs will continue to work, but the dangling reference is still undesirable
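
For illustration, a hypothetical tenant manifest of the kind that would break — the volumeClaimTemplates reference the StorageClass name that this PR stops generating:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example-db
spec:
  serviceName: example-db
  replicas: 3
  selector:
    matchLabels: {app: example-db}
  template:
    metadata:
      labels: {app: example-db}
    spec:
      containers:
        - name: db
          image: example/db:latest
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: kubevirt   # would no longer exist after this PR
        resources:
          requests:
            storage: 1Gi
```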

The migration script (33) only removes the storageClass key from HelmRelease values Secrets on the management cluster side. We need an additional migration step that updates the storageClassName for existing PVCs inside each tenant Kubernetes cluster.

I'd love to include this in the 1.0 release since it changes existing behavior, but we've already cut the release candidate and there isn't enough time to properly test the migration path for tenant-side PVCs. Let's target this for 1.1 instead, so we can give it the testing it deserves.

@mattia-eleuteri
Contributor Author

Thanks for the review and the thoughtful feedback!

You're absolutely right that this is a breaking change — the kubevirt StorageClass disappearing from tenant clusters would break existing workloads.

I looked into the tenant-side PVC migration you mentioned, and there's a blocker: PVC.spec.storageClassName is immutable in Kubernetes (as are StatefulSet volumeClaimTemplates). So we can't patch existing PVCs to reference the new StorageClass names.

The approach I had in mind to address this is a backwards-compatibility shim in the Helm template: after auto-discovering infra StorageClasses, the template would always inject a kubevirt entry (unless an infra SC already has that name) that maps to the default infra StorageClass. This way:

  • Existing workloads with storageClassName: kubevirt keep working
  • New workloads can use the real infra SC names
  • The kubevirt SC is non-default, so new PVCs naturally use the propagated names
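To make the shim concrete, here is a sketch of the aliased StorageClass it would render in the tenant cluster, assuming `replicated` is the infra default; the provisioner name and parameter key follow kubevirt-csi-driver conventions and are illustrative, not copied from this PR's templates:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kubevirt  # compat alias kept for pre-upgrade workloads
  # intentionally no is-default-class annotation, so new PVCs
  # naturally land on the propagated infra class names
provisioner: csi.kubevirt.io
parameters:
  infraStorageClassName: replicated  # maps to the default infra StorageClass
```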

That said, I completely agree this deserves proper testing before merging. Happy to target 1.1 as you suggested — that gives us time to validate the full migration path (compat shim + migration 33 for management-side cleanup) thoroughly.

Let me know if the compat shim approach sounds reasonable to you, and I'll keep the PR updated for when 1.1 development starts.

@kvaps
Member

kvaps commented Feb 25, 2026 via email

@mattia-eleuteri mattia-eleuteri force-pushed the feat/auto-propagate-storageclasses branch from 4865cff to 7e19133 Compare March 3, 2026 14:22
@mattia-eleuteri
Contributor Author

Hey @kvaps,
thanks for the review. I added another step to the migration script that goes in this direction.
Please give me your opinion on it whenever you have time.

We could also organize testing on our lab infra, by the way.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml`:
- Around line 31-41: The template currently builds $storageClasses from
discovered StorageClass names but doesn't add a compatibility alias for
"kubevirt", which can break provisioning when existing
volumeClaimTemplates.spec.storageClassName is set to kubevirt; update the
generation logic around $storageClasses (where lookup "storage.k8s.io/v1"
"StorageClass" is iterated and $_ := set $storageClasses is called) to ensure
you always set a "kubevirt" key mapping to an appropriate infraStorageClass
(e.g., point "kubevirt" to an existing infraStorageClass like "replicated" or
the discovered default) so lookups for kubevirt succeed during
scale-out/recreate flows.

In `@packages/core/platform/images/migrations/migrations/35`:
- Around line 42-45: The jsonpath used later to read secret data fails for keys
that contain dots; update any kubectl jsonpath expressions that reference secret
data using the VALUES_KEY variable (looks for patterns like
{.data.${VALUES_KEY}}) to use bracket notation instead (e.g.
{.data['${VALUES_KEY}']}) so dotted keys (for example "values.yaml" or other
dotted secret keys) are accessed correctly; change the jsonpath usages that
consume VALUES_KEY and ensure the VALUES_SECRET/VALUES_KEY variable names remain
unchanged.
- Around line 60-64: The JSONPath reads using kubectl with
"{.data.${VALUES_KEY}}" are unsafe for dotted secret keys (VALUES_KEY like
"values.yaml"); replace those jsonpath reads with a safe JSON read via kubectl
-o json piped into jq using an index lookup with --arg: call kubectl -n
"$NAMESPACE" get secret "$VALUES_SECRET" -o json | jq -r --arg key "$VALUES_KEY"
'.data[$key] // .stringData[$key]' and then base64-decode the result into
VALUES_YAML (same approach used for writes), updating the code paths that set
VALUES_YAML (the kubectl/jsonpath lines that reference VALUES_KEY,
VALUES_SECRET, NAMESPACE) so dotted keys are handled correctly; apply the same
fix to migrations 29 and 30.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9883e5c and 7e19133.

📒 Files selected for processing (11)
  • hack/e2e-apps/run-kubernetes.sh
  • packages/apps/kubernetes/README.md
  • packages/apps/kubernetes/templates/csi/driver-config.yaml
  • packages/apps/kubernetes/templates/helmreleases/csi.yaml
  • packages/apps/kubernetes/values.schema.json
  • packages/apps/kubernetes/values.yaml
  • packages/core/platform/images/migrations/migrations/35
  • packages/core/platform/values.yaml
  • packages/system/kubernetes-rd/cozyrds/kubernetes.yaml
  • packages/system/kubevirt-csi-node/templates/deploy.yaml
  • packages/system/kubevirt-csi-node/values.yaml
💤 Files with no reviewable changes (3)
  • packages/apps/kubernetes/README.md
  • packages/apps/kubernetes/values.yaml
  • packages/apps/kubernetes/values.schema.json
🚧 Files skipped from review as they are similar to previous changes (3)
  • packages/apps/kubernetes/templates/csi/driver-config.yaml
  • packages/system/kubevirt-csi-node/values.yaml
  • hack/e2e-apps/run-kubernetes.sh

Comment on lines +31 to +41
{{- $storageClasses := dict }}
{{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
{{- $isDefault := false }}
{{- if $sc.metadata.annotations }}
{{- $isDefault = eq (index $sc.metadata.annotations "storageclass.kubernetes.io/is-default-class" | default "false") "true" }}
{{- end }}
{{- $_ := set $storageClasses $sc.metadata.name (dict "infraStorageClass" $sc.metadata.name "default" $isDefault) }}
{{- end }}
{{- if not $storageClasses }}
{{- $_ := set $storageClasses "replicated" (dict "infraStorageClass" "replicated" "default" true) }}
{{- end }}
Contributor


⚠️ Potential issue | 🟠 Major

Add a kubevirt compatibility alias to avoid post-upgrade PVC creation failures.

Line 31–Line 41 only propagates discovered infra names. Existing immutable volumeClaimTemplates.spec.storageClassName: kubevirt can still exist; once kubevirt disappears, later scale-out/recreate flows can fail to provision PVCs.

🔧 Proposed compatibility shim
-    {{- $storageClasses := dict }}
+    {{- $storageClasses := dict }}
+    {{- $defaultInfra := "" }}
     {{- range $sc := (lookup "storage.k8s.io/v1" "StorageClass" "" "").items }}
     {{- $isDefault := false }}
     {{- if $sc.metadata.annotations }}
     {{- $isDefault = eq (index $sc.metadata.annotations "storageclass.kubernetes.io/is-default-class" | default "false") "true" }}
     {{- end }}
+    {{- if and $isDefault (eq $defaultInfra "") }}
+    {{- $defaultInfra = $sc.metadata.name }}
+    {{- end }}
     {{- $_ := set $storageClasses $sc.metadata.name (dict "infraStorageClass" $sc.metadata.name "default" $isDefault) }}
     {{- end }}
     {{- if not $storageClasses }}
     {{- $_ := set $storageClasses "replicated" (dict "infraStorageClass" "replicated" "default" true) }}
     {{- end }}
+    {{- if not (hasKey $storageClasses "kubevirt") }}
+    {{- $_ := set $storageClasses "kubevirt" (dict "infraStorageClass" ($defaultInfra | default "replicated") "default" false) }}
+    {{- end }}
     {{- $storageClasses | toYaml | nindent 6 }}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/apps/kubernetes/templates/helmreleases/csi.yaml` around lines 31 -
41, The template currently builds $storageClasses from discovered StorageClass
names but doesn't add a compatibility alias for "kubevirt", which can break
provisioning when existing volumeClaimTemplates.spec.storageClassName is set to
kubevirt; update the generation logic around $storageClasses (where lookup
"storage.k8s.io/v1" "StorageClass" is iterated and $_ := set $storageClasses is
called) to ensure you always set a "kubevirt" key mapping to an appropriate
infraStorageClass (e.g., point "kubevirt" to an existing infraStorageClass like
"replicated" or the discovered default) so lookups for kubevirt succeed during
scale-out/recreate flows.

Comment on lines +42 to +45
VALUES_SECRET=$(kubectl -n "$NAMESPACE" get hr "$HR_NAME" -o json | \
jq -r '.spec.valuesFrom // [] | map(select(.kind == "Secret" and (.name | test("cozystack-values") | not))) | .[0].name // ""')
VALUES_KEY=$(kubectl -n "$NAMESPACE" get hr "$HR_NAME" -o json | \
jq -r '.spec.valuesFrom // [] | map(select(.kind == "Secret" and (.name | test("cozystack-values") | not))) | .[0].valuesKey // "values.yaml"')
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Locate and examine the migration script
find . -path "*/packages/core/platform/images/migrations/migrations/35" -type f



🏁 Script executed:

# Check if the file exists and examine its content
if [ -f "packages/core/platform/images/migrations/migrations/35" ]; then
  wc -l "packages/core/platform/images/migrations/migrations/35"
  echo "---"
  cat -n "packages/core/platform/images/migrations/migrations/35"
else
  echo "File not found at expected path"
  # Try to find it with a broader search
  find . -name "35" -type f 2>/dev/null | head -20
fi



🏁 Script executed:

# Search for any references to valuesFrom with cozystack-values in YAML templates
rg "valuesFrom" --type yaml -B 3 -A 3 | head -100



🏁 Script executed:

# Check for patterns of secret naming in HelmRelease specs
rg "cozystack-values" --type yaml -B 2 -A 2



Fix jsonpath bracket notation for dotted secret keys.

The selector logic (lines 43, 45) is correct—it intentionally excludes cozystack-values to find custom values secrets. However, lines 61 and 64 fail to read from the default key values.yaml because the jsonpath syntax breaks for dotted keys. Use bracket notation {.data['${VALUES_KEY}']} instead of {.data.${VALUES_KEY}} to correctly access keys with dots. This prevents the migration from silently skipping all Kubernetes instances.

🔧 Fix: Use bracket notation for dotted keys
  # --- Decode current values ---
  VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-   -o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
+   -o jsonpath="{.data['${VALUES_KEY}']}" 2>/dev/null | base64 -d 2>/dev/null || true)
  if [ -z "$VALUES_YAML" ]; then
    VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-     -o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
+     -o jsonpath="{.stringData['${VALUES_KEY}']}" 2>/dev/null || true)
  fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/platform/images/migrations/migrations/35` around lines 42 - 45,
The jsonpath used later to read secret data fails for keys that contain dots;
update any kubectl jsonpath expressions that reference secret data using the
VALUES_KEY variable (looks for patterns like {.data.${VALUES_KEY}}) to use
bracket notation instead (e.g. {.data['${VALUES_KEY}']}) so dotted keys (for
example "values.yaml" or other dotted secret keys) are accessed correctly;
change the jsonpath usages that consume VALUES_KEY and ensure the
VALUES_SECRET/VALUES_KEY variable names remain unchanged.

Comment on lines +60 to +64
VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
if [ -z "$VALUES_YAML" ]; then
VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's check if the file exists and examine the actual code
find . -type f -path "*packages/core/platform/images/migrations/migrations/35*"



🏁 Script executed:

# Also search for migration files in that general area
fd -t f "35" packages/core/platform/images/migrations/ 2>/dev/null | head -20



🏁 Script executed:

# Broader search for migration-related files
find packages/core -type f -name "35*" 2>/dev/null



🏁 Script executed:

# Check the structure of the migrations directory
ls -la packages/core/platform/images/migrations/ 2>/dev/null || echo "Directory not found at expected path"



🏁 Script executed:

# Look for any migration files with the specific code pattern
rg -l "VALUES_YAML.*\-o jsonpath" packages/core/ 2>/dev/null



🏁 Script executed:

# Read the migration file 35
cat -n packages/core/platform/images/migrations/migrations/35



🏁 Script executed:

# Also check what VALUES_KEY is set to (if at all)
grep -n "VALUES_KEY" packages/core/platform/images/migrations/migrations/35



🏁 Script executed:

# Test JSONPath with dotted keys - verify the issue exists
# Create a temporary test to show the difference

# Using kubectl dry-run to avoid cluster write
echo "=== Test 1: Unescaped dotted key (incorrect) ==="
kubectl create secret generic test-secret --from-literal='values.yaml=testdata123' --dry-run=client -o json | \
  jq -r '.data | keys'

echo ""
echo "=== Test 2: JSONPath with unescaped dot ==="
kubectl create secret generic test-secret --from-literal='values.yaml=testdata123' --dry-run=client \
  -o jsonpath='{.data.values.yaml}' 2>/dev/null || echo "(returns empty or error)"

echo ""
echo "=== Test 3: JSONPath with escaped dot ==="
kubectl create secret generic test-secret --from-literal='values.yaml=testdata123' --dry-run=client \
  -o jsonpath='{.data.values\.yaml}' 2>/dev/null

echo ""
echo "=== Test 4: Safe method using jq ==="
kubectl create secret generic test-secret --from-literal='values.yaml=testdata123' --dry-run=client -o json | \
  jq -r --arg key 'values.yaml' '.data[$key]'



🏁 Script executed:

# Let me verify the JSONPath behavior claim by checking Kubernetes documentation
# and examining if there are any tests in the repo related to secret key handling

# Search for other uses of VALUES_KEY or dotted keys in secrets in the codebase
rg -n "values\.yaml|VALUES_KEY" packages/core/platform/images/migrations/ -A 2 -B 2



🏁 Script executed:

# Check if there are other migration scripts that handle similar secret key reads
fd -t f "migrations" packages/core/platform/images/migrations/ -exec wc -l {} \; | sort -n



🏁 Script executed:

# Look for any tests or documentation about secret key escaping
fd -t f "\\.md|\\.sh" packages/core/ -exec grep -l "values\.yaml\|dotted.*key\|jsonpath.*escape" {} \;



🏁 Script executed:

# Check if there are other JSONPath patterns with dots in the codebase that might have similar issues
rg "jsonpath.*\$\{.*\}" packages/core/ -n | head -20



🏁 Script executed:

# Examine migration 30 to see if it has the same issue pattern
cat -n packages/core/platform/images/migrations/migrations/30 | sed -n '60,65p'



🏁 Script executed:

# Let me check if there's any documentation or comments in the migrations about this pattern
grep -B5 -A5 "jsonpath" packages/core/platform/images/migrations/migrations/35 | head -30



🏁 Script executed:

# Check what kubernetes documentation says about JSONPath escaping by examining any test or validation logic
rg "escape|backslash|\\\\." packages/core/platform/images/migrations/ -n



🏁 Script executed:

# Key finding: check if dotted keys are escaped elsewhere in the same script
grep -n "jsonpath.*\\\\\." packages/core/platform/images/migrations/migrations/35



🏁 Script executed:

# Check all the valuesKey defaults across migrations to confirm they use dotted keys
rg "valuesKey.*default|valuesKey.*values\\.yaml" packages/core/platform/images/migrations/ -n



Fix unsafe JSONPath key access for dotted secret keys.

Lines 60-61 and 63-64 use JSONPath {.data.${VALUES_KEY}} with VALUES_KEY defaulting to values.yaml. In JSONPath, dotted keys must be escaped; unescaped dots are interpreted as nested object access. This causes the reads to fail silently, returning empty and triggering [SKIP] Could not read values from secret, skipping the migration entirely.

The script already demonstrates awareness of this at line 120 with escaped dots (super-admin\.svc), and already uses the safe pattern for writes (lines 88-91) with jq --arg key "$VALUES_KEY". Apply the same safe approach to reads:

🔧 Proposed fix
-  VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-    -o jsonpath="{.data.${VALUES_KEY}}" 2>/dev/null | base64 -d 2>/dev/null || true)
+  SECRET_JSON=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" -o json)
+  VALUES_B64=$(echo "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.data[$key] // ""')
+  VALUES_YAML=$(printf '%s' "$VALUES_B64" | base64 -d 2>/dev/null || true)
   if [ -z "$VALUES_YAML" ]; then
-    VALUES_YAML=$(kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" \
-      -o jsonpath="{.stringData.${VALUES_KEY}}" 2>/dev/null || true)
+    VALUES_YAML=$(echo "$SECRET_JSON" | jq -r --arg key "$VALUES_KEY" '.stringData[$key] // ""')
   fi

Note: Migrations 29 and 30 contain the same vulnerable pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/core/platform/images/migrations/migrations/35` around lines 60 - 64,
The JSONPath reads using kubectl with "{.data.${VALUES_KEY}}" are unsafe for
dotted secret keys (VALUES_KEY like "values.yaml"); replace those jsonpath reads
with a safe JSON read via kubectl -o json piped into jq using an index lookup
with --arg: call kubectl -n "$NAMESPACE" get secret "$VALUES_SECRET" -o json |
jq -r --arg key "$VALUES_KEY" '.data[$key] // .stringData[$key]' and then
base64-decode the result into VALUES_YAML (same approach used for writes),
updating the code paths that set VALUES_YAML (the kubectl/jsonpath lines that
reference VALUES_KEY, VALUES_SECRET, NAMESPACE) so dotted keys are handled
correctly; apply the same fix to migrations 29 and 30.

Replace the hardcoded 'kubevirt' StorageClass in tenant clusters with
auto-discovered StorageClasses from the infrastructure cluster.

Migration 35 handles the upgrade path:
- STEP 1: Discover Kubernetes HelmReleases
- STEP 2: Remove deprecated 'storageClass' from values Secrets
- STEP 3: PVC rebinding in tenants (kubevirt → infra SC name)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Signed-off-by: mattia-eleuteri <[email protected]>
@kvaps kvaps force-pushed the feat/auto-propagate-storageclasses branch from 7e19133 to 416742f Compare March 10, 2026 11:46

Labels

enhancement New feature or request size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[kubernetes] Auto-propagate infrastructure StorageClasses to tenant clusters

2 participants