Cache image-pull-secret reads in webhook via controller-runtime informer#7353
Merged
Conversation
The embedded secret manager's mutating webhook issues two synchronous kube-apiserver Gets per pod admission to resolve image pull secrets (one against the reference namespace, one against the pod namespace for the mirrored copy). At admission scale this is unnecessary load on the API server. Replace the bare controller-runtime client.New with a cache-backed delegating client. The cache is started for the lifetime of the webhook and watches Secrets cluster-wide; reads are served from the informer, writes still go through the API server. Signed-off-by: Haytham Abuelfutuh <[email protected]>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR reduces kube-apiserver load from the embedded secret manager webhook by switching its controller-runtime client from direct API reads to cache-backed reads via a controller-runtime informer cache, primarily targeting repeated Secret Get calls during pod admission when image-pull-secret injection is enabled.
Changes:
- Create and start a controller-runtime informer cache for Secrets during webhook initialization.
- Build a controller-runtime client configured to use the informer cache for reads (while continuing to send writes directly to the API server).
- Block webhook startup until the cache reports it has synced.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+73
to
+80
| go func() { | ||
| if err := secretInformerCache.Start(ctx); err != nil { | ||
| logger.Errorf(ctx, "secret informer cache stopped: %v", err) | ||
| } | ||
| }() | ||
| if !secretInformerCache.WaitForCacheSync(ctx) { | ||
| return nil, fmt.Errorf("secret informer cache failed to sync") | ||
| } |
Comment on lines
+66
to
87
| secretInformerCache, err := ctrlcache.New(kubeConfig, ctrlcache.Options{ | ||
| Scheme: ctrlRuntimeScheme, | ||
| }) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to create informer cache: %w", err) | ||
| } | ||
|
|
||
| go func() { | ||
| if err := secretInformerCache.Start(ctx); err != nil { | ||
| logger.Errorf(ctx, "secret informer cache stopped: %v", err) | ||
| } | ||
| }() | ||
| if !secretInformerCache.WaitForCacheSync(ctx) { | ||
| return nil, fmt.Errorf("secret informer cache failed to sync") | ||
| } | ||
|
|
||
| ctrlRuntimeClient, err := client.New(kubeConfig, client.Options{ | ||
| Scheme: ctrlRuntimeScheme, | ||
| Cache: &client.CacheOptions{ | ||
| Reader: secretInformerCache, | ||
| }, | ||
| }) |
Comment on lines
+66
to
+68
| secretInformerCache, err := ctrlcache.New(kubeConfig, ctrlcache.Options{ | ||
| Scheme: ctrlRuntimeScheme, | ||
| }) |
pingsutw
approved these changes
May 7, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Tracking issue
Why are the changes needed?
The embedded secret manager's mutating webhook (
flyteplugins/go/tasks/pluginmachinery/secret/embedded_secret_manager.go) issues two synchronouskube-apiserverGetcalls per pod admission to resolve image pull secrets whenEmbeddedSecretManagerConfig.ImagePullSecrets.Enabledis set:findAndAddImagePullSecretreads the referencecorev1.Secretfrom the configured reference namespace.addImagePullSecretToPodreads the mirroredcorev1.Secretfrom the pod's namespace (and creates it on first sight).The controller-runtime client used for these reads is constructed via bare
client.New(...), which is a direct (non-cached) client. Every admission round-trips to the API server, which is unnecessary load at scale (high pod churn across many namespaces).What changes were proposed in this pull request?
Replace the bare
client.Newinflyteplugins/go/tasks/pluginmachinery/secret/secrets_injector.gowith a cache-backed delegating client:controller-runtimeinformer cache is created and started for the lifetime of the webhook.Createof the mirrored secret) still go through the API server.WaitForCacheSyncblocks startup until the informer is ready, so the first admission isn't a guaranteed cache miss.The existing in-process
SecretValuecache (stdlibCache) is unrelated and is left unchanged — it caches values pulled from external secret managers (AWS / GCP / Azure / Vault), not the k8sSecretobjects that back the image pull.How was this patch tested?
go build ./flyteplugins/go/tasks/pluginmachinery/secret/... ./executor/...— clean.go vet ./flyteplugins/go/tasks/pluginmachinery/secret/... ./executor/...— clean.go test ./flyteplugins/go/tasks/pluginmachinery/secret/... -count=1— all pass.Manual verification of the runtime path (informer behavior under list/watch on Secrets,
WaitForCacheSyncblocking startup, informer-driven invalidation when reference secrets change) is recommended in a staging cluster before promoting.Labels
Check all the applicable boxes
Related PRs
Companion PR in the v1 fork: unionai/flyte#969