feat(chart): Ensure helper secret stability across upgrades #313
tidusete wants to merge 1 commit into Sentinel-One:master
Use lookup() to check whether the helper TLS certificate secret and the helper server-token secret already exist in the cluster before rendering new values. If they do, their existing data is reused verbatim so that helm upgrade, Flux reconciliations and Terraform applies no longer rotate certificates or tokens on every run. On a first install the secrets are generated as before.

The helm.sh/resource-policy: keep annotation prevents accidental deletion on helm uninstall.

The same lookup-based preservation is applied to the helper secret in webhookconfiguration.yaml (webhooks path), including the caBundle used by the MutatingWebhookConfiguration and ValidatingWebhookConfiguration.

For ArgoCD deployments, lookup() returns nil during helm-template rendering so certs are still regenerated each sync. The recommended mitigation is to add ignoreDifferences for /data on both secrets in the ArgoCD Application spec.
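The lookup-based preservation described above can be sketched roughly as follows. This is a minimal illustration, not the chart's actual template: the data key `server-token`, the 32-character length, and the use of `randAlphaNum` for generation are assumptions made for the example. Only the secret name and the `keep` annotation come from the PR text.

```yaml
# Sketch only: the data key and the generation logic are assumptions,
# not the chart's real implementation.
{{- $name := "sentinelone-helper-token" }}
{{- $existing := lookup "v1" "Secret" .Release.Namespace $name }}
apiVersion: v1
kind: Secret
metadata:
  name: {{ $name }}
  namespace: {{ .Release.Namespace }}
  annotations:
    # Keep the secret when the release is uninstalled.
    helm.sh/resource-policy: keep
type: Opaque
data:
{{- if $existing }}
  # Secret already exists in the cluster: reuse its (already base64-encoded) data verbatim.
  {{- toYaml $existing.data | nindent 2 }}
{{- else }}
  # First install, or no cluster access during rendering: generate a fresh value.
  server-token: {{ randAlphaNum 32 | b64enc | quote }}
{{- end }}
```

When rendering happens without cluster access (for example `helm template`), `lookup` yields an empty result, so the `else` branch runs and a new value is produced on every render. That is the ArgoCD behaviour the commit message describes.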
@tidusete thanks for the PR.
Hey @oded-s1, I understand the intent behind regenerating certs on upgrade, but the problem in practice is that this happens on every helm upgrade, not just cert-related ones. Because Helm applies the new secret and webhook configuration before the StatefulSet and DaemonSet rollouts complete, there is always a window where some pods are serving with the old certificate and others with the new one. This is an avoidable source of errors that makes troubleshooting harder, not easier. I think this is also the same root issue that #242 tried to address.

From a GitOps and Helm maintainability perspective, secrets should be stable and changes should be explicit and intentional. What this PR achieves is exactly that: the certificate is preserved across upgrades, so routine changes don't cause unnecessary disruption. When you actually need to rotate (whether for expiry or any other reason), you delete the secret and it will be regenerated on the next helm upgrade or ArgoCD sync. The rotation becomes a deliberate, traceable operation rather than a side effect of every deployment.
Description
This PR addresses an idempotency issue where the helper's TLS certificate (`sentinelone-helper`) and server-token (`sentinelone-helper-token`) secrets were regenerated on every `helm upgrade`, Flux reconciliation, or `terraform apply`.
Problem
Constant rotation of these secrets causes several operational issues:
Solution
This change introduces the use of the Helm `lookup()` function to check if the helper secrets already exist in the cluster before rendering them.
The `helm.sh/resource-policy: keep` annotation remains on the secrets to prevent accidental deletion during a `helm uninstall`, preserving their state for future installations if desired.
ArgoCD Considerations
Due to the timing of how ArgoCD renders templates, `lookup()` may not find the secret during the sync planning phase, so the chart will attempt to regenerate it. The recommended workaround for ArgoCD users is to add `ignoreDifferences` rules for the `/data` field on both secrets within the `Application` resource specification.
Example ArgoCD `ignoreDifferences`:
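The original snippet is not present in this excerpt. As a rough sketch of what such rules might look like, the fragment below ignores the `/data` field of both helper secrets. The Application name and namespace are hypothetical, the secret names are taken from the description above and may differ in a real release, and the `source`, `destination` and `project` fields are omitted for brevity.

```yaml
# Fragment only: metadata values are hypothetical and required spec fields are omitted.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sentinelone-agent   # hypothetical Application name
  namespace: argocd         # hypothetical namespace
spec:
  ignoreDifferences:
    # Secret is in the core API group, hence the empty group.
    - group: ""
      kind: Secret
      name: sentinelone-helper
      jsonPointers:
        - /data
    - group: ""
      kind: Secret
      name: sentinelone-helper-token
      jsonPointers:
        - /data
```

By default `ignoreDifferences` only affects how ArgoCD computes drift. For the ignored fields to also be left untouched during a sync, the `RespectIgnoreDifferences=true` sync option is likely needed as well, though whether it is depends on how the Application is configured.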