What are you trying to accomplish with this PR?
Currently, if an error occurs while applying a manifest at the kubectl apply call, krane swallows it: it is reported that an error occurred, but not what the error is.
This behaviour makes it infeasible to debug underlying issues, for example permission problems on the (service) account used to run krane in a CD environment.
This is not helpful, as even if a user had access to the system/environment to manually replicate the kubectl apply call, krane has already cleaned up the temp dir.
With this change:
[INFO][2020-04-20 15:44:40 +0000][staging][someapp-staging] ------------------------------------------Result: FAILURE-------------------------------------------
[FATAL][2020-04-20 15:44:40 +0000][staging][someapp-staging] Command failed: apply -f /tmp/d20200420-757-1v2j9wa --prune --all --prune-whitelist\=/v1/configmap --prune-whitelist\=/v1/endpoints --prune-whitelist\=/v1/event --prune-whitelist\=/v1/limitrange --prune-whitelist\=/v1/persistentvolumeclaim --prune-whitelist\=/v1/pod --prune-whitelist\=/v1/podtemplate --prune-whitelist\=/v1/replicationcontroller --prune-whitelist\=/v1/resourcequota --prune-whitelist\=/v1/secret --prune-whitelist\=/v1/serviceaccount --prune-whitelist\=/v1/service --prune-whitelist\=apps/v1/daemonset --prune-whitelist\=apps/v1/deployment --prune-whitelist\=apps/v1/replicaset --prune-whitelist\=apps/v1/statefulset --prune-whitelist\=autoscaling/v2beta1/horizontalpodautoscaler --prune-whitelist\=batch/v1beta1/cronjob --prune-whitelist\=batch/v1/job --prune-whitelist\=certmanager.k8s.io/v1alpha1/certificaterequest --prune-whitelist\=certmanager.k8s.io/v1alpha1/certificate --prune-whitelist\=certmanager.k8s.io/v1alpha1/challenge --prune-whitelist\=certmanager.k8s.io/v1alpha1/issuer --prune-whitelist\=certmanager.k8s.io/v1alpha1/order --prune-whitelist\=cloud.google.com/v1beta1/backendconfig --prune-whitelist\=crd.projectcalico.org/v1/networkpolicy --prune-whitelist\=extensions/v1beta1/ingress --prune-whitelist\=internal.autoscaling.k8s.io/v1alpha1/capacityrequest --prune-whitelist\=networking.gke.io/v1beta1/managedcertificate --prune-whitelist\=nodemanagement.gke.io/v1alpha1/updateinfo --prune-whitelist\=policy/v1beta1/poddisruptionbudget --prune-whitelist\=rbac.authorization.k8s.io/v1/rolebinding --prune-whitelist\=rbac.authorization.k8s.io/v1/role --prune-whitelist\=scalingpolicy.kope.io/v1alpha1/scalingpolicy:
error: error pruning namespaced object networking.gke.io/v1beta1, kind=managedcertificate: managedcertificates.networking.gke.io is forbidden: user "someserviceaccount@shopify-someapp-staging.iam.gserviceaccount.com" cannot list resource "managedcertificates" in api group "networking.gke.io" in the namespace "someapp-staging"
[FATAL][2020-04-20 15:44:40 +0000][staging][someapp-staging]
How is this accomplished?
Add the collected error to the message presented to the user.
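The idea can be sketched as follows. This is a hypothetical illustration, not krane's actual internals: the method and variable names (`record_apply_failure`, `err`, `logger`) are illustrative.

```ruby
# Hypothetical sketch: append kubectl's captured stderr to the failure
# summary instead of discarding it, so the user sees *why* apply failed
# (e.g. an RBAC "forbidden" error), not just that it failed.
def record_apply_failure(err, logger)
  message = "Command failed: apply"
  message += "\n#{err}" unless err.to_s.empty?
  logger.error(message)
  message
end
```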
What could go wrong?
Leaking sensitive information, maybe, somehow, eventually?
Error reporting is tricky in Krane, since we need to avoid the scenario where we leak secret information (this can happen if kubectl apply fails and dumps an unencrypted secret resource, for example). Krane tries to be smart about this, only suppressing output when there is a risk of exposing sensitive information, but there are fundamental difficulties that constrain our ability to guarantee correct behaviour. This logic exists in record_apply_failure (the line preceding your change). If you are not deploying any sensitive content, then the solution here is to figure out why output is being suppressed in the first place. Can you confirm whether or not output should be suppressed in your stated case? If not, we can explore that.
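The trade-off described above can be illustrated with a small sketch. This is not krane's `record_apply_failure` implementation; the method name and the `:sensitive` flag are assumptions made for the example.

```ruby
# Hypothetical illustration of conditional suppression: include the raw
# kubectl output only when none of the applied resources are marked
# sensitive, otherwise emit a redacted summary.
def safe_failure_message(raw_stderr, resources)
  if resources.any? { |r| r[:sensitive] }
    "Command failed: apply (output suppressed: sensitive resources present)"
  else
    "Command failed: apply\n#{raw_stderr}"
  end
end
```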
Thanks @timothysmith0609, this helps to clarify the situation.
While I agree that there is a fundamental difficulty in filtering secrets out of downstream kubectl invocations, I do not completely agree with the assumptions made.
Not potentially exposing secrets must be weighed against blocking users from using krane at all. In this particular case, the only option left would be to ditch krane and fall back to managing kubectl apply calls manually (via shell script or otherwise), which reintroduces the ability to shoot yourself in the foot by leaking secrets. As it stands, krane is unusable for this case.
We should at least:
1. Clearly state that additional error context is available but withheld for security reasons.
2. Offer a way to get that output, or allow an administrator to re-run the exact same kubectl command manually, i.e. not wipe the temp dirs in the error case.
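The second suggestion could look something like this sketch, which keeps the rendered-manifest temp dir around on failure so an operator can re-run kubectl by hand. This is an assumption about how such an option might work, not krane's actual behaviour; `with_manifest_dir` and `keep_on_error` are hypothetical names.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical sketch: yield a temp dir for rendered manifests, and only
# delete it afterwards if the block succeeded (or keep_on_error is false).
# On failure the dir survives, so the exact kubectl apply can be replayed.
def with_manifest_dir(keep_on_error: true)
  dir = Dir.mktmpdir("krane-apply-")
  failed = false
  begin
    yield dir
  rescue
    failed = true
    raise
  ensure
    FileUtils.remove_entry(dir) unless failed && keep_on_error
  end
end
```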
Can you confirm whether or not output should be suppressed in your stated case? If not, we can explore that.
To be honest, I don't know. We do deploy a secrets.ejson alongside our resource files with krane, so I guess yes? With my understanding of how krane splits all resources into multiple steps, this particular step's only reference to 'secret' would be in its prune whitelist.
On the other hand, I posted the complete and unmodified output above, and it does not contain anything sensitive. So maybe not?