
Better support for Helm installs #11341

Open
aceeric opened this issue May 2, 2024 · 7 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. needs-priority needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@aceeric

aceeric commented May 2, 2024

When an application is installed from a Helm chart, Helm renders the chart into a list of manifests and simply sends them to the cluster, leaving it to Kubernetes to handle. The following scenario currently prevents the chart from deploying successfully because of the Nginx admission webhook (a minimal sketch of the two manifests follows the list):

  1. Manifest A is a Job that creates a Secret in the cluster.
  2. Manifest B is an Ingress with annotation nginx.ingress.kubernetes.io/auth-tls-secret: secret-created-by-job.
  3. The Job manifest and the Ingress manifest go into the cluster at essentially the same time.
  4. The Nginx ingress admission webhook sees that the nginx.ingress.kubernetes.io/auth-tls-secret annotation references a secret that does not exist yet, and so the webhook rejects the Ingress.
  5. The Helm chart then does not fully deploy (no ingresses) and so the app is unusable.
  6. A second later the Job completes and the secret is present.
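
A minimal sketch of the two manifests described above, applied with kubectl here just to keep the example self-contained (the Job image, command, service account, hostname, and backend service name are invented for illustration; the annotation and secret name are from the scenario):

# Manifest A: a Job that creates the Secret the Ingress will reference.
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: create-auth-secret
spec:
  template:
    spec:
      serviceAccountName: secret-writer   # assumed to have RBAC to create Secrets
      restartPolicy: Never
      containers:
        - name: create-secret
          image: bitnami/kubectl
          command: ["sh", "-c", "kubectl create secret generic secret-created-by-job --from-literal=ca.crt=placeholder"]
EOF

# Manifest B: an Ingress whose annotation references that Secret (value format is namespace/name).
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    nginx.ingress.kubernetes.io/auth-tls-secret: default/secret-created-by-job
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80
EOF

When both manifests land at essentially the same time, the second apply is rejected by the admission webhook because the Secret does not exist yet.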

I can think of a few ways to address this:

  1. Introduce configurable (via cmdline) exponential backoff to the admission webhook to retry all ingresses
  2. Introduce an annotation to the ingress that causes the admission webhook to retry that specific ingress
  3. Introduce an annotation to the ingress that causes the admission webhook to not validate that specific ingress

If you think any of these approaches are valid I would be happy to look into submitting a PR. Thanks.
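
For illustration, a hypothetical sketch of what options 2 and 3 could look like on the Ingress. These annotation names do not exist in ingress-nginx today; they are only meant to make the proposal concrete:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
  annotations:
    nginx.ingress.kubernetes.io/auth-tls-secret: default/secret-created-by-job
    # Option 2 (hypothetical): ask the admission webhook to retry validation of
    # this Ingress with backoff before rejecting it.
    nginx.ingress.kubernetes.io/validation-retry: "true"
    # Option 3 (hypothetical): skip admission webhook validation for this
    # specific Ingress entirely.
    nginx.ingress.kubernetes.io/skip-validation: "true"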

@aceeric aceeric added the kind/feature Categorizes issue or PR as related to a new feature. label May 2, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label May 2, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@longwuyuan
Contributor

longwuyuan commented May 2, 2024

  • Wait for other comments
  • Automation that bundles this controller inside it is out of scope for the controller code
  • But example docs for using the chart in different ways are welcome

One example of handling this use case shows how to simplify usage instead of changing the controller code:

echo "deploying prometheus.."
helm upgrade --install prometheusgrafana kube-prometheus-stack \
    --repo https://prometheus-community.github.io/helm-charts \
    --namespace observability \
    --create-namespace \
    --set prometheus.prometheusSpec.podMonitorSelectorNilUsesHelmValues=false \
    --set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false

echo "creating ingress-nginx ns"
kubectl create ns ingress-nginx

echo "creating default-ssl-certificate for ingress-controller"
kubectl -n ingress-nginx create secret tls wildcard.domain.com \
    --cert ~user/letsencrypt/domain.com/fullchain1.pem \
    --key ~user/letsencrypt/domain.com/privkey1.pem

echo "deploying ingress-nginx controller.."
helm upgrade --install ingress-nginx ingress-nginx \
    --repo https://kubernetes.github.io/ingress-nginx \
    --namespace ingress-nginx \
    --set controller.metrics.enabled=true \
    --set controller.metrics.serviceMonitor.enabled=true \
    --set controller.metrics.serviceMonitor.additionalLabels.release="prometheusgrafana" \
    --set controller.service.externalTrafficPolicy=Local \
    --set controller.extraArgs.default-ssl-certificate=ingress-nginx/wildcard.domain.com

echo "creating ingress for grafana.."
ingressnginxPodName=`kubectl -n ingress-nginx get po | grep -i ingress-nginx-controller | cut -f1 -d" "`
kubectl -n ingress-nginx wait --for=condition=Ready pod/$ingressnginxPodName --timeout 120s
kubectl -n observability create ingress grafana \
    --class nginx \
    --rule grafana.domain.com/"*"=prometheusgrafana:80,tls

This example shows how to integrate multiple charts, instead of trying to change the chart itself.

@aceeric
Author

aceeric commented May 2, 2024

Thanks - this doesn't address the original question, which is how to prevent the webhook from rejecting Ingresses when an Ingress references a Secret created by a Job and both the Ingress and the Job are deployed in one Helm chart.


github-actions bot commented Jun 2, 2024

This is stale, but we won't close it automatically, just bear in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach out in #ingress-nginx-dev on Kubernetes Slack.

@github-actions github-actions bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jun 2, 2024
@bpsizemore

Throwing support in for this issue. When bootstrapping new clusters, this is a problem I come across frequently: a Helm install leaves ingress-nginx in a bad state and prevents our GitOps solution from properly bringing up clusters. I'm looking into manual alternatives to solve the problem now, but would prefer to be able to use the chart as intended.

@github-actions github-actions bot removed the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Jul 10, 2024
@aceeric
Author

aceeric commented Aug 21, 2024

All - is there any way to elevate this? Thanks!

@longwuyuan
Contributor

The current status is an acute shortage of developer time, a pivot to improving security, and a focus on sticking closer to the KEP for the Kubernetes Ingress API spec.

I agree that the kind of changes proposed in the description of this issue would be extremely useful. But the instrumentation proposed here is best suited as code that resides outside this project. While the changes would help the use case described, please do not expect any project resources to be spent on making them.
