Helm upgrade failed for release, <HELM CHART> has no deployed releases #4614

Open
1 task done
Nathan-Nesbitt opened this issue Feb 17, 2024 · 6 comments

Nathan-Nesbitt commented Feb 17, 2024

Describe the bug

Here is the config file I use to create the rook-ceph-cluster release, as defined here. I can deploy the rook-ceph Helm chart with almost the same layout, but the rook-ceph-cluster chart fails.

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
spec:
  chart:
    spec:
      chart: rook-ceph-cluster
      version: 1.13.x
      sourceRef:
        kind: HelmRepository
        name: rook-ceph
        namespace: flux-system
  interval: 30m
  timeout: 10m
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: -1
    crds: CreateReplace
  releaseName: rook-ceph-cluster
  values:
    # ALL THE VALUES FROM values.yaml

This is the helm repository file:

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
  name: rook-ceph
  namespace: flux-system
spec:
  interval: 15m
  url: https://charts.rook.io/release

When I go to deploy, I get the following error, and I cannot for the life of me figure out why.

Helm upgrade failed for release rook-ceph/rook-ceph-cluster with chart [email protected]: "rook-ceph-cluster" has no deployed releases

I can create it manually from the cli using helm by running the following:

helm repo add rook https://charts.rook.io/release
helm install my-rook-ceph-cluster rook/rook-ceph-cluster --version 1.13.4

So it's not an issue with the helm repository, or with rook.

Any ideas?

Steps to reproduce

  1. Create a K8s cluster
  2. Create a setup based on the example git repo
  3. Try to install rook-ceph using the above config (1/2 files, there is a lot of setup for this, but the other files do not cause an error)

Expected behavior

It should just run as expected 🤷

Screenshots and recordings

No response

OS / Distro

Ubuntu -- Jammy

Flux version

flux version 2.2.3

Flux check

► checking prerequisites
✔ Kubernetes 1.27.10+rke2r1 >=1.26.0-0
► checking version in cluster
✔ distribution: flux-v2.2.3
✔ bootstrapped: true
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.37.4
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v1.2.2
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v1.2.4
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v1.2.4
► checking crds
✔ alerts.notification.toolkit.fluxcd.io/v1beta3
✔ buckets.source.toolkit.fluxcd.io/v1beta2
✔ gitrepositories.source.toolkit.fluxcd.io/v1
✔ helmcharts.source.toolkit.fluxcd.io/v1beta2
✔ helmreleases.helm.toolkit.fluxcd.io/v2beta2
✔ helmrepositories.source.toolkit.fluxcd.io/v1beta2
✔ kustomizations.kustomize.toolkit.fluxcd.io/v1
✔ ocirepositories.source.toolkit.fluxcd.io/v1beta2
✔ providers.notification.toolkit.fluxcd.io/v1beta3
✔ receivers.notification.toolkit.fluxcd.io/v1
✔ all checks passed

Git provider

No response

Container Registry provider

No response

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

Nathan-Nesbitt commented Feb 17, 2024

The only related thing I found is that I had similar issues with another Helm repo, and the only way I could fix it was to nuke the entire Flux bootstrap and start over. That said, that's not really practical for anything beyond initial setup... is there some other way to force-reset Helm (if that is actually the issue)?

This hasn't worked for this chart, it has been stuck like this for a while now.

98jan commented Feb 17, 2024

Hi,

what I can see is that the HelmRepository manifest is named "rook-ceph", but in the HelmRelease the referenced HelmRepository name is "rook".

What often helps me is to run the following command:
kubectl describe HelmRelease rook-ceph-cluster -n rook-ceph

@Nathan-Nesbitt

@98jan thanks for the reply, I must have changed that last night while playing around, unfortunately it's still stuck with the error.

Nathan-Nesbitt commented Feb 17, 2024

The interesting part to me is that the error has to do with releases, not the chart itself. To me that suggests the release is the issue, since it does find v1.13.14 in the Helm repository even though that exact version is not specified.

To me this looks like the release name is somehow wrong/non-existent? Even tho it is specified and should be the same as the name of the HelmRelease from above?

I think this has to do with the application of the values to the releaseName.

What is also interesting is any time I try to force a reconcile it results in the following:

► annotating HelmRelease rook-ceph-cluster in rook-ceph namespace
✔ HelmRelease annotated
◎ waiting for HelmRelease reconciliation
✗ context deadline exceeded
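
One thing that may be worth trying when a forced reconcile times out is suspending and then resuming the HelmRelease, which asks the controller to start a fresh reconciliation. A minimal sketch, assuming the flux CLI is on the PATH; the function name and the `FLUX` override variable are hypothetical conveniences, not part of Flux:

```shell
#!/bin/sh
# Suspend and then resume a HelmRelease so the controller reconciles it afresh.
# FLUX can be overridden (an assumption of this sketch, handy for dry runs).
restart_reconcile() {
  name=$1
  ns=$2
  "${FLUX:-flux}" suspend helmrelease "$name" -n "$ns" &&
    "${FLUX:-flux}" resume helmrelease "$name" -n "$ns"
}

# Usage against a live cluster (not executed here):
#   restart_reconcile rook-ceph-cluster rook-ceph
```

This does not touch the release data Helm has stored, so it may not help if that stored state is what is broken.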

@Nathan-Nesbitt

Yeah there's something very unusual going on, nuked the node and recreated everything and it's now working:

True    Helm install succeeded for release rook-ceph/rook-ceph-cluster.v1 with chart [email protected]

No change to the charts or anything, just started working after I deleted everything...

Is there any general advice on how to debug issues like this? It seems like general flakiness, but it's not clear why this worked or even what the original issue actually was.
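
As a possible starting point for that kind of debugging, the sketch below gathers what the HelmRelease object, Flux, and Helm's own storage each report for a release, so they can be compared side by side. It assumes kubectl, flux, and helm are installed and pointed at the cluster; the function name and the `KUBECTL`/`FLUX`/`HELM` override variables are hypothetical:

```shell
#!/bin/sh
# Collect state for a stuck HelmRelease. Binaries can be overridden via
# KUBECTL, FLUX and HELM (an assumption of this sketch, handy for dry runs).
debug_helmrelease() {
  name=$1
  ns=$2
  # Conditions and recent events recorded on the HelmRelease object.
  "${KUBECTL:-kubectl}" describe helmrelease "$name" -n "$ns"
  # Flux's view of readiness for HelmReleases in the namespace.
  "${FLUX:-flux}" get helmreleases -n "$ns"
  # What Helm storage itself holds, including failed or pending releases.
  "${HELM:-helm}" list --all -n "$ns"
  "${HELM:-helm}" history "$name" -n "$ns"
}

# Usage against a live cluster (not executed here):
#   debug_helmrelease rook-ceph-cluster rook-ceph
```

If Helm lists no release under that name while Flux keeps attempting an upgrade, that mismatch would be consistent with the "has no deployed releases" error.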

anakineo commented Apr 7, 2024

Not exactly the same situation: I hit the same error while porting a release that had been installed with helm install so that it would be managed by Flux.

The fix was to delete the Helm release secret.

I looked into the helm-controller code base, and the path I found that could lead to this error is when helm-controller performs an upgrade but the releaseName doesn't exist:

  1. On every reconcile, helm-controller tries to determine the state of the current release and then calculates the next action for that state
  2. In my case, I think one of helm-controller's ReleaseStatusOutOfSync, ReleaseStatusUnmanaged, or ReleaseStatusFailed states was returned, at least at the time helm-controller checked the status of the HelmRelease
  3. Based on the returned state, an Upgrade action was returned
	case ReleaseStatusUnmanaged:
		log.Info(msgWithReason("release not managed by controller", state.Reason))

		// Clear the history as we can no longer rely on it.
		req.Object.Status.ClearHistory()

		return NewUpgrade(r.configFactory, r.eventRecorder), nil
  4. Upgrade then calls action.Upgrade, which is a wrapper around the Helm upgrade action
func Upgrade(ctx context.Context, config *helmaction.Configuration, obj *v2.HelmRelease, chrt *helmchart.Chart,
	vals helmchartutil.Values, opts ...UpgradeOption) (*helmrelease.Release, error) {
	upgrade := newUpgrade(config, obj, opts)

	policy, err := crdPolicyOrDefault(obj.GetUpgrade().CRDs)
	if err != nil {
		return nil, err
	}
	if err := applyCRDs(config, policy, chrt, setOriginVisitor(v2.GroupVersion.Group, obj.Namespace, obj.Name)); err != nil {
		return nil, fmt.Errorf("failed to apply CustomResourceDefinitions: %w", err)
	}

	return upgrade.RunWithContext(ctx, release.ShortenName(obj.GetReleaseName()), chrt, vals.AsMap())
}
  5. Inside the Helm upgrade, a pre-upgrade check returns the "has no deployed releases" error if no release with that name is found
	// finds the last non-deleted release with the given name
	lastRelease, err := u.cfg.Releases.Last(name)
	if err != nil {
		// to keep existing behavior of returning the "%q has no deployed releases" error when an existing release does not exist
		if errors.Is(err, driver.ErrReleaseNotFound) {
			return nil, nil, driver.NewErrNoDeployedReleases(name)
		}
		return nil, nil, err
	}

This seems unlikely to happen, as helm-controller did check the release when determining the state. I don't know the helm package well enough to tell, but my guess is that the Helm state changed between the moment Flux determined the state and the moment Helm checked it to perform the upgrade, so that the release no longer existed.

Deleting the Helm secret worked because helm-controller then determines that the release is not installed and performs an install instead of an upgrade.
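
For reference, Helm 3 with its default Secret storage driver keeps each release revision in a Secret named `sh.helm.release.v1.<release>.v<revision>`, labeled `owner=helm` and `name=<release>`. A minimal sketch of locating and deleting such a Secret, assuming the release name and namespace from the original report; the helper function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helpers; release and namespace are taken from the report above.

# Name of the Secret holding one revision of a release (Helm 3 Secret driver).
helm_release_secret_name() {
  # $1 = release name, $2 = revision number
  printf 'sh.helm.release.v1.%s.v%s\n' "$1" "$2"
}

# Label selector matching every revision Secret of a release.
helm_release_selector() {
  # $1 = release name
  printf 'owner=helm,name=%s\n' "$1"
}

# Usage against a live cluster (not executed here):
#   kubectl get secrets -n rook-ceph -l "$(helm_release_selector rook-ceph-cluster)"
#   kubectl delete secret -n rook-ceph "$(helm_release_secret_name rook-ceph-cluster 1)"
```

Deleting these Secrets removes Helm's record of the release but not the Kubernetes resources it created, so it is probably safest to list them first and to suspend the HelmRelease while cleaning up.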
