Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

e2e: adding testcases with recipe #1736

Draft
wants to merge 24 commits into
base: main
Choose a base branch
from
Draft

Conversation

asn1809
Copy link
Member

@asn1809 asn1809 commented Dec 30, 2024

Fixes #1698 partially which adds following cases:

  1. E2E with recipe but no hooks
  2. E2E with recipe and hooks

e2e/config.yaml Outdated Show resolved Hide resolved
Signed-off-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Annaraya Narasagond <[email protected]>
Copy link
Member

@nirs nirs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recipe is part of the workload, not a way to deploy the workload.

I think we can add options for workload to create a recipe with optional hook, and a way to select these workloads only for discoveredapps deployer. For example we can have IsDiscovered() for workload which will make it run only with discovered apps deployer.

}

log.Info("recipe created on both dr clusters")
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be part of the workload, not of the deployer.

if err := deleteRecipe(util.Ctx.C2.Client, recipeName, appNamespace); err != nil {
return err
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, part of the workload.


return nil
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

functions for creating recipes should move to recipes package or maybe workloads since recipes are part of a workload.

appset = &deployers.ApplicationSet{}
discoveredApps = &deployers.DiscoveredApp{}
discoveredAppsWithoutHook = &deployers.DiscoveredApp{IncludeRecipe: true, IncludeHooks: false}
discoveredAppsWithHook = &deployers.DiscoveredApp{IncludeRecipe: true, IncludeHooks: true}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to keep discoveredApps deployer and add workloads including a recipe with/without hooks.

Also we don't want to add so many tests - this add 2 new deployers, which adds 4 new tests. Why do w need to test recipe without hook and recipe with hook with both rbd and ceph storage? Does recipe/hooks care about the storage which is handled by ramen anyway?

if err := recipe.AddToScheme(scheme); err != nil {
return err
}

return ramen.AddToScheme(scheme)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets split the part adding recipe support to another commit or PR. This is preparation for adding recipes.

@asn1809 asn1809 marked this pull request as draft January 22, 2025 17:42
raghavendra-talur and others added 20 commits January 22, 2025 23:14
and remove the config.yaml from the git repo. This way users don't have
to revert their changes to the config file before commiting changes.

Signed-off-by: Raghavendra Talur <[email protected]>
VolSync protection can now be enabled for any PVC type by adding a
specific annotation to the DRPC.

Signed-off-by: Benamar Mekhissi <[email protected]>
Previously, temporary PVCs for all storage types were being created with the
ReadOnlyMany access mode. This commit ensures that only CephFS temporary PVCs
will use the ReadOnlyMany access mode.

Additionally, the need for a special storage class tagged for backingSnapshot
has been removed. Instead, the same storage class used for ReadWriteMany will
now be used to create temporary ReadOnlyMany PVCs, simplifying the storage
configuration.

Signed-off-by: Benamar Mekhissi <[email protected]>
For non-CephFS PVCs, temporary PVCs now inherit the accessModes of the
source PVC rather than using ReadOnlyMany.

Signed-off-by: Benamar Mekhissi <[email protected]>
For consistency groups, temporary PVCs will now be created at the scheduled time
rather than immediately upon fresh deployment.

Signed-off-by: Benamar Mekhissi <[email protected]>
Signed-off-by: Benamar Mekhissi <[email protected]>
Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Found by Zed typos extension. Not sure when these were introduced, but
my editor started to complains about them now.

Signed-off-by: Nir Soffer <[email protected]>
- uncomplete -> incomplete
- plcement -> placement
- exptected -> expected

There were many instances of the same typos since lot of tests code is
duplicated.

Signed-off-by: Nir Soffer <[email protected]>
In processForDeletion, we cleanup the resources that are related to
volsync. It relies on v.volSyncPVCs but the list is empty when the vrg
is secondary because it is populated by looking at the
vrg.status.ProtectedPVCs.

For managed applications, the RD,RS and Snapshots were being as a
consequence of the deletion of VRG as the VRG is the owner for them.

In the case of discovered apps, the vrg is not the owner and therefore
the RD was being left behind.

Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Co-Authored-by: Annaraya Narasagond <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
if-return has been removed from the default list of revive for a valid
reason. See the argument in mgechev/revive#843.

Comparing two snippets

1.
```
if err := v.volSyncHandler.DeleteRS(name, namespace); err != nil {
	return err
}

if err := v.volSyncHandler.DeleteRD(name, namespace); err != nil {
	return err
}

if err := v.volSyncHandler.DeleteSnapshots(namespace); err != nil {
	return err
}

return nil
```

2.
```
if err := v.volSyncHandler.DeleteRS(name, namespace); err != nil {
	return err
}

if err := v.volSyncHandler.DeleteRD(name, namespace); err != nil {
	return err
}

return v.volSyncHandler.DeleteSnapshots(namespace)
```

1 is a lot easier to read compared to 2 but if-return throws an error
for it.

Signed-off-by: Raghavendra Talur <[email protected]>
Signed-off-by: Raghavendra Talur <[email protected]>
Add socket_vment instructions tailored for drenv, that can be copied
and pasted in the shell for easy installation.

- Recommend latest Lima version
- Recommend installing socket_vmnet from binary package instead of source
- Add copyable instructions for installing socket_vmnet binary package
- Replace the note about brew with note about installing the launchd
  service. The docs already warn about brew.

Signed-off-by: Nir Soffer <[email protected]>
Gathering typically takes less than 15 seconds, but we have seen one
case when it was stuck for 5 hours[1]. In this case the job was
terminated without archiving the logs and we have no way to debug the
issue.

Add a 3 minutes timeout to the gather job. If the command times out, we
kill it and archive what we got. This is likely to collect enough logs
to help debugging the original failure and the secondary gather failure
(using gather.log).

[1] https://github.com/RamenDR/ramen/actions/runs/12712182155/job/35437188040

Signed-off-by: Nir Soffer <[email protected]>
Generated using:

    % GOTOOLCHAIN=go1.22.9 go get github.com/csi-addons/[email protected]
    go: upgraded go 1.22.5 => 1.22.7
    go: upgraded toolchain go1.22.7 => go1.22.9
    go: upgraded github.com/csi-addons/kubernetes-csi-addons v0.10.1-0.20240924092040-c11db0b867a1 => v0.11.0
    go: upgraded google.golang.org/protobuf v1.35.1 => v1.35.2
    go: upgraded k8s.io/api v0.31.1 => v0.31.2
    go: upgraded k8s.io/apimachinery v0.31.1 => v0.31.2
    go: upgraded sigs.k8s.io/controller-runtime v0.19.0 => v0.19.1

    % go mod tidy
    go: downloading github.com/sagikazarmark/locafero v0.4.0
    go: downloading golang.org/x/tools v0.28.0
    go: downloading github.com/frankban/quicktest v1.14.6
    go: downloading github.com/google/pprof v0.0.0-20241210010833-40e02aabc2ad
    go: downloading github.com/sourcegraph/conc v0.3.0

Boris says we have now go 1.22.9 on the downstream build machines so we
can safely use it.

We need to add a GOTOOLCHAIN makefile variable later to make this
simpler for future updates.

Signed-off-by: Nir Soffer <[email protected]>
I copied the timeout_minutes which is correct for the retry action, but
the builtin timeout is timeout-minutes. The PR was merge too quickly
without waiting the CI.

Signed-off-by: Nir Soffer <[email protected]>
When using discovered apps, we check for volsync support in two places
in the ramenconfig.

- Under multiNamespace map, we look for volsyncSupported: true key value
pair
- Under volsync map, we need the key "disabled" to be set to false.

Ramenctl was configuring only the volsync map but not the multiNamespace
one.

We have to live with both the fields for now because of backward
compatibility reasons.

Signed-off-by: Raghavendra Talur <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

e2e: Add tests of check hooks for various resource types
3 participants