Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add update reconciliation for migration job #45

Merged
merged 11 commits into from
Aug 29, 2024

Conversation

parkedwards
Copy link
Contributor

@parkedwards parkedwards commented Aug 27, 2024

resolves #31

migration Jobs will be handled differently in that anytime a new Job spec is generated, we'll create a new, distinct Job resource. This append-only operation allows us to not worry about Deletes, as Job resources (usually) need to be replaced if there are updates to its spec

@parkedwards parkedwards force-pushed the feat/pg-db-migrations branch from 02e50f9 to 80a528b Compare August 27, 2024 01:34
Comment on lines 211 to 212
// deletes the Job, so it can be recreated on the following reconciliation.
// k8s Jobs are immutable and need to be recreated if they are to be re-run.
if foundMigrationJob.GetDeletionTimestamp() == nil {
log.Info("Deleting existing migration Job", "name", foundMigrationJob.Name)
if err = r.Delete(ctx, foundMigrationJob); err != nil {
condition = conditions.NotDeleted(objName, err)
return &ctrl.Result{}, err
}
condition = conditions.Deleted(objName)
} else {
log.Info("Waiting for migration Job to be deleted, so it can be recreated", "name", desiredMigrationJob.Name)
}
return &ctrl.Result{Requeue: true}, nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no "waiting" in the operator pattern, so we have to make this reconciliation handler idempotent. so:

  • if we detect that an update (actually, a replacement) is needed
  • we'll take the existing Job resource, see if it's been marked for deletion -> if not, delete it
  • we can't be sure of how many reconciliations it will take, so we'll assume it takes more than 1
  • we'll continue to requeue the reconciliation job until the foundMigrationJob is cleaned up, so that the very next run will create a new Job

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parkedwards parkedwards changed the title chore: add update reconciliation for migration job feat: add update reconciliation for migration job Aug 27, 2024
@parkedwards parkedwards force-pushed the feat/pg-db-migrations branch from 2eccf47 to b53bbaf Compare August 28, 2024 23:42
@parkedwards parkedwards marked this pull request as ready for review August 28, 2024 23:48
Comment on lines +1571 to +1580
_, _, desiredMigrationJob = controllerReconciler.prefectServerDeployment(prefectserver)
migrateJob = &batchv1.Job{}
Eventually(func() error {
return k8sClient.Get(ctx, types.NamespacedName{
Namespace: namespaceName,
Name: desiredMigrationJob.Name,
}, migrateJob)
}).Should(Succeed())
migrateJob.Status.Succeeded = 1
Expect(k8sClient.Status().Update(ctx, migrateJob)).To(Succeed())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind of annoying, but looking up a Job by name means i have to re-run the manifest generator every time to create the hashed name, which is based on the manifest

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I remember thinking about using .metadata.generation from the CR since that should be unique and change each time the CR changes. But not sure if there are any drawbacks to that approach.

Comment on lines +61 to +65
job.Spec.Template.Spec.Containers[0].Image = "image1:v2"
hash2, err := Hash(job.Spec, 8)
Expect(err).NotTo(HaveOccurred())

Expect(hash1).NotTo(Equal(hash2))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noice

return &batchv1.Job{
ObjectMeta: metav1.ObjectMeta{
Namespace: server.Namespace,
Name: fmt.Sprintf("%s-migration-%s", server.Name, hashSuffix),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may eventually want to trim this name (and the name of all resources, I guess) to 63 characters in case the server.Name provided is unusually long for some reason. Or we can let it fail and we'll know why - just calling it out as a familiar speedbump.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call out. might make sense to validate a char limit for the .Name on the CR
#56

case !isMigrationJobFinished(foundMigrationJob):
log.Info("Waiting on active migration Job to complete", "name", foundMigrationJob.Name)
condition = conditions.AlreadyExists(objName, fmt.Sprintf("migration Job %s is still active", foundMigrationJob.Name))
return &ctrl.Result{Requeue: true}, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From memory, this will immediately requeue the reconcile loop. If we find that it ends up looping a ton while waiting for these to finish (I'm not sure how long they usually take), we can pass another field like this:

return &ctrl.Result{Requeue,: true, RequeueAfter: 60 * time.Seconds}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah - good callout. it's probably a good idea to add a time buffer in so it doesn't spam the loop

Copy link
Contributor

@mitchnielsen mitchnielsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! looks good

Copy link
Collaborator

@chrisguidry chrisguidry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dig it! This will get even cleaner when we give it the CreateOrUpdate treatment in a future PR

@parkedwards parkedwards merged commit 221c6bc into main Aug 29, 2024
2 checks passed
@parkedwards parkedwards deleted the feat/pg-db-migrations branch August 29, 2024 16:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle the migration job for PostgreSQL-backed servers
3 participants