AWX Upgrade to 24.2.0 failed with DB migration errors #2001

Open
3 tasks done
zpltn opened this issue Dec 17, 2024 · 0 comments

zpltn commented Dec 17, 2024

Please confirm the following

  • I agree to follow this project's code of conduct.
  • I have checked the current issues for duplicates.
  • I understand that the AWX Operator is open source software provided for free and that I might not receive a timely response.

Bug Summary

I have an AWX server running v23.6.0 (Operator version 2.10.0). When I tried to upgrade to version 24.2.0 (Operator version 2.15.0), the upgrade failed with DB migration errors during the task "Stream backup from pg_dump to the new postgresql container", because some older job events or inventory update events contain duplicate key values:

`2024-12-06 12:55:55.040 UTC [28177] ERROR: duplicate key value violates unique constraint "main_inventoryupdateevent_20240520_03_pkey"
2024-12-06 12:55:55.040 UTC [28177] DETAIL: Key (id, job_created)=(291860769, 2024-05-20 03:30:19.047628+00) already exists.
2024-12-06 12:55:55.040 UTC [28177] CONTEXT: COPY main_inventoryupdateevent_20240520_03, line 1
2024-12-06 12:55:55.040 UTC [28177] STATEMENT: COPY public.main_inventoryupdateevent_20240520_03 (id, created, modified, event_data, uuid, counter, stdout, verbosity, start_line, end_line, inventory_update_id, job_created) FROM stdin;

2024-12-06 12:55:55.164 UTC [28177] ERROR: duplicate key value violates unique constraint "main_inventoryupdateevent_20240520_04_pkey"
2024-12-06 12:55:55.164 UTC [28177] DETAIL: Key (id, job_created)=(291896889, 2024-05-20 04:30:19.039837+00) already exists.
2024-12-06 12:55:55.164 UTC [28177] CONTEXT: COPY main_inventoryupdateevent_20240520_04, line 1
2024-12-06 12:55:55.164 UTC [28177] STATEMENT: COPY public.main_inventoryupdateevent_20240520_04 (id, created, modified, event_data, uuid, counter, stdout, verbosity, start_line, end_line, inventory_update_id, job_created) FROM stdin;

2024-12-06 12:55:55.298 UTC [28177] ERROR: duplicate key value violates unique constraint "main_inventoryupdateevent_20240520_05_pkey"
2024-12-06 12:55:55.298 UTC [28177] DETAIL: Key (id, job_created)=(291933045, 2024-05-20 05:00:19.048448+00) already exists.
2024-12-06 12:55:55.298 UTC [28177] CONTEXT: COPY main_inventoryupdateevent_20240520_05, line 1
2024-12-06 12:55:55.298 UTC [28177] STATEMENT: COPY public.main_inventoryupdateevent_20240520_05 (id, created, modified, event_data, uuid, counter, stdout, verbosity, start_line, end_line, inventory_update_id, job_created) FROM stdin;`
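To get an overview of which partitions and keys are affected, the postgresql log can be summarized with a small script. The following is only a minimal sketch, assuming the log lines have the same shape as the excerpt above; the function name `find_duplicate_keys` and the `SAMPLE` text are illustrative and not part of AWX or the operator.

```python
import re

# Captures the violated constraint name from an ERROR line.
CONSTRAINT_RE = re.compile(
    r'ERROR:\s+duplicate key value violates unique constraint "([^"]+)"'
)
# Captures the key columns and key values from the DETAIL line that follows.
DETAIL_RE = re.compile(r'DETAIL:\s+Key \(([^)]*)\)=\((.*)\) already exists\.')


def find_duplicate_keys(log_text):
    """Return one dict per duplicate-key error found in the log text."""
    results = []
    constraint = None
    for line in log_text.splitlines():
        match = CONSTRAINT_RE.search(line)
        if match:
            constraint = match.group(1)
            continue
        match = DETAIL_RE.search(line)
        if match and constraint is not None:
            # Naive split on commas: fine for (id, job_created),
            # but it would break on key values that contain commas.
            columns = [c.strip() for c in match.group(1).split(",")]
            values = [v.strip() for v in match.group(2).split(",")]
            results.append(
                {"constraint": constraint, "key": dict(zip(columns, values))}
            )
            constraint = None
    return results


# Two entries copied from the log excerpt in this issue.
SAMPLE = """\
2024-12-06 12:55:55.040 UTC [28177] ERROR: duplicate key value violates unique constraint "main_inventoryupdateevent_20240520_03_pkey"
2024-12-06 12:55:55.040 UTC [28177] DETAIL: Key (id, job_created)=(291860769, 2024-05-20 03:30:19.047628+00) already exists.
2024-12-06 12:55:55.164 UTC [28177] ERROR: duplicate key value violates unique constraint "main_inventoryupdateevent_20240520_04_pkey"
2024-12-06 12:55:55.164 UTC [28177] DETAIL: Key (id, job_created)=(291896889, 2024-05-20 04:30:19.039837+00) already exists.
"""

for entry in find_duplicate_keys(SAMPLE):
    print(entry["constraint"], entry["key"])
```

Running it against the full postgresql log should show whether the errors are limited to a few `main_inventoryupdateevent_*` partitions or spread across other event tables as well.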

AWX Operator version

2.15.0

AWX version

24.2.0

Kubernetes platform

openshift

Kubernetes/Platform version

v1.27.16

Modifications

yes

Steps to reproduce

Run an AWX server on v23.6.0 with a large postgres-13 DB (around 200GB, due to a 15-month job retention policy).
Launch the upgrade to Operator 2.15.0 with kustomize, after updating the AWX CR to use the specific image registry.redhat.io/rhel8/postgresql-15:latest.
Near the end of the DB migration, after 197GB of 201GB have been migrated, the first "duplicate key value violates unique constraint" error appears in the postgresql logs. It is reported later in the operator logs, when the operator retries the task.

Expected results

The DB migration from postgres 13 to 15 finishes properly and the upgrade completes.

Actual results

Broken DB migration

Additional information

A specific postgresql image is in use, not the default one from sclorg:
registry.redhat.io/rhel8/postgresql-13:latest (with AWX Version: 23.6.0 and Operator Version 2.10.0)

registry.redhat.io/rhel8/postgresql-15:latest (with AWX Version: 24.2.0 and Operator Version 2.15.0)

Operator Logs

`--------------------------- Ansible Task StdOut -------------------------------

TASK [installer : Get the name of the service for the old postgres pod] ********
task path: /opt/ansible/roles/installer/tasks/upgrade_postgres.yml:65


{"level":"info","ts":"2024-12-06T09:35:22Z","logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/api/v1/namespaces/awx/services","Verb":"list","APIPrefix":"api","APIGroup":"","APIVersion":"v1","Namespace":"awx","Resource":"services","Subresource":"","Name":"","Parts":["services"]}}

--------------------------- Ansible Task StdOut -------------------------------

TASK [installer : Stream backup from pg_dump to the new postgresql container] ***
task path: /opt/ansible/roles/installer/tasks/upgrade_postgres.yml:99


{"level":"info","ts":"2024-12-06T09:35:23Z","logger":"logging_event_handler","msg":"[playbook task start]","name":"awx","namespace":"awx","gvk":"awx.ansible.com/v1beta1, Kind=AWX","event_type":"playbook_on_task_start","job":"5324155957950256468","EventData.Name":"installer : Stream backup from pg_dump to the new postgresql container"}
{"level":"info","ts":"2024-12-06T09:35:23Z","logger":"proxy","msg":"Read object from cache","resource":{"IsResourceRequest":true,"Path":"/api/v1/namespaces/awx/pods/awx-postgres-15-0","Verb":"get","APIPrefix":"api","APIGroup":"","APIVersion":"v1","Namespace":"awx","Resource":"pods","Subresource":"","Name":"awx-postgres-15-0","Parts":["pods","awx-postgres-15-0"]}}`
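To see when each task started (and when it was retried), the JSON lines in the operator log can be filtered for `playbook_on_task_start` events. This is a rough sketch, assuming the operator log mixes Ansible stdout banners with one JSON object per line as in the excerpt above; `task_starts` and the trimmed `SAMPLE` record are illustrative only.

```python
import json


def task_starts(log_text):
    """Return (timestamp, task name) pairs for playbook_on_task_start events."""
    starts = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip Ansible stdout banners and blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if record.get("event_type") == "playbook_on_task_start":
            starts.append((record.get("ts", ""), record.get("EventData.Name", "")))
    return starts


# Trimmed version of a record from the operator log in this issue.
SAMPLE = """\
TASK [installer : Stream backup from pg_dump to the new postgresql container] ***
{"level":"info","ts":"2024-12-06T09:35:23Z","logger":"logging_event_handler","msg":"[playbook task start]","event_type":"playbook_on_task_start","EventData.Name":"installer : Stream backup from pg_dump to the new postgresql container"}
{"level":"info","ts":"2024-12-06T09:35:23Z","logger":"proxy","msg":"Read object from cache"}
"""

for ts, name in task_starts(SAMPLE):
    print(ts, name)
```

If the "Stream backup from pg_dump" task shows up more than once in the output, the timestamps can be compared with those of the duplicate key errors in the postgresql log.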
