Long import wait times in ansible-galaxy collection publish break on token expiration #70019

Open
ironfroggy opened this issue Jun 11, 2020 · 7 comments · May be fixed by #83145
Labels: affects_2.9, affects_2.16, bug, galaxy, has_pr, python3, support:core

@ironfroggy

SUMMARY

When publishing to Automation Hub, the ansible-galaxy collection publish command authenticates through Keycloak and fetches an access token for its API requests. It keeps using that same token for the lifetime of the process. During a publish, the client polls the import task endpoint repeatedly until the import reaches a failed or success status, which it then reports to the user.
The Keycloak access token expires after 15 minutes (900 seconds), and imports can sometimes take longer than that. When they do, the polling requests suddenly start failing with a 401 Unauthorized response. The import then appears to have failed, and the user never sees its actual result.

ISSUE TYPE
  • Bug Report
COMPONENT NAME

ansible-galaxy

ANSIBLE VERSION
ansible 2.9.7
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/calvin/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.7/site-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 3.7.5 (default, Oct 17 2019, 12:21:00) [GCC 8.3.1 20190223 (Red Hat 8.3.1-2)]
CONFIGURATION
DEFAULT_TIMEOUT(/etc/ansible/ansible.cfg) = 30
GALAXY_SERVER_LIST(/etc/ansible/ansible.cfg) = ['automation_hub', 'galaxy', 'galaxy_dev', 'automation_hub_ci', 'automation_h>
PERSISTENT_COMMAND_TIMEOUT(/etc/ansible/ansible.cfg) = 60
PERSISTENT_CONNECT_TIMEOUT(/etc/ansible/ansible.cfg) = 30
OS / ENVIRONMENT

Fedora 30

STEPS TO REPRODUCE

The problem occurs when the Automation Hub import workers are busy and the import has to wait in the queue before running. The best way to reproduce it is to publish multiple collections in a very short timeframe or in parallel.

EXPECTED RESULTS

Ideally, the imports would complete faster, but in the case that they take longer, the client should continue to poll the import task and update the user when the import begins and when it finally finishes.
If the token expires, the client should fetch a fresh one and continue its operation normally.
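
The expected behaviour could look roughly like the sketch below. It is not the ansible-galaxy implementation: `RefreshingToken`, `poll_import`, the `fetch` and `get_status` callables, the 30-second refresh margin, and the terminal state names are all hypothetical and only illustrate the idea of refreshing the token before it expires and retrying once after a 401.

```python
import time
from typing import Callable, Optional, Tuple


class RefreshingToken:
    """Caches an access token and re-fetches it shortly before it expires."""

    def __init__(self, fetch: Callable[[], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic,
                 margin: float = 30.0) -> None:
        self._fetch = fetch    # hypothetical: returns (token, lifetime_in_seconds)
        self._clock = clock
        self._margin = margin  # assumed safety margin before the real expiry
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self, force: bool = False) -> str:
        stale = self._clock() >= self._expires_at - self._margin
        if force or self._token is None or stale:
            self._token, lifetime = self._fetch()
            self._expires_at = self._clock() + lifetime
        return self._token


def poll_import(get_status: Callable[[str], Tuple[int, str]],
                token: RefreshingToken,
                sleep: Callable[[float], None] = time.sleep,
                terminal: Tuple[str, ...] = ("success", "failed"),
                max_polls: int = 200) -> str:
    """Poll until the import reaches a terminal state.

    get_status(access_token) -> (http_code, state) is a hypothetical callable
    standing in for the request to the import task endpoint.
    """
    wait = 2.0
    for _ in range(max_polls):
        code, state = get_status(token.get())
        if code == 401:
            # The server rejected the token: force one refresh and retry once.
            code, state = get_status(token.get(force=True))
            if code == 401:
                raise PermissionError("still unauthorized after refreshing the token")
        if state in terminal:
            return state
        sleep(wait)
        wait = min(30.0, wait * 1.5)
    raise TimeoutError("import did not reach a terminal state")
```

With this shape, a 15-minute token lifetime no longer bounds how long the client can wait, because every poll asks the cache for a token that is still valid.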

ACTUAL RESULTS
ansible-galaxy 2.9.9.post0
  config file = /tmp/tmp3e68yyhc.cfg
  configured module search path = ['/home/calvin/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/calvin/projects/ansible/lib/ansible
  executable location = /home/calvin/.local/share/virtualenvs/iqe/bin/ansible-galaxy
  python version = 3.7.5 (default, Oct 17 2019, 12:21:00) [GCC 8.3.1 20190223 (Red Hat 8.3.1-2)]
Using /tmp/tmp3e68yyhc.cfg as config file
Publishing collection artifact '/tmp/home/calvin/projects/orion-utils/orionutils/collections/skeleton/autohubtest2-collection_dep_a_wbyqmoja-1.0.0.tar.gz' to automation_hub https://ci.cloud.redhat.com/api/automation-hub/
Collection has been published to the Galaxy server automation_hub https://ci.cloud.redhat.com/api/automation-hub/
Waiting until Galaxy import task https://ci.cloud.redhat.com/api/automation-hub/v3/imports/collections/7e4d4c36-69fc-4512-8152-6fbf4adb78d4/ has completed
Galaxy import process has a status of waiting, wait 2 seconds before trying again
Galaxy import process has a status of waiting, wait 3 seconds before trying again
Galaxy import process has a status of waiting, wait 4 seconds before trying again
Galaxy import process has a status of waiting, wait 6 seconds before trying again
Galaxy import process has a status of waiting, wait 10 seconds before trying again
Galaxy import process has a status of waiting, wait 15 seconds before trying again
Galaxy import process has a status of waiting, wait 22 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
Galaxy import process has a status of waiting, wait 30 seconds before trying again
ERROR! Error when getting import task results at https://ci.cloud.redhat.com/api/automation-hub/v3/imports/collections/45891178-ebf2-4ec7-81f4-db7323464f54/ (HTTP Code: 401, Message: Unauthorized Code: Unknown)
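
The wait times in the log above are consistent with a multiplicative backoff that starts at 2 seconds, grows by a factor of 1.5, and is capped at 30 seconds, with the value truncated to whole seconds for display. That schedule is inferred from the output only, not taken from the ansible-galaxy source, and `backoff_delays` is a hypothetical helper name. A minimal sketch:

```python
from itertools import islice
from typing import Iterator


def backoff_delays(start: float = 2.0, factor: float = 1.5,
                   cap: float = 30.0) -> Iterator[float]:
    """Yield delays start, start*factor, ... without ever exceeding cap."""
    wait = start
    while True:
        yield wait
        wait = min(cap, wait * factor)


# Truncate to whole seconds, as the log messages appear to do.
shown = [int(delay) for delay in islice(backoff_delays(), 10)]
print(shown)  # [2, 3, 4, 6, 10, 15, 22, 30, 30, 30]
```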
@ansibot
Contributor

ansibot commented Jun 11, 2020

Files identified in the description:

If these files are incorrect, please update the component name section of the description or use the !component bot command.

@ansibot ansibot added affects_2.9 This issue/PR affects Ansible v2.9 bug This issue/PR relates to a bug. needs_triage Needs a first human triage before being processed. python3 support:core This issue/PR relates to code supported by the Ansible Engineering Team. labels Jun 11, 2020
@samdoran samdoran added galaxy needs_verified This issue needs to be verified/reproduced by maintainer and removed needs_triage Needs a first human triage before being processed. labels Jun 11, 2020
@samdoran
Contributor

cc @jborean93

@thedoubl3j
Member

The Tower team is also reporting this issue as of today.
cc: @kdelee @Spredzy

@jborean93
Contributor

While this should still be fixed, if the client does time out while waiting for the import with this error the import does continue in the background so the publish will continue. You will just have to manually wait for the import steps to complete.

@pabelanger
Contributor

pabelanger commented Nov 30, 2020

While this should still be fixed, if the client does time out while waiting for the import with this error the import does continue in the background so the publish will continue. You will just have to manually wait for the import steps to complete.

While that is true, passing --no-wait would be the better option. In this case, we actively want to wait for Automation Hub (AH) to return the import result. As it stands, there is effectively a 15-minute cap, and we have to race AH to get the collection processed by then.

@pabelanger
Contributor

We are sadly seeing this issue again today while publishing content from Zuul to Automation Hub.

https://dashboard.zuul.ansible.com/t/ansible/build/6f7ef275134b43c5af426720492bece2/log/job-output.txt#845

It would be very helpful to have the ansible-galaxy CLI refresh the token while dealing with long import times on Automation Hub.

As users, we have no good solution when we run into this; so far, the cause has usually been Automation Hub dealing with slow imports.

@ansibot ansibot added the has_pr This issue has an associated PR. label Oct 14, 2022
@mattclay mattclay added affects_2.16 and removed needs_verified This issue needs to be verified/reproduced by maintainer labels Nov 29, 2023
7 participants