-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CORS-3483: Update CAPI and CAPZ versions to set Machine DisableExtensionOperations #8627
Conversation
/cc @jhixson74 , @patrickdillon, @rna-afk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somehow changing CAPA and breaking the build:
pkg/asset/machines/aws/awsmachines.go:119:21: awsMachine.Spec.ElasticIPPool undefined (type "sigs.k8s.io/cluster-api-provider-aws/v2/api/v1beta2".AWSMachineSpec has no field or method ElasticIPPool)
pkg/asset/machines/aws/awsmachines.go:119:43: undefined: capa.ElasticIPPool
pkg/asset/machines/aws/awsmachines.go:121:47: undefined: capa.PublicIpv4PoolFallbackOrderAmazonPool
# github.com/openshift/installer/pkg/asset/machines/aws
# [github.com/openshift/installer/pkg/asset/machines/aws]
vet: pkg/asset/machines/aws/awsmachines.go:119:21: awsMachine.Spec.ElasticIPPool undefined (type v1beta2.AWSMachineSpec has no field or method ElasticIPPool)
Also capz is failing to build:
|
sigs.k8s.io/yaml v1.4.0 // indirect | ||
) | ||
|
||
replace sigs.k8s.io/cluster-api => sigs.k8s.io/cluster-api v1.6.3 | ||
replace sigs.k8s.io/cluster-api => sigs.k8s.io/cluster-api v1.7.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be safe, this should never be higher than the version of capi we're building (1.7.0 as of today).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updating CAPI version at top level go.mod and cluster-api/cluster-api/go.mod.
But not updating it for other CAPI providers.
The version of CAPZ we are pulling in uses CAPI 1.7.3. I am not sure of the consequences if we just bumped CAPZ version and not CAPI version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If CAPZ uses a new feature in 1.7.3, then running it against an older capi could cause runtime failures. I think what you're doing now is the correct approach. Let's just make sure to trigger jobs for all other providers and make sure they work with capi 1.7.3.
/test ? |
@r4f4: The following commands are available to trigger required jobs:
The following commands are available to trigger optional jobs:
Use
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/test altinfra-e2e-aws-ovn altinfra-e2e-azure-capi-ovn altinfra-e2e-gcp-capi-ovn altinfra-e2e-ibmcloud-capi-ovn altinfra-e2e-nutanix-capi-ovn altinfra-e2e-openstack-capi-ovn altinfra-e2e-powervs-capi-ovn altinfra-e2e-vsphere-capi-ovn |
|
@sadasu: This pull request references CORS-3483 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.17.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/jira refresh |
@sadasu: This pull request references CORS-3483 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.17.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/cc |
|
It'd be good to double check if it's a concern. |
It's a LGTM if the changes are not negatively impacting azure. |
looking at the alintfra-e2e-azure-capi-ovn job infrastructure is not becoming ready. I would have expected this to fail at waiting for machines, although looking at the job history, some previous jobs were actually able to complete installs. So seems like something is off in this PR to fail at this stage. |
Looking at the failure seen in this PR in isolation (not comparing it with previous failures), I see this:
So, this error is generated here: kubernetes/client-go@147848c |
Yup. This is definitely an issue. I spent some time debugging it but am not familiar enough with cluster-api to get to the bottom of it at this time. It looks like things aren't caching correctly or in time. I decided to try and pinpoint when the changes took place that broke this and determined that to be between v1.6.6 and v1.7.0. Unless there is a need to go to v1.7.0 right now, it would be nice to get this in at v1.6.6 and debug this later when we have more time ;-) |
Looking at this azure run from this PR I don't see that error line even though it's using capi 1.7.0. I would expect the issue to be between From what I can tell, the error happens when running the azure controllers, specifically the
That controller only runs if the machinePool feature gate is The worrisome part about this log is:
So azure controllers are shutting down (via a context cancel?) and nothing progresses for 13min until the Installer shuts envtest down at the 15min network infra timeout . |
Setting the DisableExtensionOperations to true to reduce the chances of seeing timeout issues while provisioning VMs for Azure using CAPZ.
/test altinfra-e2e-azure-capi-ovn |
I just tested this locally and it works!
Let's try to get this in as soon as possible, because it does not look fun to rebase this... /approve |
/retest-required |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: patrickdillon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test altinfra-e2e-gcp-capi-ovn altinfra-e2e-ibmcloud-capi-ovn altinfra-e2e-nutanix-capi-ovn altinfra-e2e-openstack-capi-ovn altinfra-e2e-powervs-capi-ovn altinfra-e2e-vsphere-capi-ovn |
/test e2e-aws-ovn e2e-vsphere-ovn |
azure install has succeeded. just waiting for e2e tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/retest-required |
Confirmed here as well. |
/retest-required |
/skip |
@sadasu: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
[ART PR BUILD NOTIFIER] This PR has been included in build ose-installer-altinfra-container-v4.17.0-202406220812.p0.g27d9113.assembly.stream.el9 for distgit ose-installer-altinfra. |
Updating CAPI to version v1.7.0 and CAPZ to v1.15.1-0.20240617212811-a52056dfb88c
The CAPZ version bump bring in kubernetes-sigs/cluster-api-provider-azure#4792
And set DisableExtensionOperations to true.
Co-authored-by: Aditya Narayanaswamy [email protected]