Run OpenShift reliability/minimal test suite #456

Open
wants to merge 1 commit into
base: master

praveenkumar
Member

Rather than kubernetes/conformance, this attempts to run the
openshift reliability/minimal tests. It also adds some tests to the
ignore list which were failing during testing due to apiserver
unavailability or single-node constraints. The plan is to reduce this ignore
list and also to try adding the openshift conformance tests in the future.

This does not supersede #399; it runs a different subset of tests, which is part of OpenShift rather than plain k8s.

@openshift-ci openshift-ci bot requested review from cfergeau and gbraad July 27, 2021 14:32
@praveenkumar
Member Author

/retest

@cfergeau
Contributor

Compared to kubernetes/conformance, this removes 103 tests, and adds 718, so it's a very different suite.
However, openshift/conformance only removes one test when compared to experimental/reliability/minimal (and adds 2065, but that makes sense).
Is this why you'd favour experimental/reliability/minimal over kubernetes/conformance? The experimental naming is making me a bit reluctant with this.
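
The removed/added counts above can be reproduced by diffing the test lists of two suites. This is only a minimal sketch: it assumes each suite's test names have already been dumped, one per line, into a text file (for example from a dry run of the test binary), and both the `suite_diff` helper and the file names are hypothetical.

```shell
# Hypothetical helper: count tests dropped/added when moving from one suite to another.
# $1 = file listing the old suite's tests, $2 = file listing the new suite's tests
# (one test name per line in both files).
suite_diff() (
    old="$1"; new="$2"
    # -F: treat names as fixed strings (test names contain brackets and dots)
    # -x: match whole lines only, -v: keep lines NOT present in the other file
    removed=$(sort -u "$old" | grep -Fxv -f "$new" | wc -l)
    added=$(sort -u "$new" | grep -Fxv -f "$old" | wc -l)
    # $((...)) strips any padding that wc may add around the number
    echo "removed=$((removed)) added=$((added))"
)
```

Example call, with hypothetical file names: `suite_diff kubernetes-conformance.txt reliability-minimal.txt`.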

@cfergeau
Contributor

cfergeau left a comment

A bunch of questions, but overall looks good!

ci.sh Outdated
@@ -47,7 +47,7 @@ sudo mv out/linux-amd64/crc /usr/local/bin/
popd

Contributor

I'd expand the commit log a bit: giving more resources to the cluster we run the testsuite on should hopefully prevent some disruptions/timeouts/... caused by low-memory conditions.

snc.sh Outdated
# Remove the openshift-authenticator CSR to force a certificate regeneration
# BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1978193
retry ${OC} delete csr system:openshift:openshift-authenticator

Contributor

We made a 4.8 release without this, so I'm not sure what its impact is. Surely this deserves a PR of its own?

Member Author

@cfergeau we made the 4.8 release using this (c100967), but for 4.9 the bug was recently resolved, so we don't need this change anymore.

@praveenkumar
Member Author

> Compared to kubernetes/conformance, this removes 103 tests, and adds 718, so it's a very different suite.
> However, openshift/conformance only removes one test when compared to experimental/reliability/minimal (and adds 2065, but that makes sense).
> Is this why you'd favour experimental/reliability/minimal over kubernetes/conformance? The experimental naming is making me a bit reluctant with this.

@cfergeau I am considering 2 things when going with this suite.

  1. Coverage of OpenShift-specific test cases increases from 0 to something.
  2. The duration of a successful run is around 1 hour 30 mins, instead of 2-3 hours for the complete conformance test suite.

About the naming, I need to confirm with the OpenShift team, but the description says "Set of highly reliable tests.", which makes me a bit more confident about using it.

The 103 tests removed from kubernetes/conformance are a bit concerning :(

@praveenkumar
Member Author

/test e2e-snc

@cfergeau
Contributor

cfergeau commented Aug 2, 2021

> About the naming, I need to confirm with the OpenShift team, but the description says "Set of highly reliable tests.", which makes me a bit more confident about using it.

It's relatively recent, maybe that explains the experimental in the naming. A bit of digging gives a more detailed description/intent:
openshift/origin#25946

> Adds a list of tests that pass with extremely high reliability. We will run these tests in a new e2e job. The original list was generated by taking tests that have passed at a rate of 99% or higher over the last week, from sippy data.
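
To make the selection rule in that description concrete, here is a rough sketch of a "pass rate of 99% or higher" filter. It is not how openshift/origin or sippy actually generates the list: the input format (`<passes> <runs> <test name>` per line) and the `reliable_tests` helper are assumptions made purely for illustration.

```shell
# Hypothetical helper: print the names of tests whose pass rate meets a threshold.
# $1 = stats file with "<passes> <runs> <test name>" per line (assumed format)
# $2 = optional threshold as a fraction, defaulting to 0.99
reliable_tests() (
    threshold="${2:-0.99}"
    awk -v t="$threshold" '$2 > 0 && $1 / $2 >= t {
        # Drop the two numeric columns and print the remaining test name.
        sub(/^[^ ]+ +[^ ]+ +/, "")
        print
    }' "$1"
)
```

Tests with zero recorded runs are skipped rather than treated as reliable.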

Disappointing that we still need to disable ~20 tests from this list :(

> The 103 tests removed from kubernetes/conformance are a bit concerning :(

Most of these are re-added when moving from reliability/minimal to openshift/conformance, so this seems intentional; these are just the 'less reliable' ones.

> The duration of a successful run is around 1 hour 30 mins, instead of 2-3 hours for the complete conformance test suite.

But only the complete conformance test suite gives us some "this behaves as an OpenShift cluster" label, so we'll have to take these 2-3h at some point. Running a subset of these tests is better than nothing, but is not our end goal :)

@cfergeau
Contributor

cfergeau commented Aug 2, 2021

"Separate retry with pipe operations" and "Add retry for pv creation commands" are already tracked in #457; can they be split out from this PR?

@praveenkumar
Member Author

> Disappointing that we still need to disable ~20 tests from this list :(

  • Some of them are storage tests for features we don't support (volume snapshot stuff)
  • Metrics-specific test cases, because we don't enable monitoring
  • The oc must-gather one, which also fails because of volume snapshot issues (I really need to understand this one and create a BZ)
  • A DaemonSet-specific one, which really needs 2 nodes for testing (I am not sure why it isn't skipped)

> But only the complete conformance test suite gives us some "this behaves as an OpenShift cluster" label, so we'll have to take these 2-3h at some point. Running a subset of these tests is better than nothing, but is not our end goal :)

Yes, this is the end goal. I am hoping we can piggyback on the SNO team in the near future to try running the complete suite, because if they need a bump in CI time, we can directly benefit from it.

@praveenkumar
Member Author

@cfergeau I ran the tests on this PR 3 times without a single failure (https://prow.ci.openshift.org/job-history/gs/origin-ci-test/pr-logs/directory/pull-ci-code-ready-snc-master-e2e-snc), so I am a bit more confident about having it.

@praveenkumar
Member Author

/retest

@cfergeau
Contributor

cfergeau left a comment

We could go with this for now.

@openshift-ci

openshift-ci bot commented Sep 1, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cfergeau, praveenkumar

Needs approval from an approver in each of these files:
  • OWNERS [cfergeau,praveenkumar]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci

openshift-ci bot commented Sep 2, 2021

New changes are detected. LGTM label has been removed.

@openshift-ci openshift-ci bot removed the lgtm label Sep 2, 2021
@praveenkumar
Member Author

/retest

Rather than kubernetes/conformance, this attempts to run the
openshift reliability/minimal tests. It also adds some tests to the
ignore list which were failing during testing due to apiserver
unavailability or single-node constraints. The plan is to reduce this ignore
list and also to try adding the openshift conformance tests in the future.

```
experimental/reliability/minimal
  Set of highly reliable tests.
```
@praveenkumar
Member Author

/retest

@praveenkumar
Member Author

/hold

@openshift-ci

openshift-ci bot commented Jul 12, 2023

@praveenkumar: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-snc | 799dc0d | link | true | `/test e2e-snc` |
| ci/prow/e2e-microshift | 799dc0d | link | true | `/test e2e-microshift` |
