openQA users guide


Introduction

This document provides additional information on using the web interface and the REST API, as well as administration information. Administrators are recommended to read the Installation Guide first to understand the structure of the components as well as the configuration of an installed instance.

Using job templates to automate jobs creation

The problem

When testing an operating system, especially when doing continuous testing, there is always a certain combination of jobs, each one with its own settings, that needs to be run for every revision. Those combinations can be different for different 'flavors' of the same revision, like running a different set of jobs for each architecture or for the Full and the Lite versions. This combinational problem can go one step further if openQA is being used for different kinds of tests, like running some simple pre-integration tests for some snapshots combined with more comprehensive post-integration tests for release candidates.

This section describes how an instance of openQA can be configured using the options in the admin area to automatically create all the required jobs for each revision of your operating system that needs to be tested. If you are starting from scratch, you should probably go through the following order:

  1. Define machines in 'Machines' menu

  2. Define medium types (products) you have in 'Medium types' menu

  3. Specify various collections of tests you want to run in the 'Test suites' menu

  4. Define job groups in 'Job groups' menu for groups of tests

  5. Select individual 'Job groups' and decide what combinations make sense and need to be tested

Machines, mediums, test suites and job templates can all set various configuration variables. The so-called job templates within the job groups define how the test suites, mediums and machines should be combined to produce individual 'jobs'. All the variables from the test suite, medium, machine and job template are combined and made available to the actual test code run by the 'job', along with variables specified as part of the job creation request. Certain variables also influence openQA’s and/or os-autoinst’s own behavior in terms of how the environment for the job is configured. Variables that influence os-autoinst’s behavior are documented in the file doc/backend_vars.asciidoc in the os-autoinst repository.

In openQA we can parameterize a test to describe for what product it will run and for what kind of machines it will be executed. For example, a test suite kde can be run for any product that has the KDE software stack installed, like openSUSE-DVD-x86_64 and openSUSE-NET-i586, and can be tested in different x86-64 and i586 machines like 64bit, 64bit_USBBoot, 32bit. In this example we could have the following test scenarios considering that the “x86_64” flavor is not compatible with the 32bit machine:

  • openSUSE-DVD-x86_64-kde-64bit

  • openSUSE-DVD-x86_64-kde-64bit_USBBoot

  • openSUSE-NET-i586-kde-64bit

  • openSUSE-NET-i586-kde-64bit_USBBoot

  • openSUSE-NET-i586-kde-32bit

For every test scenario we need to configure a different instance of the test backend, for example os-autoinst, with a different set of parameters.
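The scenario list above can be reproduced with a short sketch. The compatibility rule is an assumption made only for illustration (it encodes nothing more than "x86_64 media do not run on the 32bit machine"); in openQA the valid combinations come from the job templates, not from code like this.

```python
from itertools import product

# Media and machines taken from the example above.
media = ["openSUSE-DVD-x86_64", "openSUSE-NET-i586"]
machines = ["64bit", "64bit_USBBoot", "32bit"]


def compatible(medium: str, machine: str) -> bool:
    # Hypothetical rule: the architecture is the last dash-separated
    # part of the medium name, and x86_64 cannot run on "32bit".
    arch = medium.rsplit("-", 1)[-1]
    return not (arch == "x86_64" and machine == "32bit")


def scenarios(test_suite: str) -> list[str]:
    # Combine every medium with every machine and keep the valid pairs.
    return [
        f"{medium}-{test_suite}-{machine}"
        for medium, machine in product(media, machines)
        if compatible(medium, machine)
    ]


for name in scenarios("kde"):
    print(name)
```

Running it prints the five scenario names listed above, in the same order.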

Machines

You need to have at least one machine set up to be able to run any tests. Those machines represent virtual machine types that you want to test. To make tests actually happen, you have to have an 'openQA worker' connected that can fulfill those specifications.

  • Name. User defined string - only needed for operator to identify the machine configuration.

  • Backend. What backend should be used for this machine. Recommended value is qemu as it is the most tested one, but other options (such as kvm2usb or vbox) are also possible.

  • Variables Most machine variables influence os-autoinst’s behavior in terms of how the test machine is set up. A few important examples:

    • QEMUCPU can be 'qemu32' or 'qemu64' and specifies the architecture of the virtual CPU.

    • QEMUCPUS is an integer that specifies the number of cores you wish for.

    • LAPTOP if set to 1, QEMU will create a laptop profile.

    • USBBOOT when set to 1, the image will be loaded through an emulated USB stick.

Medium Types (products)

A medium type (product) in openQA is a simple description without any concrete meaning. It basically consists of a name and a set of variables that define or characterize this product in os-autoinst.

Some example variables used by openSUSE are:

  • ISO_MAXSIZE contains the maximum size of the product. There is a test that checks that the current size of the product is less than or equal to this variable.

  • DVD if it is set to 1, this indicates that the medium is a DVD.

  • LIVECD if it is set to 1, this indicates that the medium is a live image (can be a CD or USB)

  • GNOME this variable, if it is set to 1, indicates that it is a GNOME only distribution.

  • PROMO marks the promotional product.

  • RESCUECD is set to 1 for rescue CD images.

Test Suites

A test suite consists of a name and a set of test variables that are used inside this particular test together with an optional description. The test variables can be used to parameterize the actual test code and influence the behaviour according to the settings.

Some sample variables used by openSUSE are:

  • BTRFS if set, the file system will be BtrFS.

  • DESKTOP possible values are 'kde', 'gnome', 'lxde', 'xfce' or 'textmode'. Used to indicate the desktop selected by the user during the test.

  • DOCRUN used for documentation tests.

  • DUALBOOT dual boot testing, needs HDD_1 and HDDVERSION.

  • ENCRYPT encrypt the home directory via YaST.

  • HDDVERSION used together with HDD_1 to set the operating system previously installed on the hard disk.

  • INSTALLONLY only basic installation.

  • INSTLANG installation language. Actually used only in documentation tests.

  • LIVETEST the test is on a live medium, do not install the distribution.

  • LVM select LVM volume manager.

  • NICEVIDEO used for rendering a result video for use in show rooms, skipping ugly and boring tests.

  • NOAUTOLOGIN unmark autologin in YaST

  • NUMDISKS total number of disks in QEMU.

  • REBOOTAFTERINSTALL if set to 1, will reboot after the installation.

  • SCREENSHOTINTERVAL used with NICEVIDEO to improve the video quality.

  • SPLITUSR a YaST configuration option.

  • TOGGLEHOME a YaST configuration option.

  • UPGRADE upgrade testing, needs HDD_1 and HDDVERSION.

  • VIDEOMODE if the value is 'text', the installation will be done in text mode.

Some of the variables usually set in test suites that influence openQA and/or os-autoinst’s own behavior are:

  • HDDMODEL variable to set the HDD hardware model

  • HDDSIZEGB hard disk size in GB. Used together with the BTRFS variable

  • HDD_1 path for the pre-created hard disk

  • RAIDLEVEL RAID configuration variable

  • QEMUVGA parameter to declare the video hardware configuration in QEMU

Job Groups

The job groups are the place where the actual test scenarios are defined by the selection of the medium type, the test suite and machine together with a priority value.

The priority value is used by the scheduler to choose the next job. If multiple jobs are scheduled and their requirements for running are fulfilled, the ones with a lower priority value are triggered first. The id is the second sorting key: of two jobs with equal requirements and the same priority value, the one with the lower id is triggered first.
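The ordering rule can be sketched as a sort on the pair (priority value, id). This is only an illustration for jobs whose requirements are already fulfilled; the `Job` class and the sample values are assumptions, and the real scheduler takes more into account (for example available workers).

```python
from dataclasses import dataclass


@dataclass
class Job:
    id: int
    priority: int
    name: str


def scheduling_order(jobs):
    # Lower priority value first; the lower job id breaks ties.
    return sorted(jobs, key=lambda job: (job.priority, job.id))


jobs = [
    Job(id=12, priority=50, name="textmode"),
    Job(id=10, priority=70, name="gnome"),
    Job(id=11, priority=50, name="kde"),
]
print([job.name for job in scheduling_order(jobs)])
```

Here `kde` (priority 50, id 11) comes before `textmode` (priority 50, id 12), and `gnome` comes last despite having the lowest id, because its priority value is higher.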

Job groups themselves can be created over the web UI as well as the REST API. Job groups can optionally be nested into categories. The display order of job groups and categories can be configured by drag-and-drop in the web UI.

The scenario definitions within the job groups can be created and configured by different means:

  • A simple web UI wizard which is automatically shown for job groups when a new medium is added to the job group.

  • An intuitive table within the web UI for adding additional test scenarios to existing media including the possibility to configure the priority values.

  • The scripts openqa-load-templates and openqa-dump-templates to quickly dump and load the configuration from custom plain-text dump format files using the REST API.

  • Using declarative schedule definitions in the YAML format using REST API routes or an online-editor within the web UI including a syntax checker.

Variable expansion

Any variable defined in Test Suite, Machine, Product or Job Template table can refer to another variable using this syntax: %NAME%. When the test job is created, the string will be substituted with the value of the specified variable at that time.

For example this variable defined for Test Suite:

PUBLISH_HDD_1 = %DISTRI%-%VERSION%-%ARCH%-%DESKTOP%.qcow2

may be expanded to this job variable:

PUBLISH_HDD_1 = opensuse-13.1-i586-kde.qcow2
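The substitution can be pictured with a few lines of Python. This is a minimal sketch, not openQA's implementation; in particular, leaving unknown names untouched is an assumption of the sketch.

```python
import re


def expand(value: str, settings: dict[str, str]) -> str:
    # Replace each %NAME% with the value of NAME. Names that are not
    # defined are left as they are (assumption of this sketch).
    return re.sub(
        r"%(\w+)%",
        lambda match: settings.get(match.group(1), match.group(0)),
        value,
    )


settings = {
    "DISTRI": "opensuse",
    "VERSION": "13.1",
    "ARCH": "i586",
    "DESKTOP": "kde",
}
print(expand("%DISTRI%-%VERSION%-%ARCH%-%DESKTOP%.qcow2", settings))
```

With the settings shown, the call prints `opensuse-13.1-i586-kde.qcow2`, matching the example above.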

Variable precedence

It’s possible to define the same variable in multiple places that would all be used for a single job - for instance, you may have a variable defined in both a test suite and a product that appear in the same job template. The precedence order for variables is as follows (from lowest to highest):

  • Product

  • Machine

  • Test suite

  • Job template

  • API POST query parameters

That is, variable values set as part of the API request that triggers the jobs will 'win' over values set at any of the other locations. In the special case of the BACKEND variable, if there is a MACHINE specified, the BACKEND value for this machine defined in openQA has highest precedence.

If you need to override this precedence - for example, you want the value set in one particular test suite to take precedence over a setting of the same value from the API request - you can add a leading + to the variable name. For instance, if you set +VARIABLE = foo in a test suite, and passed VARIABLE=bar in the API request, the test suite setting would 'win' and the value would be foo.

If the same variable is set with a + prefix in multiple places, the same precedence order described above will apply to those settings.

Note that the WORKER_CLASS variable is not overridden in the way described above. Instead multiple occurrences are combined.
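The precedence rules, including the + prefix, can be sketched as two passes over the sources. The function name `merge_settings` is hypothetical, and the sketch deliberately leaves out the special handling of BACKEND and WORKER_CLASS described above.

```python
def merge_settings(product, machine, test_suite, job_template, api_params):
    # Sources ordered from lowest to highest precedence.
    sources = [product, machine, test_suite, job_template, api_params]
    plain, forced = {}, {}
    for source in sources:
        for key, value in source.items():
            if key.startswith("+"):
                # Among "+" settings, later sources win.
                forced[key[1:]] = value
            else:
                # Among plain settings, later sources win.
                plain[key] = value
    # Any "+" setting beats every plain setting of the same name.
    return {**plain, **forced}


result = merge_settings(
    product={"DESKTOP": "gnome"},
    machine={},
    test_suite={"+VARIABLE": "foo", "DESKTOP": "kde"},
    job_template={},
    api_params={"VARIABLE": "bar"},
)
print(result)
```

In this example `DESKTOP` ends up as `kde` because the test suite outranks the product, and `VARIABLE` ends up as `foo` because the + prefix in the test suite wins over the API request.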

Use of the web interface

In general the web UI should be intuitive or self-explanatory. Look out for the little blue help icons and click them for detailed help on specific sections.

Some pages use queries to select what should be shown. The query parameters are generated on clickable links, for example starting from the index page or the group overview page clicking on single builds. On the query pages there can be UI elements to control the parameters, for example to look for more older builds or only show failed jobs or other settings. Additionally, the query parameters can be tweaked by hand if you want to provide a link to specific views.

Description of test suites

Test suites can be described using API commands or the admin table for any operator using the web UI.

test suite description edit field
Figure 1. Entering a test suite description in the admin table using the web interface:

If a description is defined, the name of the test suite on the tests overview page shows up as a link. Clicking the link will show the description in a popup. The same syntax as for comments can be used, that is Markdown with custom extensions such as shortened links to ticket systems.

test suite description popup
Figure 2. popover in test overview with content as configured in the test suites database:

/tests/overview - Customizable test overview page

The overview page is configurable by the filter box. Also, some additional query parameters can be provided which can be considered advanced or experimental. For example specifying no build will resolve the latest build which matches the other parameters specified. Specifying no group will show all jobs from all matching job groups. Also specifying multiple groups works, see the following example.

test overview page showing multiple groups
Figure 3. The openQA test overview page showing multiple groups at once. The URL query parameters specify the groupid parameter two times to resolve both the "opensuse" and "opensuse test" group.

Specifying multiple groups with no build will yield the result for the latest build of each group. This can be useful to have a static URL for bookmarking.
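Such a bookmarkable URL with a repeated groupid parameter can be built with the Python standard library. The host name `http://openqa` follows the other examples in this guide, and the group ids 1 and 2 are placeholders; substitute the ids of your own job groups.

```python
from urllib.parse import urlencode


def overview_url(base, group_ids, **params):
    # With doseq=True a list value becomes a repeated query parameter.
    query = urlencode({"groupid": group_ids, **params}, doseq=True)
    return f"{base}/tests/overview?{query}"


# No build is given, so the latest build of each group is resolved.
print(overview_url("http://openqa", [1, 2]))
```

The call prints `http://openqa/tests/overview?groupid=1&groupid=2`.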

Review badges

Based on comments in the individual job results for each build a certificate icon is shown on the group overview page as well as the index page to indicate that every failure has been reviewed, e.g. a bug reference or a test issue reason is stated:

Review badges

Meaning of the different colors

  • No icon is shown if at least one failure still needs to be reviewed.

  • The green tick icon shows up when there is no work to be done.

  • The black certificate icon is shown if all review work has been done.

  • The grey comment icon is shown if all failures have at least one comment.

(To simplify, checking for false-negatives is not considered here.)

Bug references, labels and flags

Bug references

It is possible to reference a bug by writing <bugtracker_shortname>#<bug_nr> in a comment, e.g. bsc#1234. It is also possible to spell out the full URL, e.g. https://bugzilla.suse.com/show_bug.cgi?id=1234 which will then be shortened automatically. A bug reference is rendered as link and a bug icon is displayed for the job in various places as shown in the figure below. A comment containing a bug reference will also be carried over to reduce manual review work. Refer to the Flags section below for other ways to trigger automated comment carryover.

Warning
If you want to reference a bug without making it count as a bug reference you need to wrap it into a label (see subsequent section), e.g. label:bsc#1234.
Bug icon on test result overview
Figure 4. Bug icon for job with bug reference on test result overview

All bug references are stored within the internal database of openQA. The status can be updated using the /bugs API route with external tools. One can set the bug status this way which will then be shown in the web UI, see the figure below.

Example for visualization of closed issues
Figure 5. Example for visualization of closed issues: The upside down icons in red visualize closed issues.
Note
Also GitHub pull requests and issues can be linked. Use the generic format <marker>[<project/repo>]<id>, e.g. gh#os-autoinst/openQA#1234.

Labels

A comment can also contain labels. Use label:<keyword> where <keyword> can be any valid character up to the next whitespace, e.g. false_positive. A label is rendered as yellow box. The keywords are not defined within openQA itself. A valid list of keywords should be decided upon within each project or environment of one openQA instance. If a job has a label a special icon will be shown next to it in various places as shown in the figure below.

Label icon on test result overview
Figure 6. Label icon for job with a label on test result overview
Note
A label containing a bug reference will still be treated as a label, not a bugref. The bugref will still be rendered as a link. That means no bug icon is shown and the comment does not become subject to carry over.
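The distinction between bug references and labels can be illustrated with a small classifier. The regular expression is an assumption that only covers the short forms shown above (`bsc#1234`, `gh#os-autoinst/openQA#1234`); openQA's own parsing also handles full URLs and its configured list of bug trackers, which this sketch does not model.

```python
import re

# Assumed short form: tracker name, "#", optional "project/repo#", number.
BUGREF = re.compile(r"^[a-z]+#(?:[\w.-]+/[\w.-]+#)?\d+$")


def classify(comment):
    bugrefs, labels = [], []
    for token in comment.split():
        if token.startswith("label:"):
            # Anything wrapped in a label counts as a label only.
            labels.append(token[len("label:"):])
        elif BUGREF.match(token):
            bugrefs.append(token)
    return bugrefs, labels


print(classify("fails due to bsc#1234 label:false_positive"))
print(classify("label:bsc#1234"))
```

The second call returns no bug reference, only the label `bsc#1234`, so the comment would not be a candidate for carry over on its own.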
Overwrite result of job

One special label format allows forcefully overwriting the result of an openQA job with an openQA comment. The expected format is label:force_result:<new_result>[:<description>], for example label:force_result:failed or label:force_result:softfailed:bsc#1234. For this command to be effective, the user writing the comment needs to have at least operator permissions.

Note
force_result-labels are evaluated when a comment is carried over. However, the carry over will only happen when the comment also contains a bug reference or flag:carryover.

Flags

Currently there is only one flag for job comments supported.

flag:carryover

Adding flag:carryover to a comment will result in this comment being carried over to a new job failing for the same reason, without a bugref required.

Distinguish product and test issues bugref gh#708

At least for SUSE/openSUSE, “progress.opensuse.org” is used to track test issues and Bugzilla is used for product issues. openQA bugrefs distinguish between the two and show corresponding icons.

Different icons for product and test issues

Build tagging

Tag builds with special comments on group overview

Based on comments on the group overview, individual builds can be tagged. As builds by themselves do not own any data, the job group is used to store this information. A tag has a build id to link it to a build. It also has a type and an optional description. The type can later on be used to distinguish tag types. Note that openQA does not define further tag types besides the important tag. However, the user is free to choose any tag type as needed.

The generic format for tags is

tag:<build_id>:<type>[:<description>], e.g. tag:1234:important:Beta1.

The build_id should be set to the BUILD setting of the jobs (without the Build-prefix shown in dashboard pages). It is also possible to include the VERSION setting, which then needs to be prepended and separated by a dash (e.g. tag:15-SP5-25.1:important:Alpha-202210-1 where 15-SP5 is the VERSION and 25.1 the BUILD).

The more recent tag always wins. Tags specifying the VERSION as well win over generic tags.
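The generic tag format can be taken apart as follows. This is a sketch under two assumptions: the tag is a whitespace-delimited token within the comment, and the description contains no whitespace. The function name `parse_tag` is hypothetical.

```python
def parse_tag(comment):
    # Expected token: tag:<build_id>:<type>[:<description>]
    for token in comment.split():
        if not token.startswith("tag:"):
            continue
        parts = token.split(":", 3)
        if len(parts) < 3:
            continue  # build id or type is missing
        return {
            "build": parts[1],
            "type": parts[2],
            "description": parts[3] if len(parts) == 4 else None,
        }
    return None


print(parse_tag("tag:1234:important:Beta1"))
print(parse_tag("tag:15-SP5-25.1:important:Alpha-202210-1"))
```

Note that the sketch does not try to split a combined value such as `15-SP5-25.1` into VERSION and BUILD, since both parts may themselves contain dashes.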

A 'tag' icon is shown next to tagged builds together with the description on the group_overview page. The index page does not show tags by default to prevent a potential performance regression. Tags can be enabled on the index page using the corresponding option in the filter form at the bottom of the page.

Example of a tag comment and corresponding tagged build

Keeping important builds

As builds can be tagged, the convention is that the 'important' type, the only one for now, is used to tag every job that corresponds to a build as 'important' and to keep the logs for these jobs longer. This way the attached data can always be referred to, e.g. for milestone builds, final releases, jobs for which long-lasting bug reports exist, etc.

Filtering test results and builds

At the top of the test results overview page is a form which allows filtering tests by result, architecture and TODO-status. "TODO" means that tests still require review.

Filter form

There is also a similar form at the bottom of the index page which allows filtering builds by group and customizing the limits. Also the 'All tests' table allows filtering by the TODO-status.

Highlighting job dependencies in 'All tests' table

When hovering over the branch icon after the test name children of the job will be highlighted blue and parents red. So far this only works for jobs displayed on the same page of the table.

highlighted child jobs

Show previous results in test results page gh#538

On a test result page there is a tab for “Next & previous results” showing the results of test runs in the same scenario. This shows next and previous builds as well as test runs in the same build. This way you can easily check and compare earlier results, including any comments, labels and bug references (see next section). This helps to answer questions like “Is this a new issue?”, “Is it reproducible?”, “Has it been seen before?” and “What does the history look like?”.

Querying the database for former test runs of the same scenario is a rather costly operation which we do not want to do for multiple test results at once but only for each individual test result (1:1 relation). This is why this is done in each individual test result and not for a complete build.

Related issue: #10212

Screenshot of the feature:

Next and previous job results

The latest job in a scenario can always be found via the link after the scenario name in the tab “Next & previous results”. Screenshot:

Link to latest in scenario

Add `latest' query route gh#815

The latest route always refers to the most recent job for the specified scenario. This is useful in two cases:

  • During test development, a retriggered test gets a new URL which has to be updated every time. With a static URL, the browser can even be instructed to reload the page automatically.

  • For linking to the most recent job within one scenario, e.g. to respond faster to the standard question in bug reports: “does this bug still happen?”

Examples:

  • tests/latest?distri=opensuse&version=13.1&flavor=DVD&arch=x86_64&test=kde&machine=64bit

  • tests/latest?flavor=DVD&arch=x86_64&test=kde

  • tests/latest?test=foobar - this searches for the most recent job using test_suite `foobar' covering all distri, version, flavor, arch, machines. To be more specific, add the other query entries.
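The example links above can be generated from whichever scenario keys are known. The helper name `latest_url` and the host `http://openqa` are placeholders; only the query keys are taken from the examples.

```python
from urllib.parse import urlencode

# Query keys as they appear in the examples above.
SCENARIO_KEYS = ("distri", "version", "flavor", "arch", "test", "machine")


def latest_url(base, **scenario):
    unknown = set(scenario) - set(SCENARIO_KEYS)
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(unknown)}")
    # Keep only the keys that were given, in a stable order.
    query = urlencode({key: scenario[key] for key in SCENARIO_KEYS if key in scenario})
    return f"{base}/tests/latest?{query}"


print(latest_url("http://openqa", flavor="DVD", arch="x86_64", test="kde"))
```

Leaving out keys widens the search, as described for `test=foobar` above; adding more keys narrows it down to one scenario.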

Allow group overview query by result gh#531

This allows showing only failed builds, for example. It could be included like in http://lists.opensuse.org/opensuse-factory/2016-02/msg00018.html for “known defects”.

Example: Add query parameters like …&result=failed&arch=x86_64 to show only failed for the single architecture selected.

Add web UI controls to select more builds in group_overview gh#804

The query parameter `limit_builds' allows showing more than the default 10 builds on demand. Just like for configuring previous results, web UI selections allow reloading the same page with a higher number of builds on demand. For this, the limit of days is increased to show more builds, but the result is still limited by the selected number.

Example screenshot:

Select different limit for number of displayed builds

More query parameters for configuring last builds gh#575

By using advanced query parameters in the URLs you can configure the search for builds. Higher numbers would yield more complex database queries but can be selected for special investigation use cases with the advanced query parameters, e.g. if one wants to get an overview of a longer history. This applies to both the index dashboard and group overview page.

Example to show up to three week old builds instead of the default two weeks with up to 20 builds instead of up to 10 being the default for the group overview page:

http://openqa/group_overview/1?time_limit_days=21&limit_builds=20

Web UI controls to filter only tagged or all builds gh#807

Using a new query parameter `only_tagged=[0|1]' the list can be filtered, e.g. show only tagged (important) builds.

Example screenshot:

Show only tagged or all builds

Related issue: #11052

Test result badges gh#5022

For each job result, including the latest job result page, there is a corresponding route to get an SVG status badge that can e.g. be used to build a status dashboard or to show the status within a GitHub comment.

Test result badges
http://openqa/tests/123/badge
http://openqa/tests/latest/badge

There is an optional parameter 'show_build=1' that will prefix the status with the build number.

Carry over of bug references from previous jobs in same scenario

Many test failures within the same scenario might be due to the same reason. To avoid human reviewers having to add the same bug references again and again, bug references are carried over from previous failures in the same scenario if a job fails. The same behaviour can be achieved by adding flag:carryover to a comment. This idea is inspired by the Claim plugin for Jenkins.

Note
The carry-over feature works on test module level. Only if the same set of test modules as in a predecessor job fails is the latest bug reference carried over.
Note
The lookup-depth is limited. The search for candidates will also stop early if too many different kinds of failures were seen. Check out the descriptions of the relevant settings in the carry_over section of openqa.ini for details.
Note
For an approach to add bug references based on a search expression found in the job reason for incomplete jobs or in job logs, consider the section "Enable custom hook scripts on 'job done' based on result".
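The module-level matching described in the notes above can be sketched like this. The data layout, the function name `find_carryover` and the default depth of 10 are assumptions for illustration only; the actual limits come from the carry_over section of openqa.ini, and the early stop on too many different failures is not modeled.

```python
def find_carryover(failed_modules, previous_jobs, depth=10):
    # previous_jobs: newest first, each a pair of
    # (set of failed module names, bug reference or None).
    current = set(failed_modules)
    for failed, bugref in previous_jobs[:depth]:
        # Carry over only if exactly the same modules failed.
        if bugref and set(failed) == current:
            return bugref
    return None


previous = [
    ({"boot"}, None),
    ({"boot", "login"}, "bsc#1"),
    ({"login"}, "bsc#1234"),
]
print(find_carryover(["login"], previous))
```

The job in which both `boot` and `login` failed is skipped, because its set of failed modules differs from the current one.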

Pinning comments as group description

This is possible by adding the keyword pinned-description anywhere in a comment on the group overview page. Then the comment will be shown at the top of the group overview page. However, it only works as operator or admin.

Dark mode

A dark mode theme can be enabled via "Appearance" settings for all logged in users. It can either be forced with the "dark mode" setting, or left to browser detection. Switching automatically between light and dark mode is natively supported by most modern browsers and can also be controlled manually via flags:

  • On Firefox, go to about:preferences#general and search for "Website appearance".

  • On Chrome, go to chrome://flags/ and search for "Dark mode".

Developer mode

The developer mode allows you to:

  • Create or update needles from assert_screen mismatches ("re-needling")

  • Pause the test execution (at a certain module) for manual investigation of the SUT

It can be accessed via the "Live View" tab of a running test. Only registered users can take control over tests. Basic instructions and buttons providing further information about the different options are already contained on the web page itself. That information is not repeated here; this section explains the overall workflow instead.

In case the developer mode is not working on your instance, try to follow the steps for debugging the developer mode under 'Pitfalls'.

Workflow for creating or updating needles

  1. In case new needles should be created, add the corresponding assert_screen calls to your test.

  2. Start the test with the assert_screen calls which are supposed to fail.

  3. Select "assert_screen timeout" under "Pause on screen mismatch" and confirm.

  4. Wait until the test has paused. There is a button to skip the current timeout to speed this up.

  5. A button for accessing the needle editor should appear. It may take a few seconds until it appears because the screenshots created so far need to be uploaded from the worker to the web UI. Of course it is also possible to go back to the "Details" tab to create a new needle from any previous screenshot/match available.

  6. After creating the new needle, click the resume button to test whether it worked.

Steps 4. to 6. can be repeated for further needles without restarting the test.

Job group editor gh#2111

Scenarios are defined as part of a job group. The Edit job group button exposes the editor.

YAML job templates editor

Settings can be specified as a key/value pair for each scenario. There is no equivalent in the table view so you need to migrate groups to use this feature.

Any settings specified on test suites, machines or products are also used and can still be modified independently. However, the YAML document should be updated before renaming or deleting test suites, products or machines used by it, otherwise that would create an inconsistent state.

Job groups can be updated through the YAML editor or the YAML-related REST API routes.

Deprecated: Table-based (pre-migration)

In old versions openQA had a table-based UI for defining job templates, listed in a table per medium. Machines can be added by selecting the architecture column and picking a machine from the list. Remove scenarios by removing all of their machines. Add new scenarios via the blue Plus icon at the top of the table. Changes to the priority are applied immediately.

If job groups still exist showing the old mode, the Edit YAML button can be used to reveal the YAML editor and migrate a group. After saving for the first time, the group can only be configured in YAML. The table view will not be shown anymore.

Note that making a backup before migrating groups may be a good idea, for example using openqa-dump-templates.

To migrate an old job group using the API the current schedule can be retrieved in YAML format and sent back to save as a complete YAML document. For example for all job groups in the old format:

# List the ids of all job groups which have no YAML template yet (old format).
for i in $(ssh openqa.example.com "sudo -u geekotest psql --no-align --tuples-only --command=\"select id from job_groups where template is null order by id;\" openqa") ; do
    # Fetch the current schedule as YAML and post it back as the template of the group.
    curl -s http://openqa.example.com/api/v1/job_templates_scheduling/$i | openqa-cli api --host http://openqa.example.com -X POST job_templates_scheduling/$i schema=JobTemplates-01.yaml template="$(cat -)"
done

Note that in some cases you might run into errors where old test suites or products have invalid names which the old editor did not enforce:

Product names may not contain : or @ characters. Something like Server-DVD-Staging:A would require replacing the : with e.g. a -.

Test suite names may not contain : or @ characters. A test suite such as ext4_uefi@staging would have been allowed previously. The use of the @ as a suffix could be replaced with a - or, if it is used for variants of the same test suite with different settings, the settings can be specified in YAML directly.

More generally the regular expression [A-Za-z0-9._*-]+ could be used to check if a name is allowed for a product or test suite.
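A quick way to check names before migrating is to apply that regular expression to the whole name. The helper names are hypothetical, and replacing : and @ with - is just the substitution suggested above, not something openQA does for you.

```python
import re

# The expression from the text, applied to the complete name.
NAME = re.compile(r"[A-Za-z0-9._*-]+")


def is_valid_name(name):
    return NAME.fullmatch(name) is not None


def suggest_name(name):
    # Replace the two characters named above with a dash.
    return re.sub(r"[:@]", "-", name)


for name in ["Server-DVD-Staging:A", "ext4_uefi@staging", "textmode"]:
    print(name, is_valid_name(name), suggest_name(name))
```

Both problematic examples from the text are rejected, and their suggested replacements `Server-DVD-Staging-A` and `ext4_uefi-staging` pass the check.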

Configuring job groups via YAML documents

A new job group starts out empty, which in YAML means that the two mandatory sections are present but contain nothing. This is what can be seen when editing a completely new group, and it is also the state to revert to before deleting a job group that is no longer useful:

products: {}
scenarios: {}

A job group is comprised of up to three main sections. products defines one or more mediums to run the scenarios in the group. At least one needs to be specified to be able to run tests. Going by an example of openSUSE 15.1 the name, distri, flavor and version could be written like so. Note that the version is a string in single quotes.

products:
  opensuse-15.1-DVD-Updates-x86_64:
    distri: opensuse
    flavor: DVD-Updates
    version: '15.1'

To complete the job group at least one scenario has to be added. A scenario is a combination of a test suite, a machine and an architecture. Scenarios must also be unique across job groups - trying to add the same scenario to multiple job groups is an error. Case in point, textmode and gnome could be defined like so:

scenarios:
  x86_64:
    opensuse-15.1-DVD-Updates-x86_64:
    - textmode
    - gnome:
      machine: uefi
      priority: 70
      settings:
        QEMUVGA: cirrus

Defaults

Now there are two scenarios for x86_64, one by giving just the name of the test suite and another which has a machine, priority and settings. Both are allowed. However since at least one scenario relies on defaults those need to be specified once in their own section:

defaults:
  x86_64:
    machine: 64bit
    priority: 50

The defaults section is only required whenever a scenario is not completely defined in-place. When it is used, the available parameters are identical to those for a single scenario. For instance the example could be amended to use settings and run every test suite for that architecture on several machines by default.

defaults:
  x86_64:
    machine: [64bit, 32bit]
    priority: 50
    settings:
      FOO: '1'

Defaults are always overwritten by explicit parameters on scenarios. Furthermore, all settings can be specified in YAML. Using this together with custom job template names, variants of a scenario can even be specified when they would normally be considered duplicates:

scenarios:
  x86_64:
    opensuse-15.1-DVD-Updates-x86_64:
    - textmode
    - gnome:
      machine: uefi
      priority: 70
      settings:
        QEMUVGA: cirrus
    - gnome_staging:
      testsuite: gnome
      machine: [32bit, 64bit-staging]
      settings:
        FOO: '2'
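How defaults and explicit parameters interact can be sketched with plain Python data structures mirroring the YAML above. This is an illustration, not openQA's implementation: the function name `expand_scenario` is hypothetical, the defaults are those of the first defaults example, and settings are left out on purpose because the text does not spell out how default and scenario settings are combined.

```python
def expand_scenario(entry, defaults):
    # An entry is either a bare test suite name or {name: {parameters}}.
    if isinstance(entry, str):
        name, params = entry, {}
    else:
        ((name, params),) = entry.items()
    # Explicit parameters overwrite the defaults.
    merged = {**defaults, **(params or {})}
    machines = merged["machine"]
    if not isinstance(machines, list):
        machines = [machines]
    # One job template per machine.
    return [
        {
            "name": name,
            "testsuite": merged.get("testsuite", name),
            "machine": machine,
            "priority": merged["priority"],
        }
        for machine in machines
    ]


defaults = {"machine": "64bit", "priority": 50}
entries = [
    "textmode",
    {"gnome": {"machine": "uefi", "priority": 70}},
    {"gnome_staging": {"testsuite": "gnome", "machine": ["32bit", "64bit-staging"]}},
]
for entry in entries:
    for job in expand_scenario(entry, defaults):
        print(job)
```

The `gnome_staging` entry yields two job templates that run the `gnome` test suite, both with the default priority of 50.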

YAML Aliases

Even more flexibility can be achieved by using aliases in YAML, or in other words re-using a scenario by reference, such as to run the same scenarios in two different mediums. & is used to define an anchor, while * is the alias referencing the anchor:

products:
  opensuse-15.1-DVD-Updates-x86_64:
    distri: opensuse
    flavor: DVD-Updates
    version: '15.1'
  opensuse-15.2-GNOME-Live-x86_64:
    distri: opensuse
    flavor: GNOME-Live
    version: '15.2'
scenarios:
  x86_64:
    opensuse-15.1-DVD-Updates-x86_64:
    - textmode
    - gnome: &gnome
      machine: uefi
      priority: 70
      settings:
        QEMUVGA: cirrus
    - gnome_staging: &gnome_staging
      testsuite: gnome
      machine: [32bit, 64bit-staging]
      settings:
        FOO: '2'
    opensuse-15.2-GNOME-Live-x86_64:
    - textmode
    - gnome: *gnome
    - gnome_staging: *gnome_staging

YAML Merge Keys

YAML Merge Keys are supported as well. They let you reuse previously defined anchors and add further values to them. Keys that are set explicitly override the values coming from the merged alias.

You can even merge more than one alias.

products:
  opensuse-15.1-DVD-Updates-x86_64:
    distri: opensuse
    flavor: DVD-Updates
    version: '15.1'
  opensuse-15.2-GNOME-Live-x86_64:
    distri: opensuse
    flavor: GNOME-Live
    version: '15.2'
scenarios:
  x86_64:
    opensuse-15.1-DVD-Updates-x86_64:
    - textmode
    - gnome:
      machine: uefi
      priority: 70
      settings: &common1
        QEMUVGA: cirrus
        FOO: default foo
    - gnome:
      machine: [32bit, 64bit-staging]
      priority: 70
      settings: &common2
        QEMUVGA: cirrus
        FOO: default foo
        BAR: default bar
    - gnome_staging:
      testsuite: gnome
      machine: [32bit, 64bit-staging]
      settings:
        # Merge
        <<: *common1
        FOO: foo # overrides the value from the merge keys
    - gnome_staging:
      testsuite: gnome
      machine: [32bit, 64bit-staging]
      settings:
        # Merge
        <<: [*common1, *common2] # *common1 overrides *common2
        FOO: foo # overrides the value from the merge keys
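
The precedence rules of merge keys can be reproduced with plain dictionaries. The following Python sketch works on already-parsed mappings and does not depend on a YAML library; the helper name apply_merge_key is hypothetical.

```python
def apply_merge_key(mapping):
    """Resolve a '<<' merge key in an already-parsed mapping."""
    sources = mapping.get("<<", [])
    if isinstance(sources, dict):
        sources = [sources]
    result = {}
    # Later sources are applied first so that earlier ones win.
    for source in reversed(sources):
        result.update(source)
    # Keys written explicitly next to '<<' override all merged values.
    result.update({k: v for k, v in mapping.items() if k != "<<"})
    return result

common1 = {"QEMUVGA": "cirrus", "FOO": "default foo"}
common2 = {"QEMUVGA": "cirrus", "FOO": "default foo", "BAR": "default bar"}
settings = apply_merge_key({"<<": [common1, common2], "FOO": "foo"})
print(settings)
```

This mirrors the last gnome_staging entry above: BAR is taken from common2, QEMUVGA from common1, and FOO from the explicit key.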

General YAML documentation

The job templates are written in YAML 1.2. In YAML, strings usually do not have to be quoted unless they are special values that would otherwise be loaded as a Boolean, null or number. The following table shows all special values (see the documentation for the default YAML 1.2 Core Schema for more information).

Type Special Values

bool

true | True | TRUE | false | False | FALSE

int (Base 8)

0o7, 0o10, 0o755

Regular Expression: 0o [0-7]+

int (Base 10)

23, 42, 0123, -314

Regular Expression: [-+]? [0-9]+

int (Base 16)

0xFF, 0xa, 0xc0ffee

Regular Expression: 0x [0-9a-fA-F]+

float (Number)

3.14, +3.14, -3.14, 3.3e+3, 3.3e3, .14, 001.23, .3E-1, 3e3

Regular Expression: [-+]? ( \. [0-9]+ | [0-9]+ ( \. [0-9]* )? ) ( [eE] [-+]? [0-9]+ )?

float (Infinity)

.inf, +.inf, -.inf, .Inf etc.

Regular Expression: [-+]? \. ( inf | Inf | INF )

float (Not a number)

.nan, .NaN, .NAN

Regular Expression: \. ( nan | NaN | NAN )

null

null | Null | NULL | ~ | # empty

str

everything else

Because the Merge Keys feature is used, the unquoted string << is special as well. If you need the literal string << (for example as a value in the job settings), you have to quote it.
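
To check whether a value needs quoting, the patterns of the YAML 1.2 Core Schema can be applied directly. The following Python sketch is only an illustration with a hypothetical helper name; it classifies plain scalars and does not treat << specially.

```python
import re

# Patterns of the YAML 1.2 Core Schema, checked in order.
CORE_SCHEMA = [
    ("null", r"null|Null|NULL|~|"),
    ("bool", r"true|True|TRUE|false|False|FALSE"),
    ("int", r"[-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+"),
    ("float", r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?"),
    ("float", r"[-+]?\.(inf|Inf|INF)"),
    ("float", r"\.(nan|NaN|NAN)"),
]

def plain_scalar_type(text):
    """Return the type an unquoted scalar would be loaded as."""
    for type_name, pattern in CORE_SCHEMA:
        if re.fullmatch(pattern, text):
            return type_name
    return "str"

for value in ["15.1", "1", "cirrus", "true", "0o755", ""]:
    print(repr(value), "->", plain_scalar_type(value))
```

This shows why the examples above quote '15.1' and '1': without quotes they would be loaded as numbers rather than strings, whereas a value like cirrus is safe unquoted.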

Use of the REST API

openQA includes a client script which - depending on the distribution - is packaged independently to allow interfacing with an existing openQA instance without needing to install openQA itself. Call openqa-cli --help for help. The sub-commands provide further help, e.g. openqa-cli api --help contains a lot of examples.

This section focuses on particular API use-cases. Check out the openQA client section for further information about the client itself, how authentication works and how plain curl can be used.

Finding tests

The following example lists all jobs within the job group with the ID 1 and the setting BUILD=20210707 on openqa.opensuse.org:

curl -s "https://openqa.opensuse.org/api/v1/jobs?groupid=1&build=20210707" | jq

The tool jq is used for pretty-printing in this example but it is also useful for additional filtering (see jq’s tutorial).

However, openQA’s API provides many more filters on its own. These can be used by adding additional query parameters, e.g.:

  • ids/state/result: Return only jobs with matching ID/state/result. Multiple IDs/states/results can be specified by repeating the parameter or by passing comma-separated values.

  • distri/version/build/test/arch/machine/worker_class/iso/hdd_1: Return only jobs where the job settings match the specified values like in the example above. Note that it is not possible to filter by arbitrary job settings; only a fixed set of settings is supported, and the list given here might not be complete.

  • groupid/group: Return only jobs within the job group with the specified ID/name like in the example above. These parameters are mutually exclusive, groupid has precedence.

  • latest=1: De-duplicates, so that for the same DISTRI, VERSION, BUILD, TEST, FLAVOR, ARCH and MACHINE only the latest job is returned.

  • limit/page: Limit the number of returned jobs and allow pagination, e.g. page=2&limit=10 would only show results 11-20.

  • modules/modules_result: Return only jobs which have a test module with the specified name/result.

  • before/after: Return only jobs with a job ID less/greater than the specified job ID.

  • scope=current: Returns only jobs which have not been cloned yet.

  • scope=relevant: Returns only jobs which have not been obsoleted yet and which have not been cloned yet. Clones which are still pending do not count.

Remarks

  • All parameters can be combined with each other unless stated otherwise.

  • Unless stated otherwise (as for ids/state/result), only the last occurrence is taken into account when the same parameter is specified multiple times.

  • All values are matched exactly, so e.g. group=openSUSE+Leap+15 returns only jobs within the group openSUSE Leap 15 but not jobs from the group openSUSE Leap 15 ARM. This applies to parameters for filtering job settings as well.
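
When such queries are built by a script, it helps to let a library handle the encoding. The following Python sketch only constructs the URL and sends no request; the helper name jobs_query_url and the chosen result values are illustrative.

```python
from urllib.parse import urlencode

def jobs_query_url(host, **filters):
    """Build an /api/v1/jobs URL from keyword filters."""
    params = {}
    for key, value in filters.items():
        # Lists become comma-separated values (documented for ids/state/result).
        if isinstance(value, (list, tuple)):
            value = ",".join(str(v) for v in value)
        params[key] = value
    # safe="," keeps the comma separators readable instead of %2C.
    return f"{host}/api/v1/jobs?{urlencode(params, safe=',')}"

url = jobs_query_url("https://openqa.opensuse.org",
                     groupid=1, build="20210707",
                     result=["failed", "incomplete"], latest=1)
print(url)
print(jobs_query_url("https://openqa.opensuse.org", group="openSUSE Leap 15"))
```

The second call encodes the spaces in the group name as +, matching the group=openSUSE+Leap+15 form used in the remarks above.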

Triggering tests

Tests can be triggered in multiple ways: using openqa-clone-job, jobs post or isos post, as well as by retriggering existing jobs or whole media via the web UI.

Cloning existing jobs - openqa-clone-job

If one wants to recreate an existing job from any publicly available openQA instance the script openqa-clone-job can be used to copy the necessary settings and assets to another instance and schedule the test. For the test to be executed it has to be ensured that matching resources can be found, for example a worker with matching WORKER_CLASS must be registered. More details on openqa-clone-job can be found in Writing Tests.

Spawning single new jobs - jobs post

Single jobs can be spawned using the jobs post API route. All necessary settings on a job must be supplied in the API request. The "openQA client" has examples for this.

Further examples for advanced dependency handling

It is possible to spawn a single set of jobs using just one API call, e.g.:

openqa-cli api -X POST jobs TEST:0=first-job TEST:1=second-job _START_AFTER:1=0

The suffixes 0 and 1 are actually freely chosen and are merely used to specify which parameters belong to which job and how they depend on each other.

This creates a job with TEST=first-job and one with TEST=second-job and the second job will be started after the first. Of course other types of dependencies are possible as well (via _PARALLEL and _START_DIRECTLY_AFTER). Note that this kind of call will return the resulting job ID for each suffix that has been used, e.g.:

{"ids":{"0":2531,"1":2530}}

To use colons within a settings key, just add a trailing :, e.g.:

openqa-cli api -X POST jobs TEST=test KEY:WITH:COLONS:=example
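
The suffix convention can be illustrated with a few lines of Python. This is a sketch of the rules described above, not the parser openQA uses, and the function name is made up. Keys without a suffix are collected under None here; how openQA treats them in a multi-job request is not covered by this sketch.

```python
def split_job_params(pairs):
    """Group KEY[:SUFFIX]=VALUE pairs by job suffix."""
    jobs = {}
    for pair in pairs:
        key, value = pair.split("=", 1)
        if key.endswith(":"):
            # Trailing colon: the key itself contains colons, no suffix.
            key, suffix = key[:-1], None
        elif ":" in key:
            # The part after the last colon is the job suffix.
            key, suffix = key.rsplit(":", 1)
        else:
            suffix = None
        jobs.setdefault(suffix, {})[key] = value
    return jobs

print(split_job_params(["TEST:0=first-job", "TEST:1=second-job", "_START_AFTER:1=0"]))
print(split_job_params(["TEST=test", "KEY:WITH:COLONS:=example"]))
```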

Spawning multiple jobs based on templates - isos post

The most common way of spawning jobs on production instances is using the isos post API route. Jobs are triggered by matching the request against the templates defined by the settings for media, job groups, machines and test suites. These settings need to be defined beforehand on the corresponding pages of the web UI (accessible to operators from the user menu). The section on job templates already explains details about these tables. Alternatively, these settings can be supplied via a YAML document.

In addition to the necessary template matching parameters, more parameters can be specified which are forwarded to all triggered jobs. There are also special parameters which only influence the way the triggering itself is done. These parameters all start with a leading underscore but are set as request parameters in the same way as the other parameters.

The following scheduling parameters exist:

_OBSOLETE

Obsolete jobs in older builds with same DISTRI and VERSION (The default behavior is not obsoleting). With this option jobs which are currently pending, for example scheduled or running, are cancelled when a new medium is triggered.

_DEPRIORITIZEBUILD

Setting this switch to '1' will deprioritize the unfinished jobs of old builds, and it will obsolete the jobs once the configurable limit of the priority value is reached.

_DEPRIORITIZE_LIMIT

The configurable limit of priority value up to which jobs should be deprioritized. Needs _DEPRIORITIZEBUILD. Defaults to 100.

_ONLY_OBSOLETE_SAME_BUILD

Only obsolete (or deprioritize) jobs for the same BUILD. This is useful for cases where a new build appearing does not necessarily mean existing jobs for earlier builds with the same DISTRI and VERSION are no longer interesting, but you still want to be able to re-submit jobs for a build and have existing jobs for the exact same build obsoleted. Needs _OBSOLETE.

_SKIP_CHAINED_DEPS

Do not schedule parent test suites which are specified in START_AFTER_TEST or START_DIRECTLY_AFTER_TEST.

_GROUP

Job templates not matching the given group name are ignored. Does not affect obsoletion behavior.

_GROUP_ID

Same as _GROUP but allows specifying the group directly by ID.

_PRIORITY

Sets the priority value for the new jobs (which otherwise defaults to the priority of the job template).

__…

All parameters starting with __ will not be added as job settings. Those parameters can be used to store additional information about the scheduled product itself, e.g. the URL of a web page with more context.

Example for _DEPRIORITIZEBUILD and _DEPRIORITIZE_LIMIT:

openqa-cli api -X POST isos async=0 ISO=my_iso.iso DISTRI=my_distri \
         FLAVOR=sweet ARCH=my_arch VERSION=42 BUILD=1234 \
         _DEPRIORITIZEBUILD=1 _DEPRIORITIZE_LIMIT=120

NOTE By default scheduling products is done synchronously within the requests, corresponding to the parameter async=0. Use async=1 to avoid possible timeouts by performing the task in background. This is recommended on big instances but means that the results (and possible errors) need to be polled via openqa-cli api isos/$scheduled_product_id.
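
The effect of _DEPRIORITIZEBUILD and _DEPRIORITIZE_LIMIT on a single unfinished job of an old build can be pictured as follows. This Python sketch rests on assumptions that the text above does not spell out: the priority value is raised on each new build, the increment (step) is hypothetical, and the exact boundary handling at the limit may differ in openQA.

```python
def deprioritize(priority, limit=100, step=10):
    """Return ("obsolete", priority) or ("deprioritized", new_priority).

    `step` is a hypothetical increment; only the default limit of 100
    is taken from the parameter description.
    """
    new_priority = priority + step
    if new_priority > limit:
        # The limit would be exceeded, so the job is obsoleted instead.
        return ("obsolete", priority)
    return ("deprioritized", new_priority)

print(deprioritize(50))
print(deprioritize(95))
print(deprioritize(110, limit=120))
```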

Statistical investigation

In case issues appear sporadically and are therefore hard to reproduce it can help to trigger many more jobs on a production instance to gather more data first, for example the failure ratio.

Example of triggering 50 jobs in a development group so that the result of passed/failed jobs is counted by openQA itself on the corresponding overview page:

openqa-clone-job --skip-chained-deps --repeat=50 --within-instance \
https://openqa.opensuse.org 123456  BUILD=poo32242_investigation \
_GROUP="Test Development:openSUSE Tumbleweed"

To get an overview about the fail ratio and confidence interval of sporadically failing applications you can also use a script like this.
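
As a rough idea of what such an evaluation involves, the following Python sketch computes the failure ratio together with a Wilson score interval. The choice of the Wilson interval and the function name are assumptions made for this illustration; the script referenced above may use a different method.

```python
import math

def wilson_interval(failed, total, z=1.96):
    """Return (ratio, lower, upper) for `failed` failures in `total` jobs.

    z=1.96 corresponds to a confidence level of roughly 95 %.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = failed / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, max(0.0, center - half), min(1.0, center + half)

ratio, lower, upper = wilson_interval(failed=5, total=50)
print(f"fail ratio {ratio:.1%}, interval {lower:.1%} to {upper:.1%}")
```

With 5 failures in 50 runs the interval is still wide, which is why triggering a larger number of jobs helps before drawing conclusions.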

Defining test scenarios in YAML

Instead of relying on the tables for machines, mediums/products, test suites and job templates of the openQA instance, one can provide these definitions/settings also via a YAML document. This YAML document could be specific to a certain test distribution and stored in the same repository as those tests (making the versioning easier).

Warning
This feature is still experimental and may change in an incompatible way in future versions.

This YAML document can be specified via the scheduling parameter SCENARIO_DEFINITIONS_YAML:

openqa-cli api … -X POST isos --param-file SCENARIO_DEFINITIONS_YAML=/local/file.yaml …

This command will upload the contents of the local file /local/file.yaml to a possibly remote openQA instance. The YAML document will only be used within the scope of this particular API request. No settings are stored/altered on the openQA instance.

If the YAML document already exists on the openQA host, you can also use SCENARIO_DEFINITIONS_YAML_FILE which expects the file path of the YAML document on the openQA host.

The YAML document itself should define at least one product, machine and job template like this:

job_templates:
  create_hdd:
    settings:
      PUBLISH_HDD_1: 'example-%VERSION%-%ARCH%-%BUILD%@%MACHINE%.qcow2'
  boot_from_hdd:
    settings:
      HDD_1: 'example-%VERSION%-%ARCH%-%BUILD%@%MACHINE%.qcow2'
      START_AFTER_TEST: 'create_hdd'
      WORKER_CLASS: 'job-specific-class'

These definitions are used like their openQA-instance-wide counterparts (so continue reading the next section for more details on job templates).
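
The %NAME% placeholders in the example make the published and the consumed image resolve to the same file name. The following Python sketch illustrates the idea only; it is not openQA's expansion code, the function name is hypothetical, and leaving unknown placeholders untouched is an assumption.

```python
import re

def expand_placeholders(value, settings):
    """Replace %NAME% with settings[NAME]; leave unknown names untouched."""
    def substitute(match):
        return str(settings.get(match.group(1), match.group(0)))
    return re.sub(r"%(\w+)%", substitute, value)

# Example values chosen for illustration only.
job_settings = {"VERSION": "0", "ARCH": "x86_64", "BUILD": "1234", "MACHINE": "64bit"}
template = "example-%VERSION%-%ARCH%-%BUILD%@%MACHINE%.qcow2"
print(expand_placeholders(template, job_settings))
```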

Remarks

When scheduling a single test (variable TEST is specified) attempts to obsolete/deprioritize are prevented by default because this is likely not wanted. Use _FORCE_OBSOLETE or _FORCE_DEPRIORITIZEBUILD to nevertheless obsolete/deprioritize all jobs with matching DISTRI, VERSION, FLAVOR and ARCH.

Job template YAML

Job groups can be queried via the experimental REST API:

api/v1/experimental/job_templates_scheduling

The GET request will get the YAML for one or multiple groups while a POST request conversely updates the YAML for a particular group.

Two scripts using these routes can be used to import and export YAML templates:

openqa-dump-templates --json --group test > test.json
openqa-load-templates test.json

Asset handling

Multiple parameters exist to reference "assets" to be used by tests. "Assets" are essentially content that is stored by the openQA web-UI and provided to the workers. Things that are typically assets include the ISOs and other images that are tested, for example.

Some assets can also be produced by a job, sent back to the web-UI, and used by a later job (see explanation of 'storing' and 'publishing' assets, below). Assets can also be seen in the web-UI and downloaded directly (though there is a configuration option to hide some or all asset types from public view in the web-UI).

Assets may be shared between the web-UI and the workers by having them literally use a shared filesystem (this used to be the only option), or by having the workers download them from the server when needed and cache them locally. Check out the documentation about asset caching for more on this.

Specifying assets required by a job

The following job settings specify that an asset is required by a job:

  • ISO (type iso)

  • ISO_n (type iso)

  • HDD_n (type hdd)

  • UEFI_PFLASH_VARS (type hdd) (in some cases, see below)

  • REPO_n (type repo)

  • ASSET_n (type other)

  • KERNEL (type other)

  • INITRD (type other)

Where you see e.g. ISO_n, that means ISO_1, ISO_2 etc. will all be treated as assets.

The values of the above parameters are expected to be the name of a file - or, in the case of REPO_n, a directory - that exists under the path /var/lib/openqa/share/factory on the openQA web-UI. That path has subdirectories for each of the asset types, and the file or directory must be in the correct subdirectory, so e.g. the file for an asset HDD_1 must be under /var/lib/openqa/share/factory/hdd. You may create a subdirectory called fixed for any asset type and place assets there (e.g. in /var/lib/openqa/share/factory/hdd/fixed for hdd-type assets): this exempts them from the automatic cleanup described in the section about asset cleanup. Non-fixed assets are always subject to the cleanup.

UEFI_PFLASH_VARS is a special case: whether it is treated as an asset depends on the value. If the value looks like an absolute path (starts with /), it will not be treated as an asset (and so the value should be an absolute path for a file which exists on the relevant worker system(s)). Otherwise, it is treated as an hdd-type asset. This allows tests to use a stock base image (like the ones provided by edk2) for a simple case, but also allows a job to upload its image on completion - including any changes made to the UEFI variables during the execution of the job - for use by a child job which needs to inherit those changes.
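
The mapping from job settings to asset locations described above can be summarized in a short Python sketch. It only encodes the rules from this section; the function name and the fixed flag are made up for illustration.

```python
import re
from pathlib import PurePosixPath

FACTORY = PurePosixPath("/var/lib/openqa/share/factory")

# Setting name patterns and the asset type they map to.
ASSET_TYPES = [
    (r"ISO(_\d+)?", "iso"),
    (r"HDD_\d+", "hdd"),
    (r"UEFI_PFLASH_VARS", "hdd"),
    (r"REPO_\d+", "repo"),
    (r"ASSET_\d+|KERNEL|INITRD", "other"),
]

def asset_path(key, value, fixed=False):
    """Return the expected web-UI path for a job setting, or None."""
    if key == "UEFI_PFLASH_VARS" and value.startswith("/"):
        return None  # absolute path: a worker-local file, not an asset
    for pattern, asset_type in ASSET_TYPES:
        if re.fullmatch(pattern, key):
            directory = FACTORY / asset_type
            return directory / "fixed" / value if fixed else directory / value
    return None  # not an asset setting

print(asset_path("HDD_1", "base.qcow2"))
print(asset_path("HDD_1", "base.qcow2", fixed=True))
print(asset_path("UEFI_PFLASH_VARS", "/path/on/worker/vars.fd"))
```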

You can also use special suffixes to the basic parameter forms to access some special handling for assets.

The following suffixes exist:
_URL

Before starting these jobs, try to download these assets into the relevant asset directory of the openQA web-UI from trusted domains specified in /etc/openqa/openqa.ini. For example, ISO_1_URL=http://trusted.com/foo.iso would, if trusted.com is set as a trusted domain, cause openQA to download the file foo.iso to /var/lib/openqa/share/factory/iso and set ISO_1=foo.iso. If you set both ISO_1 and ISO_1_URL, the file pointed to by ISO_1_URL will be downloaded and renamed to the name set as ISO_1.

_DECOMPRESS_URL

Specify a compressed asset to be downloaded that will be uncompressed by openQA. For example, ISO_1_DECOMPRESS_URL=http://host/foo2.iso.xz will download the file foo2.iso.xz, uncompress it to foo2.iso, store it in /var/lib/openqa/share/factory/iso and set ISO_1=foo2.iso. Again, you can also set ISO_1 to change the name under which the uncompressed file is stored.
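
How the resulting asset names are derived from the two suffixes can be sketched in Python as follows. This is an illustration with a hypothetical function name, not the download code of openQA: matching trusted domains by exact host name and stripping only the last file extension are assumptions.

```python
import posixpath
from urllib.parse import urlparse

def resolve_download(settings, trusted_domains):
    """Map *_URL and *_DECOMPRESS_URL settings to target asset names."""
    resolved = {}
    for key, url in settings.items():
        for suffix, decompress in (("_DECOMPRESS_URL", True), ("_URL", False)):
            if key.endswith(suffix):
                break
        else:
            continue  # not a download setting
        parsed = urlparse(url)
        if parsed.hostname not in trusted_domains:
            raise ValueError(f"{parsed.hostname} is not a trusted domain")
        name = posixpath.basename(parsed.path)
        if decompress:
            name = posixpath.splitext(name)[0]  # foo2.iso.xz -> foo2.iso
        base = key[: -len(suffix)]
        # An explicitly set base parameter determines the final name.
        resolved[base] = settings.get(base, name)
    return resolved

print(resolve_download({"ISO_1_URL": "http://trusted.com/foo.iso"}, {"trusted.com"}))
print(resolve_download({"ISO_1_DECOMPRESS_URL": "http://host/foo2.iso.xz"}, {"host"}))
```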

Specifying assets created by a job

Jobs can upload assets to the web-UI so other jobs can use them as HDD_n and UEFI_PFLASH_VARS assets as described in the previous section.

To declare an asset to be uploaded, you can use the job settings PUBLISH_HDD_n and PUBLISH_PFLASH_VARS. For instance, if you specify PUBLISH_HDD_1=updated.qcow2, the HDD_1 disk image as it exists at the end of the test will be uploaded back to the web-UI and stored under the name updated.qcow2. Any other job can then specify HDD_1=updated.qcow2 to use this published image as its HDD_1.

Important
Assets that are already existing will be overridden. If the same asset is uploaded by multiple jobs concurrently this will lead to file corruption. So be sure to use unique names or use private assets as explained in the subsection below.
Note
By default, assets are only uploaded if the job completes successfully. To force publishing assets even in case of a failed job one can specify the FORCE_PUBLISH_HDD_ variable.
Note
When using this mechanism you will often also want to use the variable expansion mechanism.

Private assets

There is a mechanism to alter an asset’s file name automatically to associate it with the particular job that produced it (currently, by prepending the job ID to the filename). To make use of it, use STORE_HDD_n (instead of PUBLISH_HDD_n). Those assets can then be consumed by chained jobs. For instance, if a parent job uploads an asset via STORE_HDD_1=somename.qcow2, its children can use it via HDD_1=somename.qcow2 without having to worry about naming conflicts.

Important
This only works if the uploading and the consuming jobs have a chained dependency. For more on "chained" jobs, see the documentation of job dependencies.
Note
Access to private assets is not protected. Theoretically, jobs outside the chain can still access the asset by explicitly prepending the ID of the creating job.

Cleanup of assets, results and other data

The cleanup of assets, test results and certain other data is automated. That means openQA removes assets, job results and other data automatically according to configurable limits.

All cleanup jobs run within the Minion job queue, normally provided by openqa-gru.service. The dashboard for Minion jobs is accessible via the administrator menu in the web UI. Only one cleanup job can run at the same time unless concurrent is set to 1 in the [cleanup] settings of openqa.ini. Many other cleanup-related settings can be found within openqa.ini as well, e.g. the […_limits] sections contain various tweaks and allow changing certain defaults. Check out the sub section Timers and triggers to learn more about how those jobs are triggered.

The cleanup of assets and job results (and certain other data) is happening independently of each other using different strategies and retention settings:

  • The further sub sections provide an overall description of the asset cleanup strategy and how to configure it.

  • The Basic cleanup settings section explains how to configure retentions, covering the job result cleanup as well. Also have a look at Build tagging which allows keeping certain jobs longer by marking them as important.

  • The Auditing section explains the cleanup of the audit log.

Cleanup strategy for assets

To find out whether an asset should be removed, openQA determines by which groups the asset is used. If at least one job within a certain job group is using an asset, the asset is considered to be used by that job group. If that job group is within a parent job group, the asset is considered part of that parent job group.

So an asset can belong to multiple job groups or parent job groups. The assets table which is accessible via the admin menu shows these groups for each asset and also the latest job.

While an asset might belong to multiple groups, it is only accounted to the group with the highest asset limit which still has enough room to hold that asset. That basically means that an asset is never counted twice.

If the size limit for assets of a group is exceeded, openQA will remove assets which belong to that group:

  • Assets belonging to old jobs are preferred.

  • Assets belonging to jobs which are still scheduled or running are not considered.

  • Assets which have been accounted to another group that has still space left are not considered.

Assets which do not belong to any group are removed after a configurable duration unless the files are still being updated. Keep in mind that this behavior is also enabled on local instances and affects all cloned jobs (unless cloned into a job group).

If an asset is just a symlink then only the symlink is cleaned up (but not the file or directory it points to).

'Fixed' assets - those placed in the fixed subdirectory of the relevant asset directory - are counted against the group size limit, but are never cleaned up. This is intended for things like base disk images which must always be available for a test to work. Note that relative symlinks in the regular assets directory that point into the fixed subdirectory are also preserved.
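
The accounting rules of this section can be condensed into a simplified Python model. It is a sketch under assumptions and not the actual cleanup code: newer assets are accounted first, parent job groups are ignored, and groupless assets are skipped because they are handled by the duration-based cleanup. All names are hypothetical.

```python
def plan_asset_cleanup(assets, group_limits):
    """Return the names of assets that would be removed.

    `assets` is a list of dicts with the keys name, size, groups,
    latest_job (higher = newer), pending and fixed.
    """
    remaining = dict(group_limits)
    to_remove = []
    # Assumption: newer assets are accounted first, so old ones lose out.
    for asset in sorted(assets, key=lambda a: a["latest_job"], reverse=True):
        if not asset["groups"]:
            continue  # groupless assets follow the duration-based cleanup
        # Prefer the group with the highest limit that still has room.
        candidates = sorted(asset["groups"], key=lambda g: group_limits[g], reverse=True)
        home = next((g for g in candidates if remaining[g] >= asset["size"]), None)
        if home is not None:
            remaining[home] -= asset["size"]
        elif asset["fixed"]:
            # Fixed assets count against the limit but are never removed.
            remaining[candidates[0]] -= asset["size"]
        elif not asset["pending"]:
            to_remove.append(asset["name"])
    return to_remove

limits = {"A": 10, "B": 5}
assets = [
    {"name": "new.iso", "size": 6, "groups": ["A"], "latest_job": 3, "pending": False, "fixed": False},
    {"name": "mid.iso", "size": 4, "groups": ["A", "B"], "latest_job": 2, "pending": False, "fixed": False},
    {"name": "old.iso", "size": 6, "groups": ["A"], "latest_job": 1, "pending": False, "fixed": False},
]
print(plan_asset_cleanup(assets, limits))
```

In this example new.iso and mid.iso together fill group A, so old.iso no longer fits into any of its groups and is the only removal candidate.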

Configuring limit for assets within job groups

To configure the maximum size for the assets of a group, open 'Job groups' in the operators menu and select a group. The size limit for assets can be configured under 'Edit job group properties'. It also shows the size of assets which belong to that group and not to any other group.

The default size limit for job groups can be adjusted in the default_group_limits section of the openQA config file.

Configuring limit for groupless assets

Assets not belonging to jobs within a group are deleted automatically after a certain number of days. That duration can be adjusted by setting untracked_assets_storage_duration in the misc_limits section of the openQA config to the desired number of days.

In less trivial cases where a common limit is not enough or certain assets need more fine-grained control, patterns based on the filename can be used. The patterns are interpreted as Perl regular expressions and if a pattern matches the basename of an asset the specified duration in days will be used. In simple cases the pattern is just a match on a word.

Consider the following examples to specify custom limits that would match assets with the names testrepo-latest and openSUSE-12.3-x86_64.iso.

[assets/storage_duration]
latest = 30
openSUSE.+x86_64 = 10

Note that modifications to the file will count against the limit, so if an asset was updated within the timespan it will not be removed.
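
How the patterns select a duration can be tried out with a few lines of Python. This is only a sketch with hypothetical function names: Python's re module stands in for Perl regular expressions (equivalent for simple patterns like these), the default of 14 days is an example value, and letting the first matching pattern win is an assumption.

```python
import re

def storage_duration_days(filename, patterns, default_days):
    """Return the retention in days for a groupless asset."""
    for pattern, days in patterns.items():
        # Assumption: the first matching pattern in config order wins.
        if re.search(pattern, filename):
            return days
    return default_days

def is_expired(age_days, filename, patterns, default_days):
    """`age_days` is the time since the file was last modified."""
    return age_days > storage_duration_days(filename, patterns, default_days)

patterns = {"latest": 30, "openSUSE.+x86_64": 10}
for name in ["testrepo-latest", "openSUSE-12.3-x86_64.iso", "other.qcow2"]:
    print(name, storage_duration_days(name, patterns, default_days=14))
```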

Timers and triggers

Cleanup can be triggered in different ways. One option is to use minion_task_triggers and specify tasks via on_job_done. Another way to do that is to use the systemd timers openqa-enqueue-*-cleanup to periodically run tasks. Both can be used separately or in combination.

The relevant Minion tasks are:

  • limit_assets

  • limit_audit_events

  • limit_bugs

  • limit_results_and_logs

  • limit_screenshots

These are no-ops if a task is already running so they can safely be enqueued repeatedly. Note that the tasks can still take considerable time computing what to delete, from seconds to minutes. The tasks can be enabled in the corresponding config file section.

Disabling cleanup

By default the cleanup is enabled with systemd timers if available. To completely disable cleanup, make sure that no Minion cleanup tasks are enabled in the config file and mask individual or all cleanup systemd timers, for example for the asset cleanup:

systemctl mask openqa-enqueue-asset-cleanup.timer

CLI interface

Beside the daemon argument to run the actual web service the openQA startup script /usr/share/openqa/script/openqa supports further arguments.

For a full list of those commands, just invoke /usr/share/openqa/script/openqa -h. This also works for sub-commands (e.g. /usr/share/openqa/script/openqa minion -h, /usr/share/openqa/script/openqa minion job -h).

Note that prefork is only supported for the main web service but not for other services like the live view handler.

Suggested workflow for test review

If an openQA instance is only used by one or a few individuals, often no strict process needs to be defined for how openQA tests should be reviewed and how individual results should be handled. If the group of test reviewers grows, openQA and the ecosystem around openQA offer some helpful features and approaches.

In particular for a big user base it is important to formalize how decisions are made and how tasks are delegated. Structured comments on the openQA platform can be used for this. With a comment on openQA in the right format one can make a decision, inform automatic tools as well as other users at the same time, and have traceable documentation of the actions taken.

  • In openQA, parent job groups can be defined with multiple job groups. This allows segmenting tests by the scope of individual review teams. The parent job group overview pages as well as the central index page of openQA show "bullet list" icons that bring you directly to a combined test overview showing results from all sub groups. This makes it possible to keep queries ready like https://openqa.opensuse.org/tests/overview?groupid=1&groupid=2&groupid=3 which show all openQA test failures within the hierarchy of test results. This can be combined with the flag "todo=1" (click the "TODO" checkbox in the filter box on test overview pages) to show only tests that need review. Other combinations of queries are possible, e.g. https://openqa.opensuse.org/tests/overview?build=my-build&todo=1 to show all test results that need review for build "my-build".

  • https://github.com/os-autoinst/openqa_review can be used to produce multiple different generated reports, e.g. all tests that need review, tests that are linked to closed bugs, etc.

  • Use auto-review to handle flaky issues and even automatically retrigger the corresponding tests.

  • In case of known sporadic issues that cannot be fixed quickly, consider automatic retries of jobs: http://open.qa/docs/#_automatic_retries_of_jobs

  • In case of known non-sporadic test issues that cannot be fixed quickly, consider overwriting the result of jobs: http://open.qa/docs/#_overwrite_result_of_job

  • For the SUSE maintenance test workflows a "branding" specific approach is provided: In case of needing to urgently release individual maintenance updates before test failures can be resolved consider instructing qem-bot, the automation validating and approving release requests based on openQA test results, to ignore individual job failures for specific incidents. See https://progress.opensuse.org/issues/95479#Suggestions for the necessary comment format or use the comment template from the openqa.suse.de comment edit window.

Where to now?

For test developers it is recommended to continue with the Test Developer Guide.