[WIP] Add lower-level runner API to support non-browser runners #524

thibaudcolas · 2020-05-25T01:07:12Z

Initially discussed on Slack. Currently just a draft PR with lots of missing parts.

Pa11y supports custom runners, but those runners are only executed within the context of the browser (with Puppeteer’s page.evaluate). I would like to make runners that rely on "server-side" processing, either Node modules that cannot run in the browser, or CLI tools from other languages that can only be integrated with Node code.

This PR reworks the runners’ integration to introduce a new lower-level API that supports the scenario described above, as well as making it possible for runners to manipulate the Puppeteer page if they want to – e.g. to read the page’s accessibility tree, or see if the page has horizontal scrolling on smaller viewport sizes.

This implementation does two things:

Moves the logic of executing individual runners and collating their issues from the browser context to Node. This change alone should be completely backwards-compatible and not impact Pa11y’s behavior.
Introduces a new "processPage" method for runners, which is invoked instead of run when present, and gives runners access to the Puppeteer page, expecting an array of issues back.

I’m not particularly convinced this is the best abstraction / API, but it’s the one that was the least work to get experimenting with.

There are a few things I haven’t touched yet and would likely need to address in some way:

loadRunnerScript still executes even if the runner has no scripts nor run method.
- Because of this, the runner still needs to have an empty scripts: [], and run: () => {}.
- assertReporterCompatibility only runs within loadRunnerScript. If loadRunnerScript is changed, that compatibility check would need to be done elsewhere.
All of the issue filtering logic is still in the Pa11y browser runner context only, and it’s unclear how compatible it would be with runners that aren’t in the browser,
- Retrieving the context and selector as Pa11y does relies on the runner providing a DOM element.
- The ignore by code only works for runners that have codes :)
- The rootElement filtering depends on the DOM
- hideElements also depends on the DOM

To try this out, here are install instructions with the runner I’m currently working on based on the v.Nu HTML validator:

# On macOS,
brew install vnu
# Elsewhere, grab the latest release from
# https://github.com/validator/validator/releases/latest.
# Make sure the `vnu` executable is in your PATH.
# Then,
npm install -g https://github.com/thibaudcolas/pa11y#feature/node-runners pa11y-runner-vnu

pa11y --runner vnu https://www.example.com/

Output

$ pa11y --runner vnu https://www.example.com/

Welcome to Pa11y

 > Running Pa11y on URL https://www.example.com/

Results for URL: https://www.example.com/

 • Error: A document must not include both a “meta” element with an “http-equiv” attribute whose value is “content-type”, and a “meta” element with a “charset” attribute.
   ├── html-validation
   ├──
   └── f-8"> <meta http-equiv="Content-type" content="text/html; charset=utf-8"> <

 • Notice:  The “type” attribute for the “style” element is not needed and should be omitted.
   ├── html-validation
   ├──
   └── e=1"> <style type="text/css"> b

 • Notice: Consider adding a “lang” attribute to the “html” start tag to declare the language of this document.
   ├── html-validation
   ├──
   └── TYPE html><html><head>

1 Errors
2 Notices

Here is the source of that runner: pa11y-runner-vnu, and extra install instructions. If you don’t want to install vnu, here is a much simpler dummy runner you can try this with:

module.exports = {
  // Update this when an API-compatible Pa11y gets released.
  supports: "^6.0.0-alpha || ^6.0.0-beta",
  // Needs to be defined even if this runner does not rely on any scripts.
  scripts: [],
  // needs to be defined even though it’s empty.
  run: () => {},
  processPage: async (page) => {
    const html = await page.content();

    if (!html.includes("<html lang=")) {
      return [
        {
          code: "html-has-lang",
          message: "<html> elements must have a `lang` attribute",
          type: "error",
          context: "",
          selector: "",
        },
      ];
    }

    return [];
  },
};

thibaudcolas · 2020-06-23T15:29:35Z

Quick update a month on – this is working quite well for me, with pa11y-runner-vnu, and I’ve also tried a few others since – pa11y-runner-htmlhint (a general-purpose HTML linter), and pa11y-runner-curlylint (an experimental templates linter I’m working on).

I have more accessibility auditing work scheduled in about a month from now so will hopefully be able to invest more time into this then, particularly looking at:

De-duplicating issues when using so many different runners (I’d want to use Axe, HTML CS, V.Nu, curlylint all together)
Creating meaningful reports with issues from different runners. Currently I’m doing reports by hand, but if I manage to run all of my automated testing tools with Pa11y then I might as well have Pa11y produce the reports too.

In the meantime, I’d be interested to get feedback on this concept and the actual implementation – I’ve left the PR in a WIP/draft state but I’d be more than happy to finish it and add tests if it has a chance of getting merged.

josebolos · 2020-06-23T16:40:12Z

Hi @thibaudcolas

That's excellent news. I'm currently working on some refactoring of pa11y. If you don't mind giving me a few more days, I'll have a look at this PR so then we can discuss what would be the best way forward.

Thanks again!

joeyciechanowicz · 2020-06-26T07:31:29Z

Right, finally got round to looking at this!

This is some great thinking and work you've put into this @thibaudcolas
I'm all up for this idea of allowing a runner to either run within the page, or to run on the page context.
At its core I think pa11y should be a way of running a series of tests against a page and collating the results into a common form for easy reporting. With that in mind I think we can do some things to future proof ourselves a bit.

Thinking quickly:

we should give runners the page and browser context so it can do anything there it needs to (i.e. CDP stuff)
remove the need to provide empty elements/functions in the runner config
fix some namings so it's clearer what the functions now do (within pa11y).

All in all, I think this is a good step in the right direction and get's a 👍 from me.

thibaudcolas · 2020-07-02T10:01:13Z

Thank you @joeyciechanowicz! I’ll work on a second draft, and start writing tests for all of this.

thibaudcolas added 2 commits May 24, 2020 17:18

Refactor Pa11y run to evaluate each runner separately

379449c

Implement low-level runner#processPage API

5b45520

danyalaytekin added the type: enhancement label Mar 15, 2024

danyalaytekin changed the title ~~WIP: Add lower-level runner API to support non-browser runners~~ [WIP] Add lower-level runner API to support non-browser runners Mar 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Add lower-level runner API to support non-browser runners #524

[WIP] Add lower-level runner API to support non-browser runners #524

thibaudcolas commented May 25, 2020

thibaudcolas commented Jun 23, 2020 •

edited

josebolos commented Jun 23, 2020

joeyciechanowicz commented Jun 26, 2020

thibaudcolas commented Jul 2, 2020

[WIP] Add lower-level runner API to support non-browser runners #524

Are you sure you want to change the base?

[WIP] Add lower-level runner API to support non-browser runners #524

Conversation

thibaudcolas commented May 25, 2020

thibaudcolas commented Jun 23, 2020 • edited

josebolos commented Jun 23, 2020

joeyciechanowicz commented Jun 26, 2020

thibaudcolas commented Jul 2, 2020

thibaudcolas commented Jun 23, 2020 •

edited