Skip to content

Latest commit

 

History

History
92 lines (60 loc) · 4.28 KB

20240501.md

File metadata and controls

92 lines (60 loc) · 4.28 KB

QEMU TCG RVV optimization: Weekly report 2024-05-01

This week is our initial kick-off meeting.

Overview

Introducing the project team

  • Paolo Savini (Embecosm)
  • Helene Chelin (Embecosm)
  • Jeremy Bennett (Embecosm)
  • Hugh O'Keeffe (Ashling)
  • Daniel Barboza (Ventana)

Work completed since last report

Nothing this week, but we'll report in future weeks against the previous week's planned activities.

Work planned for the coming week

  • WP1:
    • Pull together all the memory/string operations benchmarks and obtain baseline QEMU instruction counts.
    • Establish framework for single RVV instruction benchmarks and obtain example performance scores.
    • Obtain reference QEMU SPEC CPU 2017 instruction count baselines with and without RVV.
      • decide on whether to enable any other ISA extensions (e.g. Zvfh/Zfh) and whether to use LTO.

Detailed description of work

Proposed priorties

The Statement of Work proposes the following priorities, which trade off functionality targeted versus architectures supported.

  • vector load/store ops for x86_64 AVX
  • vector load/store ops for AArch64/Neon
  • vector integer ALU ops for x86_64 AVX
  • vector load/store ops for Intel AVX10

For each of these there will be an analysis phase and an optimization phase, leading to the following set of work packages.

  • WP0: Infrastructure
  • WP1: Analysis of vector load/store ops on x86_64 AVX
  • WP2: Optimization of vector load/store ops on x86_64 AVX
  • WP3: Analysis of vector load/store ops on AArch64/Neon
  • WP4: Optimization of vector load/store ops on AArch64/Neon
  • WP5: Analysis of integer ALU ops on x86_64 AVX
  • WP6: Optimization of integer ALU ops on x86_64 AVX
  • WP7: Analysis of vector load/store ops on Intel AVX10
  • WP8: Optimization of vector load/store ops on Intel AVX10

These priorities can be revised by agreement with RISE during the project.

WP0: Initial setup and admin

Compiler tool chains

We will generally use GCC from the pre-built RISC-V tool chains from the Embecosm website download page. Where we need to use the most up-to-date compiler, we will build our own tool Linux tool chain using the same scripts as the Embecosm website.

SPEC CPU 2017

We have SPECCPU 2017 installed and routinely measure performance in-house. In future weeks we will publish the QEMU instruction count results in this report. However given the time it takes to complete a run with vector enabled, we do not anticipate updating the full results every week.

QEMU

We have a public fork of the official QEMU github mirror, which we will use as a staging post for our development work pending submission upstream. Patches will be submitted according to the QEMU guidelines (e.g. many small compact patches are preferred over few big ones). Daniel Barboza will advice on such guidelines and help with the review process.

Statistics

This week we show only the section headings. In future weeks we shall report statistics here, in most cases showing deltas against the previous results.

Memory/string operations

Individual RVV instruction performance

SPEC CPU 2017 performance

SPEC CPU 2017 is designed to be reported using timings. We have found it clearest to report QEMU instruction counts as though they were from a machine executing 10^9 instructions per second. Then all the standard SPEC CPU scripts work seamlessly.

Actions

  • Paolo to review the generic issue from Palmer Dabbelt to identify ideas for optimization and benchmarks to reuse.
  • Daniel to advise Paolo on best practice for preparing QEMU upstream submissions.
  • The bionic benchmarks may be a useful source of small benchmarks.

Risk register

For each risk, we report the impact of it happening on the project (1 (minimal), 2 (serious) or 3 (project killer) and the likelihood of it occuring (from 1 - 10). Where the impact is 3, or where the product of impact and likelihood is > 10, we propose a mitigation strategy.

The risk register is held in a shared spreadsheet. We will keep it updated continuously and report any changes each week.

https://docs.google.com/spreadsheets/d/1mHNwGGGPJ-ls0pgCbvkSdGDoKW4vftzYWeIPPYZYfjY/edit?usp=sharing

Planned absences

Helene will be on vacation from the 1st to the 12th of May.