Minutes 2019 09 26

GPU Web 2019-09-26 New Orleans F2F Day 1

Chair: Corentin, Dean

Scribe: Austin, Kai, Ken

Location: New Orleans

Minutes for Day 2

TL;DR

Status updates:
- CW’s WebGPU Status presentation (link). Need to focus on spec and CTS.
- Apple:
  - WSL implementation can run the Babylon.js helmet demo.
  - API side mostly just worked.
  - Made a blog post about WSL and more demos (MotionMark, ComputeBoids)
- Google:
  - Progress getting WebGPU integrated on Linux, Windows and ChromeOS. Lots of features.
  - Upstreamed the CTS. 8.5kb SPIR-V assembler. SPIR-V exec env mostly implemented (validation and translation from/to Vulkan)
- Intel slides
  - Big TF.js improvements. WebML team looking at interaction between WebML and WebGPU.
- Microsoft: Lazy resource clearing in Dawn, hooking with Chromium on Windows.
- Mozilla:
  - Lots of features in the native side. Subresource tracking, swapchain, …
  - Looking to bindings and hoping to have something for EOY
Shading language: DN’s slides about how using SPIR-V for WebGPU would work.
- Clarifications about KHR / EXT extension, multi-version SPIR-V spec, opcode reservation, capabilities.
- Discussion about subsetting + extension being the same as forking or not, and exposing new HW features as extensions that aren’t in core SPIR-V
- Discussion of the concern that, if WebGPU chooses SPIR-V, it should adopt a feature on its own merit and not just because the feature exists in SPIR-V.
Shading language: discussion on the 3 items of disagreement Apple identified:
- 1. Apple would prefer the spec to be “one spec”
  - It would be a “higher quality outcome” and ease implementer’s job.
  - Discussion around this, and how many existing specs are “one spec”.
  - Tooling should be able to simplify a lot of the SPIR-V spec for WebGPU’s purpose.
  - Worry “one spec” discards institutional knowledge the graphics world has about SPIR-V.
- 2. Apple would prefer the group to have control and ownership over the spec
  - Discussions about upstreaming changes one direction or another.
  - Discussion around copying spec in gpuweb org and what it would imply.
  - It isn’t clear which forum is best to have the input from all hardware vendors.
- 3. Apple would strongly prefer a text-based language to a binary one.
  - Why not do SPIR-V Assembly ++ instead of WSL?
  - Suggestion to have a builtin module that takes some text-based representation of SPIR-V and returns the binary.
  - Discussion of who the target for a text-based representation is.
  - Being text-based is better all things being equal.
  - Concern that all of the graphics ecosystem will go through binary SPIR-V so text wouldn’t be very useful.
  - Discussion around GLSL already being a text-based language that browser ingests.
- Google still wants a pinch point in SPIR-V. Would be ok putting a compiler to SPIR-V as a builtin module. Discussion around what the builtin module would ingest.
  - Suggestion to make the layered API produce an opaque blob, basically make WebGPU ingest both languages.
  - Concern that “both” will lead to varying implementation quality for each language depending on the browser.
  - Huge value in being able to piggy-back on the SPIR-V ecosystem
Shading language: discussion with Nicolas Capens of Swiftshader.
- Experience is that SPIR-V is easy to implement and unambiguous.
- Discussion about the line of code merits of SPIR-V / WSL.
Shading language: discussion with Eric Berdahl from Adobe
- Has huge shader codebase that can’t be translated manually. Importance of having shallow, controllable and trustworthy transpilation stack.
- Value of low-level representation is that it is close to what drivers use internally.
- Discussion of merits of view-source.
Specialization: how to address developers using text processing to specialize shaders?
- Specialization constants aren’t enough to modify the bindings.
- Something is being investigated for SPIR-V but no detail yet.

Tentative agenda

Status updates
Which shading languages for WebGPU?
Timeline with regards to W3C TAG review(s)
Multithreading! (re:)
Swapchain images usage and API (see investigation and proposal)
Exact semantics of GPUBuffer/Texture.destroy() (see this discussion)
Defaults for Pipeline and BindGroup layouts
Out of bounds drawcalls discussion (final act? was missing MM last time)
Initial extensions we want for WebGPU v1 (texture compression, subgroup, …?)
GPUShaderStageBit.NONE seems useless #193
Review new GPULimits entries?
Vertex buffer set API and offsets (see discussion)
Spec and CTS workshop
PR burndown
[YOUR ITEM HERE]
Agenda for next meeting

Attendance

Apple
- Dean Jackson
- Fil Pizlo
- Justin Fan
- Myles C. Maxfield
- Robin Morisset
- Saam Barati
Google
- Austin Eng
- Corentin Wallez
- Dan Sinclair
- David Neto
- James Darpinian
- John Kessenich
- Kai Ninomiya
- Ken Russell
- Shrek Shao
- Ryan Harrison
Intel
- Yunchao He
Microsoft
- Damyan Pepper
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert

Status Updates

CW: WebGPU Status Updates (link)
CW: High level structure, mostly there 90%. Missing shading language, multi-queue, un-swapchain? threading
CW: IDL is not as far along as the high level structure because it depends on it, but we’re at about 85%. Still waiting for TAP
CW: Spec 3%, CTS 8%?
CW: need to focus on writing the spec and the CTS. Would be nice if all browsers that have running implementations contributed.
DM: Implementation quality?
CW: Not relevant for this group -- that’s work we have to do on our own.
DM: Strange statement. Without implementations, there’s no point in having the group.
CW: the best work we can do on our implementations is the CTS. Best way to push quality.
DJ: do you think the CTS contributions should be similar to the spec? Just get a bunch of stuff in, and then we’ll create structure later?
KN: hard to go through a huge block of tests and make sure they cover all cases. General cleanups, we’ll do. If there’s some need, or something’s too complex, or we need more functionality, I’ll add it. Any time is good for adding tests. Have to be careful of thinking that some features are fully tested when they aren’t.
DN: Is there a plan for making sure tests are organized and we’ve actually covered everything in the spec? Can’t just throw tests in the repo.
KN: right now we’re only doing validation tests, grouped by function. Other tests, idea is that as we write the spec we want to write a test plan. Covered in last meeting. Need to make sure people are doing that - responsibility of spec editors.
MM: are you sure that “write it first in Chrome, and then write validation tests for it” is the best way of going about this?
KN: that’s not the approach. Francois has been running things against both Chrome and Safari right now. Doing it against multiple implementations and our collective knowledge of what the group intends to spec in the future. Easy enough to change tests to follow the spec. No reason we can’t update things later.
MM: Okay, you mentioned code coverage
KN: That would be later on as a sanity check for the tests, not as a driving force to write them; to make sure we don’t miss important cases. Not sufficient, but it is helpful.

Status Update - Apple

DJ: high-level status update - have an implementation, wrote a blog post about it, have some demos on it. Myles got Babylon.js demos working inside WebKit / Safari! He can talk about the actual process.
MM: The code works more or less right out of the box. A few places we had to change. We don’t support cube maps yet. We had to swap that with a 2D texture and use spherical coordinates. That was the only big thing.
DM: Did you convert all the shaders too?
MM: yes. Shaders are in GLSL, concatenated from snippets. Converted these to WSL and they more or less just worked. Didn’t have to do much to get them working.
CW: what about the API side? Babylon.js was written against Chrome. Did you have to make API changes to do what they needed?
MM: we didn’t have many changes. I have a list of them if you’re interested. The only big one was the cube map issue. It’s reassuring.
CW: That’s reassuring! Means without a spec we still have a good idea of where we’re going.
MM: Babylon.js was generating mipmaps using WebGL. In order to read back the contents of a mipmap in WebGL, you create an fbo and bind a level to it and readPixels(...). In WebGL 1, you can only bind the 0th level to an fbo, not the smaller levels. In order to read out the smaller levels, we had to work around it by generating mipmaps in software.
JG: There’s a WebGL 1 extension
CW: Sorry about that. Dawn had limitations at the time of writing the demo that made it hard to generate mipmaps using WebGPU (no subresource tracking). We just haven’t done the necessary tracking yet.
DM: Google has this limitation, Apple has it as well. Why did we not generate them by drawing them into the mip levels?
MM: We could have. Just didn’t.
DJ: Also the code was very simple to do it the way we did it.
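For readers unfamiliar with the workaround MM mentions, generating a mip chain in software can be sketched as repeated 2x2 box filtering. This is a minimal illustration only, not the code used in the Babylon.js port; the function names and the flat, row-major RGBA8 layout are assumptions made for the example.

```python
def downsample_rgba8(pixels, width, height):
    """Halve an RGBA8 image (flat list of ints, row-major) with a 2x2 box filter."""
    new_w, new_h = max(1, width // 2), max(1, height // 2)
    out = []
    for y in range(new_h):
        for x in range(new_w):
            # Clamp source coordinates so 1-pixel-wide or 1-pixel-tall levels still work.
            xs = (min(2 * x, width - 1), min(2 * x + 1, width - 1))
            ys = (min(2 * y, height - 1), min(2 * y + 1, height - 1))
            for c in range(4):
                total = sum(pixels[(sy * width + sx) * 4 + c] for sy in ys for sx in xs)
                out.append(total // 4)
    return out, new_w, new_h


def build_mip_chain(pixels, width, height):
    """Return [(pixels, width, height), ...] from level 0 down to 1x1."""
    levels = [(pixels, width, height)]
    while width > 1 or height > 1:
        pixels, width, height = downsample_rgba8(pixels, width, height)
        levels.append((pixels, width, height))
    return levels
```

Each level produced this way could then be uploaded individually, which avoids reading smaller levels back from the GPU.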
DJ: Flocking demo! Rewrote the shaders in WSL. The dev tools integration. I can inspect the canvas, see the shader code, change it, and have it reload the shader directly. WebInspector team added this fairly easily.
MM: For performance, we ran the MotionMark test Google ported. When you port that to Safari, we found that performance is now GPU bound instead of CPU bound. The more powerful the GPU, the more triangles you get. This is a really good sign! We’re doing the right thing and are on the right track.

Status Update - Google

CW: WebGPU is available behind a flag on Canary. In coming weeks it’ll be available in Canary on Windows.
RC: Soon, yea
CW: working on Linux as well. Requires suite of flags because Chrome doesn’t really handle having Vulkan turned on only for one subsystem. Hello, Cube is working on ChromeOS. Added a whole lot of features to Dawn. All backends at same time. Compressed textures (thanks Jiawei), dynamic buffer offsets (thanks Shaobo), render bundles. All texture formats.
DJ: all backends - Windows backend is now there?
CW: Dawn always had D3D12, Metal and Vulkan backends mostly in sync. This time the integration into Chrome is not in sync. Use IOSurface equivalents on other platforms.
AE: we do run the test harness on Linux and Windows, just don’t have presentation into the canvas working.
MM: are you going to update the webgpu.io page?
CW: yes, we will.
MM: you mentioned compressed textures, that’s not spec’ed, how did you do it?
CW: Francois or Jiawei has a pull request open. On one of the calls the WG agreed to not merge it until we have more spec text about how compressed textures integrate. Most validation in Dawn is coming from underlying APIs. Intel’s willing to contribute that part of the spec. We agreed compressed textures wouldn’t go in the main spec, but the concept of compressed texture formats with blocks, etc., would be in the spec. Formats would be extensions.
RC: Android support?
CW: We haven’t looked at that at all yet.
MM: WebGPU support in Safari works on iOS.
CW: someone’s made patches to make Dawn work on iOS.
CW: Kai upstreamed the CTS harness. Francois working on tests. Have beginning of library to convert from WebGPU in native to WebGPU in JS. Don’t know if you have Hello, Triangle? It’s mostly mechanical.
DN: Dan Sinclair is writing a SPIR-V assembler in JS. 8.5 KB. https://github.com/KhronosGroup/SPIRV-Tools/tree/master/tools/sva. Graphics robust access path - injecting sandboxing on memory accesses - is landed. Corentin’s picked it up, working with it in the WebGPU CTS.
MM: where’s it landed?
DN: it was in SPIR-V tools, now it’s been pulled into Dawn.
CW: Put it in Dawn in a simple and quick way. There’s end2end testing of it in the WebGPU CTS (not landed but draft). Uncovered some bugs with std layout.. something.
DJ: does it rewrite SPIR-V to SPIR-V, then you send to SPIR-V Cross?
DN: yes.
MM: you said you’re working on a SPIR-V assembler?
DN: yes. People might find it useful. Dan wrote an assembler. SPIR working group publishes a grammar of the instructions. Enums, syntax, operands. Lot of tools downstream use that to know how to parse things. Dan took the slice of functionality that’s proposed for WebGPU and only included that in the grammar, and only the bits for an assembler. Cut-down JSON file. Projection of what SPIR WG publishes. Assembler takes text input, tokenizes, parses, emits binary.
MM: So the input is a textual form of SPIR-V and the output is a binary form?
DN: Correct. It gets most of the assembly correct. There’s one set of rules I have for him to fix.
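The tokenize, parse, emit flow DN describes can be sketched roughly as follows. This is not Dan's assembler: the tiny `OPCODES` and `ENUMS` tables are hand-written stand-ins for the published grammar, the function names are hypothetical, and the sketch only handles instructions whose result id (if any) is the first operand. The opcode and enumerant numbers, the magic number, and the word-count/opcode packing follow the SPIR-V specification.

```python
import struct

# Tiny hand-written slice of the instruction table; a real tool would load this
# from the published grammar file. Values are from the SPIR-V specification.
OPCODES = {"OpMemoryModel": 14, "OpCapability": 17, "OpTypeVoid": 19}
ENUMS = {"Shader": 1, "Logical": 0, "GLSL450": 1}


def assemble_line(line, ids):
    """Tokenize one line such as '%void = OpTypeVoid' and return its 32-bit words."""
    tokens = line.split()
    result_id = None
    if len(tokens) >= 2 and tokens[1] == "=":
        result_id, tokens = tokens[0], tokens[2:]
    opname, operands = tokens[0], tokens[1:]
    words = []
    if result_id is not None:
        # Assumption: the result id is the first operand (true for type declarations).
        words.append(ids.setdefault(result_id, len(ids) + 1))
    for tok in operands:
        if tok.startswith("%"):
            words.append(ids.setdefault(tok, len(ids) + 1))
        elif tok in ENUMS:
            words.append(ENUMS[tok])
        else:
            words.append(int(tok))
    # First word packs the total word count (high 16 bits) and the opcode (low 16 bits).
    return [((len(words) + 1) << 16) | OPCODES[opname]] + words


def assemble(text):
    """Assemble a small textual module into little-endian SPIR-V binary."""
    ids = {}
    body = []
    for line in text.splitlines():
        line = line.split(";")[0].strip()  # ';' starts a comment
        if line:
            body.extend(assemble_line(line, ids))
    # Header: magic number, version 1.0, generator 0, id bound, reserved schema word.
    header = [0x07230203, 0x00010000, 0, len(ids) + 1, 0]
    return struct.pack("<%dI" % (len(header) + len(body)), *(header + body))
```

For example, `assemble("OpCapability Shader\nOpMemoryModel Logical GLSL450\n%void = OpTypeVoid")` yields a 48-byte blob: a five-word header followed by seven instruction words.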
CW: also on the shading language side: Ryan has been implementing most of WebGPU execution environment in SPIR-V validator.
DN: Taking the existing validation and doing the additional parts implied by the WebGPU execution environment. We’re keeping up with what’s written in the environment spec.
CW: you also have WebGPU <-> Vulkan flavored SPIR-V passes?
DN: not done spec’ing what we want in a transform, but have tools that go both ways. Ryan’s building that flow. Read SPIR-V in Vulkan, validate, then emit what’s acceptable for WebGPU. Would inject initializers for variables in internal address spaces. Other direction is a no-op - or at least that’s the intent. Also fuzzing all of these tools. Chromium, as part of its regular security process, fuzzes all parts of the codebase. Fuzzing these SPIR-V tools for over a year now. Parsing → validation → SPIRV-Cross
DN: invocation of SPIR-V cross is behind the validation, so we don’t track down silly bugs we won’t expose.
JG: Question about the assembler. Do you have an example of what the input looks like?
CW: section 21.1 of the Vulkan spec.
DN: or Shader Playground.
CW: Tim Jones is at Unity - Godbolt for shader compilers.
KN: Or SPIR-V Spec examples section
JG: is there a concept of standardized SPIR-V Assembly?
DN: the de facto one is the one SPIRV-Tools uses. https://github.com/KhronosGroup/SPIRV-Tools/blob/master/docs/syntax.md
DN: ARM / Intel have been developing transcompilation between LLVM and SPIRV for about 4 years. They’re about to ditch their homegrown text format and use the SPIRV-Tools one.
JG: as a testament to how easy it is to work with SPIR-V, I made a two-panel web app to take in textual SPIR-V and emit bytecode. Mostly because of JSON schema. (https://jdashg.github.io/proto/spirv/spirv-asm.html)
DN: in Shader Playground you can choose GLSL, HLSL, etc. Will be passed through an optimizer. Redundant loads/stores are eliminated, etc.

Status Update - Intel

YH: slides
CW: TFJS improvement is because they were previously doing 1x1x1 dispatches. Turns out that drivers don’t handle this well and you should use larger local sizes, like 16x16x1 (depends on the hardware).
MM: one thing this WG has to do is pick dispatch sizes that work well.
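To make the dispatch-size point concrete, here is a small sketch of how the number of workgroups changes with the local size. The helper name and the 1024x1024 problem size are assumptions for illustration; which local size actually performs best depends on the hardware, as CW notes.

```python
def workgroup_counts(problem_size, local_size):
    """Workgroups needed per dimension to cover problem_size (ceiling division)."""
    return tuple((n + l - 1) // l for n, l in zip(problem_size, local_size))


def total_workgroups(problem_size, local_size):
    """Total number of workgroups dispatched across all three dimensions."""
    x, y, z = workgroup_counts(problem_size, local_size)
    return x * y * z


# A 1024x1024 grid with 1x1x1 workgroups needs over a million tiny workgroups,
# while 16x16x1 workgroups cover the same grid with 64x64x1 of them.
tiny = total_workgroups((1024, 1024, 1), (1, 1, 1))
larger = total_workgroups((1024, 1024, 1), (16, 16, 1))
```

When the problem size is not a multiple of the local size, the ceiling division dispatches extra invocations, so the shader has to bounds-check its global id.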
DM: this was WebGL 2.0 Compute you’re comparing with?
YH: no, pixel shaders. For WebGPU we use compute.
MM: in your work with TF you’re focusing on Chrome and Dawn?
YH: yes, mostly. on Mac.
MM: is there a Github repo we can look at?
CW: yes, upstream at TensorFlow Github repo.
DJ: some of your colleagues presented at Web Machine Learning group at W3C last week. Intel also have a prototype impl of WebML API. Third comparison point. TF.js in JS, in WebGL, in WebGPU, and WebML. Was interesting. Not really relevant, but these devices now have ML hardware on them. Perf improvement from CPU -> GPU is significant. One step further to ML hardware, another 10x improvement for some ops. James and Kai coming up with way to write custom ops.
YH: yes. Actually Ningxin from Intel is working on this. For ML ops, we have CPU, GPU and accelerator backends. Accelerators are better than the GPU, and GPU backend is better than CPU side. Maybe we can expose custom ops via compute shaders.
JD: thinking about proposing WebGPU extension to WebML group that would let us expose GPU capabilities for machine learning. May be possible to expose other kinds of accelerators.
MM: we have opinions about this, should probably talk about this when it’s more fully formed.
JD: yes, can talk about this later.
CW: Ningxin also has a prototype where you can do interop between WebML and WebGPU. Input / output WebGPU buffers to graphs. Cool that they got this working.
DN: is the subgroup support request uniform or non-uniform?
YH: not clear.
DN: OpenCL only had uniform subgroup support. All invocations were going to compute collective operation. Vulkan 1.1 has nonuniform subgroup support. That’s something I’m interested in knowing. Also, fp16 support: is the request for vectors of fp16? Or also scalars?
YH: I can bring these back to the team and find out.
YH: Bryan and Brandon working on resource management, resource residency, render pass, etc.
CW: not to speak for them: they’ve been working on quality of Dawn’s D3D12 backend. Residency management - this is an issue on Windows, and maybe on Metal 2 too?
MM: depending on whether you’re using argument buffers or not.
CW: not at this time.
MM: or heaps.
CW: biggest work they’ve done is looking at problems around memory - residency, sub-allocation. Hard problems that touch many parts of Dawn. Getting there.
MM: in D3D, resources are by default represented as resident. If you don’t do anything it should work? Are you looking at low-memory?
CW: 3 resources. Committed (create now along with GPU allocation). These are pageable, but by default resident. For perf, all D3D guidelines say to allocate in heaps, these start non-resident by default. Discussion around getting close to memory budget.
RC: it is true that committed resources are resident by default. Video memory manager will page things out for you, but in D3D11 there was a lot more information about what you’re using and when. With bindless and other features, the driver has no clue about what’s most recently used. That’s now your job as app author.
MM: my question was mainly, are you primarily interested in windows devices that have small amounts of video memory?
RC: yes, that’s true.

Status Update - Microsoft

RC: mainly in Dawn. Lazy resource clearing. Hooking up Dawn to rest of Chromium so that these demos can be in browser, not just C++ apps. Working with Bryan / Brandon on resource / residency management.

Status Update - Mozilla

DM: mostly focused on native impl since last F2F. Big things:
- Subresource tracking. Can gen mipmaps, work with layers of shadow maps
- Robust multithreading
- Re-wrote complete swap chain for native. Can now ship on iOS. Can go through app store.
- Important fixes in D3D12 backend.
- Adapters exposed from all APIs. Similar to Dawn, you’re given adapters from D3D12, Vulkan, etc. from the system.
- Exploratory work on SPIR-V manipulation. Analyzing, sanitizing, transforming it in Rust toolchains. So far looks fairly promising. For integration into Gecko, looking at bindings. Trying to get something working by end of the year.
DM: No demos today, but happy to show things on iPhone and laptop.
CW: Also, we are working on a shared C header that can be used to target both wgpu-native and Dawn and WASM WebGPU at the same time. We’re mostly agreeing on most things, except lifetime management. Looking promising.

Shading Language Discussions

CW: David has a presentation to clarify some topics.
FP: we have some points to make but not a presentation.
DN: slides
- KR: What about HLSL?
- DN: HLSL is a language; this is just about how client APIs, OpenCL, OpenGL, etc. extend and restrict SPIR-V for their use cases.
- MM: What does EXT stand for?
- JK: Some abbreviation -- external or extension?
- FP: Does that mean discussion happens outside of Khronos?
- DN: Yes.
- DM: If our group makes an extension, it would be an EXT extension that doesn’t need approval?
- DN: That’s correct.
- JK: Or it could be a “vendor” extension with an acronym for our group
- DM: So Vendor and EXT don’t have much difference?
- DN: That’s correct
- KR: I have some experience from the WebGL working group. There’s no necessity that we follow a track of vendor->EXT->KHR. It’s fine for us to operate completely independently.
- JK: Correct, you never have to follow through going down that chain. It would just be wise to publicly reserve the block range of numbers. You don’t have to say what they’re used for; you just have to say WebGPU or W3C owns the numbers.
  You really don’t even have to do it, but I would advise this for interoperability with other clients.
- CW: When an extension goes into the main version of the spec, it doesn’t mean an implementation has to support it.
- DN: Right, you still need to declare a “capability” that guards the functionality, and it is not required.
- DM: Do you have to be a Khronos member to do pull requests to the various tooling places?
- DN: No, it’s all publicly on GitHub and anyone can do it.
- FP: Under what license?
- DN: Apache 2 or other similar MIT-like or BSD-like licenses.
- JK: The update to the grammar file automatically makes the spec and assembler match each other. A lot of infrastructure automatically happens correctly. It is a very smooth process.
- KR: Does Dan’s assembler use this existing infrastructure?
- DN: It uses a slice of the grammar. It looks at it and filters it out based on what’s permitted in the WebGPU environment spec.
- MM: The SPIR-V spec says it’s version 1.5. What is the relationship between that number and the capabilities?
- DN: SPIR-V 1.0 used to be a doc, 1.1 used to be separate, etc. We found it useful to put them all together because the deltas were small. And clarifications applied to all of them.
- JK: MM is asking about numbering and capabilities. With new releases, we’ve pulled in new capabilities in the core spec. They’re not required by anything; all optional. The semantics are core but the functionality is optional.
- FP: How would you envision a scenario where something is pulled into Khronos as part of the spec, but the WebGPU group has a very different form of the same functionality because of Web safety considerations.
- JK: The groups would have reserved different number ranges. They could have similar semantics, but they would coexist separately.
- CW: The SPIR-V spec has in core a list of builtins for tessellation (Vulkan and OpenGL style). It’s gated by a capability that the WebGPU execution environment wouldn’t require support for. It could even forbid uses that want that capability. If the WebGPU group wants to do Metal-style tessellation, that would be a separate set of OpCodes. If WebGPU 2 wants the capability, then the environment spec could require it.
- KR: Does that answer the question?
- FP: Yes, yes it does.
- KR: Thought you were asking about the description of some OpCodes changing in the future.
- FP: I wasn’t specifically asking about that, but how about now you answer that?
- JK: Same number, same semantics. If you want different semantics, you have different numbers.
- FP: It is okay to have many versions of the same thing for diverging environments?
- JK: Yes.
- DN: ex.) AMD had 16-bit floating point support. It was an extension almost immediately. Over time, it was discussed within Khronos and a separate extension was defined. AMD is allowed to forever do their thing.
- FP: How often does the version rev?
- DN: 1.0 to 1.5 is from fall 2015 until now. About once a year.
- CW: The new version just brings extensions into the core spec, but still optional. I find it unlikely WebGPU will require any capability the day it comes out. I think everything we have in the execution environment was in SPIR-V 1.0 and hasn’t changed since.
- JK: To be clear about bringing an extension into core: The capability stays the same. What changes: Before it was in core, you had to say OpExtension. In core, you either have to say OpExtension, or be using SPIR-V 1.5 (or whatever version it became core).
- MM: So if WebGPU wanted to use SPIR-V, we would name a particular version, and any extensions that Khronos came up with wouldn’t automatically add to WebGPU, and anything WebGPU came up with would not automatically go to Khronos.
- JK: Yes, it’s Core * (whitelist capabilities) + (our extensions)
- MM: So because information is flowing in neither direction, you’ve now forked.
- JK: No, there’s no fork.
- JG: I would not say that’s inevitable. It depends on how it diverges.
- MM: I can speak for Apple that we would expect to do significant engineering work here.
- JK: And those can be extensions.
- DM: Your concern is that maintaining the fork would be significant engineering?
- MM: No, I just want to be clear that this is effectively forking the language.
- JK: I think there’s a subtlety worth pointing out. If you’re public with the extension, you add tokens to the grammar file. And that happens without rev-ing the language. If not incorporated, it just says “see this extension doc” in the spec. Forking is an overstatement. With the extension it keeps dovetailing and they keep merging together in the grammar file.
- MM: In terms of functionality, they are still divergent.
- CW: Let’s say WebGPU adds special subgroup operations. You could see it as a different language, but really it’s just extended functionality -- like more standard library you can use in the shader.
- KR: The idea that it would be implicitly forking implies you’re never looking at updated versions of the SPIR-V spec again. That’s not the case. Perhaps down the line, we subsume WebGPU extensions with what became core in SPIR-V.
- MM: Right, we absolutely could add new functionality that came from Khronos, assuming the lawyers are ok. When Objective-C adds a feature, Swift can choose to or not choose to use it. But no one would argue they’re the same language.
- CW: But there’s no Swift program that would be a valid Objective-C program.
- FP: WebGPU could add extending functionality that is Web-specific, that is perhaps a duplicate of something Khronos added by chance, with separate tokens. It really is effectively a fork. It’s as-if Objective-C and Swift would use different syntax for the same thing.
- JG: WebGL took the de facto shading languages for the APIs it was based on. And they really haven’t diverged at all. There have been extensions, but it’s fundamentally the same language. I want to make the point that the difference is a disagreement in predicting how large the overlap would be. My prediction is the overlap would be very large.
- FP: It’s not just a worry, but a desire to go in that direction if it’s the right thing to do for the Web. You could imagine the situation that those also in Vulkan would want the intersection to be large, and those outside wouldn’t care.
- CW: Do you have an example of what you would like to do?
- DN: In the last 6 months, what has changed in the nature of the thing you’d like done. ex.) getters / setters ?
- FP: New hardware that alters our thinking about uniformity.
- MM: I think one very clear answer to CW’s question is the ability to keep data on chip. Metal’s image blocks is a feature that would clearly benefit this group.
- JG: I would be more compelled if many companies with lots of hardware and investment were not convergently developing something. If there’s novel work you want to do, SPIR-V is already a perfect and successful platform for it. Maybe you want to do development that doesn’t go upstream, but that’s fine. I think there’s a difference on the characterization of the difference. We don’t think it’s a “fork”.
- FP: It might be useful for you guys to understand our mindset. To separate the shading language disagreements as three categories.
- CW: That sounds like a second thing. Let’s finish this presentation first.
- DN: example x86 has been relatively stable for 41 years and lots of innovation on top. In practice, the evolution is quite slow relative to the evolution at the higher-level language space.
- SB: Why is that the desired outcome?
- JK: The model supports a “fork” and the model supports not “forking” for unnecessary reasons.
- FP: Okay, I understand what you’re going for and it’s a sensible proposal.
- RC: If we go down this route, a lot of browser vendors will be using SPIR-V tools. What guarantees about backwards compatibility?
- DN: Tools are embedded in all previous versions and the tests.
- RC: What has been the nature of the things that have changed? Is it more just tokens, or like if statements and functions and parameters and things.
- DN: The gap from 1.2 to 1.3 was to support Vulkan 1.1 where we added non-uniform group instructions. That was the most significant thing. OpenCL added a decoration for Constructor/Destructor. The ChangeLog at the end of the SPIR-V spec will show you the delta between versions.
- DM: The SPIR-V group will drive development in one direction, and we will drive perhaps in a different direction. And we can rebase on this. But IIUC, at no time will our work be upstreamed?
- DN: Yes.
- DM: So we are a fork?
- JG: Forks have connotations though, and I think this is different.
- JK: Fork is more like OpIAdd changes definition and they drift in different directions.
- MM: When I used the word fork I was using the wrong term. Because the communication between the two sides is potentially non-existent, in 100 years, they’re going to be different.
- DJ: Can extensions change another OpCode?
- JK: No, they’re written as-if they could be rolled into core.
- DN: Let’s say my son wants to learn programming but he’s a fool. And he writes an extension that changes OpIAdd to do multiplication instead. He’s perfectly free to do that, but we would advise them to make a different OpCode for it.
- FP: What’s interesting about the IAdd example: even if this committee literally forked SPIR-V, it seems inconceivable that we would then make a change like you suggested. Even if there were a fork, and then certainly if any browser shipped such a fork, then thereafter, we wouldn’t let others do something that changes IAdd. All the things where the forks would diverge are more subtle issues and would be unknown unknowns that we don’t know at this point. It wouldn’t be something quite so blatant as changing the definition of Add.
- DN: Right. We would hope that any real necessary change would be reasonable and be captured in tests.
- DM: Are there any examples of “long” forks of extensions?
- DN: Yes, AMD has had one since Day 3 which no one else interacts with.
- CW: Also OpenCL and Vulkan have their own “flavors” of SPIR-V. The addressing model is very different and they live happily together. We would have a flavor and not a fork of SPIR-V
- KR: Realistically, I think this group would likely roll with versions of SPIR-V forward. There would be some sort of communication
- CW: There is even the Khronos-W3C liaison who would bring feedback back to Khronos.
- MM: When we consider a feature, the argument that “it exists in other SPIR-V” is not strong. We consider it based on its merits. We’re not judging extensions based on the same criteria as they are
- CW: You mean based on inclusion or judging based on doing something different?
- MM: Both. If someone comes and would like to keep data on chip, then it is up to us to create a design for how best to do that. Maybe it will work with SPIR-V facilities, maybe it won’t.
- JG: This is fair, but this is something we’ve done in WebGL which works. I assure you it will work.
- DN: What you described is a change in programming model. Such subtleties are programming model changes, and that’s captured in the environment specification.
- MM: When a developer writes a shader, they’re going to have to decide whether it runs on the Web or on Vulkan.
- DN: Yes.
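JK's summary above, “Core * (whitelist capabilities) + (our extensions)”, can be read as an allow-list check on what a module declares. The sketch below is one hedged way to picture it; the contents of both sets and the extension name are hypothetical, since the actual lists would come from the WebGPU execution environment spec. Only the capability names themselves (Shader, Matrix, Tessellation) are taken from SPIR-V.

```python
# Hypothetical allow-lists, for illustration only.
ALLOWED_CAPABILITIES = {"Shader", "Matrix"}
ALLOWED_EXTENSIONS = {"WEBGPU_example_extension"}  # made-up name


def check_module(declared_capabilities, declared_extensions):
    """Return a list of reasons the module falls outside the allowed subset."""
    errors = []
    for cap in sorted(set(declared_capabilities) - ALLOWED_CAPABILITIES):
        errors.append("capability not permitted: " + cap)
    for ext in sorted(set(declared_extensions) - ALLOWED_EXTENSIONS):
        errors.append("extension not permitted: " + ext)
    return errors


# A module that asks for Vulkan-style tessellation is rejected even though
# the capability is defined in core SPIR-V, matching CW's example above.
problems = check_module(["Shader", "Tessellation"], [])
```

Under this reading, a later version of the environment could widen either set without changing what any existing opcode means.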
(continued)
FP: It might be useful for you guys to understand our mindset. To separate the shading language disagreements as three categories… If we can reach consensus on even a subset then that’s great.
- 1. Where the spec lives
- 2. Whether it’s one spec
- 3. Whether that spec is text-based or a binary format.
- On (1). We would strongly prefer for the spec to live within the WebGPU group. One way that could happen is if we acknowledged we’re doing some fork and landed it within WebGPU. That would be way better for us because we’re phasing out participation in Khronos. If we want to discuss core language semantics, it is better for us if that conversation happens in WebGPU.
- On (2). We would prefer one spec. When I first started writing compilers, I was told a compiler’s job is hard so that others’ job is easy. A single spec means we create a higher quality artifact for others to use. They will have an easier time understanding if the spec is a single document rather than a delta document.
- On (3). I think this is bigger but a separate disagreement than the first two.
FP: The strawman proposal is we snapshot a particular version and then forks it.
DN: I’ll talk about (2) first. What are your thoughts there?
FP: (2) implies (1). Having one spec implies forking. If we had one spec for WebGPU describing SPIR-V, its syntax and semantics, for the Web, it wouldn’t mention multiple addressing modes if not appropriate, it wouldn’t mention OpUndef, etc. We would have one spec that’s easy for future generations to come at.
DN: Is this about the presentation or something else?
FP: It is about presentation, but considering how specs evolve, it’s not just about that. If you had a script to combine things, I’m not sure it would satisfy the single-spec concern.
CW: So the concern is that someone reading the spec would see OpUndef and then later see that OpUndef is not allowed?
FP: So other participants or browser vendors would have to participate and interact with the document source and not the published PDF form. It’s a higher quality outcome for the Web if you have a single spec for others to contribute to.
DN: Are you proposing WSL should include the Unicode spec? IEEE754? Who draws the line?
FP: You look at how other specs work. WebAssembly is not a delta spec on an abstract stack machine, ECMAScript, etc… They’re complete in themselves
JG: We do have some wildly successful delta specs on the Web.
KR: Would it be sufficient to have a generator to make everything one spec?
KR: IIUC, one issue is that the SPIR-V spec defines all the OpCodes. Would it be technically possible to strip them out in a separate document?
JK: Yes, that is entirely possible. The Vulkan spec is already published that way.
FP: That’s a step, but not what I’m proposing.
KR: How is that a problem?
FP: How do participants think about extending when they have to look at a unified document, but contribute to separate documents?
KR: CSS rules, for example, are specified separately. I don’t see how this is a concern. If the group can work completely independently, then that addresses the concern.
MM: We’re not talking about extensions here, we’re mainly interested in the execution environment because it’s reductive rather than additive. CSS is additive. The execution environment is reductive.
JK: Really the environment is the whole API spec. That would be needed in WSL as well to explain.. The environment spec is mostly a whitelist. You list what is included, not what is excluded. Once you do that, it’s trivial to make a spec generator to only include that.
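[Illustration, not from the meeting: a minimal sketch of the whitelist-driven generator JK describes, assuming the full instruction listing and the environment whitelist are both available as plain data. `FULL_SPEC`, `WEB_ALLOWED` and `generate_spec` are hypothetical names, and the one-line descriptions are placeholders rather than quotations from any spec.]

```python
# Hypothetical data: a tiny stand-in for a full instruction listing.
FULL_SPEC = {
    "OpLoad": "Load through a pointer.",
    "OpStore": "Store through a pointer.",
    "OpUndef": "Produce an object whose value is undefined.",
    "OpIAdd": "Integer addition of two operands.",
}

# Hypothetical environment whitelist: it lists what is included,
# not what is excluded.
WEB_ALLOWED = ["OpLoad", "OpStore", "OpIAdd"]


def generate_spec(full_spec, allowed):
    """Return the lines of a document that mentions only whitelisted ops."""
    missing = [op for op in allowed if op not in full_spec]
    if missing:
        raise KeyError(f"whitelist names unknown ops: {missing}")
    return [f"{op}: {full_spec[op]}" for op in sorted(allowed)]


web_spec = generate_spec(FULL_SPEC, WEB_ALLOWED)
print("\n".join(web_spec))
```

[In this sketch a reader of the generated text never sees OpUndef at all, which is the reader-side property under discussion. It does not address the author-side concern FP raises about contributing to separate sources.]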
MM: It’s reductive in the sense that SPIR-V capabilities are larger than SPIR-V WebGPU capabilities.
JK: Right that’s why we would only display and turn on some capabilities. There should be the API spec and the language specification.
FP: The ideal web spec makes it crystal clear that a memory load only behaves in a way that obeys web security. Right now, we have “Oh there’s a load” and separately, “Here’s how the load is safe”.
FP: Javascript doesn’t say we’re using IEEE but we have to exclude these things because they would be unsafe. It doesn’t modify the add operation to do checks to make things safe.
JG: If Javascript did have that, I would prefer that it reference IEEE and say that it’s like that except <thing>.
FP: I disagree. If that were so, I would think languages using floating point would have included it instead of referencing IEEE.
FP: In the one spec cloned into WebGPU world, people don’t have to look into the SPIR-V document.
JG: Except for all of the graphics professionals and their institutional knowledge of the existing spec.
CW: In a vacuum, if you’re a browser implementer looking at the WebGPU spec, a single spec is ideal. However, in the real world, because there’s already a ton of knowledge around an existing spec, it’s much better to have a diffed spec so that if you happen to stumble upon new knowledge you don’t have to do the diff yourself.
FP: Your framing of this is interesting. It makes it clear that the pro of SPIR-V is institutional knowledge, and the pro of a single spec is that it’s easy to understand. We care very deeply about the one-spec thing, so if this is something you could be moved on, it would greatly reduce our resistance to your SPIR-V proposal.
DN: So who determines that it is one thing?
FP: It operates as a single document.
DN: So authorship? or reader? You’ve been talking about: Can I understand it as an implementer? a target?
FP: That’s part of it.
DN: Now you’re asserting as authorship.
FP: Anytime someone new comes to TC39 as an implementer, they often bring additions or concerns or things they’d like to change. Right now they get a single ECMAScript spec. One of the most difficult things about making changes in TC39 is that the spec is huge and modifications require many edits. However, it’s one document in one style and very tightly cross-referenced. It’s separate from the where-the-spec-lives issue - if you moved where the spec lived without solving the one-spec issue, it would still check off one of our boxes.
DN: I still don’t understand about “where the spec lives”. Url? control? commit access?
FP: All of those things.
DN: I’d rather have 2 specs than 100. Metal, for example, references C++, IEEE754, [unicode?]
FP: We aren’t proposing C++ spec deltas. We’re saying brand new language with tremendous C++ overlap. You’re right that the Metal spec in its form would not satisfy, and we’re not proposing it.
DN: Just putting it out there as something that appears to be successful.
MM: But not as a web spec.
DN: Right. It’s clear to me you have a higher aspiration for this web spec.
FP: Right, I’d like it to be at the level of quality that I’m proud about the WebAssembly group achieving and that TC39 was able to achieve for ECMAScript.
DN: How many specs are you proposing?
FP: Core language + semantics and syntax should be one spec. OK with std library callouts being a separate spec but would rather it not be. Would be super ideal if there’s one [...]
DN: OK, that’s one. Unicode, IEEE754?
MM: Our goal is not to decrease # documents, our goal is to increase comprehensibility.
JG: And I think we have different ideas on comprehensibility.
FP: Think it’s great we are talking about it in these precise terms.
JK: Khronos allows you to use the spec for what you want. I think given the policies the mechanisms exist. We can merge specs, make cross-linking easy, split them if we want. Let’s assume that the mechanisms exist and talk about the policies.
DM: Do the mechanisms exist to have description of the Load OpCode specified as safe, etc.?
DN: Yes. When we first started that discussion, we said that it was the responsibility of the environment specification, but it could be done inline.
KR: Inlining of the definition of OpImageTexelFetch, you can have an inline thing that specifies what is without linking to something else?
JK: Right and that’s how the Vulkan spec works today.
KR: Sounds great to me.
JK: So let’s take mechanism off the table.
FP: To be clear though, we are also asking for the whole thing to be under WebGPU. Ifdefs don’t fully satisfy what I’m looking for because someone going into the source has to deal with the ifdefs.
JG: We can trivially produce any artifacts. I don’t think it’s too hard.
FP: You can disagree, but as someone who has hacked on LaTeX with conditionals and C++ with ifdefs, I’d rather they not be there. To be able to reason about safety properties, would be ideal to start with something that doesn’t have unnecessary stuff.
KR: I don’t think that’s the characterization of the SPIR-V specification. You’re assuming the core OpCodes and definition of behaviors is garbage that the Web doesn’t want. The implication here is that none of that is useful. I would argue that that is patently not the case. You’ll want most of what’s defined there to talk about how shaders work on hardware. I think the deltas are small and can be inlined into the spec.
JK: It would also be the only ifdefs in the area. No confused pile of ifdefs. Will be the first and probably only.
MM: The execution environment is 868 lines
JG: That’s super readable.
CW: Includes a list of the valid ops.
KR: With the right tooling, that wouldn’t be there - spec would only list relevant ops.
JK: Today, to develop the Vulkan spec, you list the extensions you want included, and it puts together a novel Vulkan spec that no one has printed before.
FP: Could I suggest you guys do the thing you're suggesting and then we can look at it? I’m not promising I think it’s great, but I promise I’ll look at it. Let’s see what it looks like. Right now the execution environment reads like a separate thing to me. If you think you can make it like it doesn’t, have at it.
KR: If that were present, do you think that would address the key concern, or is there an underlying unstated concern? Policy, or contributing to a repo that isn’t owned by this group, etc.?
FP: To be clear, there is the first point about where the spec lives, which is a separate point from one-spec. Not trying to trick you - trying to separate. This point is about both how it impacts authors and how it impacts readers. So it depends on the tooling how apparent it is to an author. Sounds like you think it would be easy for authors and produce a good result, would like to see what it looks like.
DN: It’s just a Makefile.

[lunch break]

CW: We left off discussing point (2). A side point: Nicolas Capens implemented SPIR-V in SwiftShader; he said it was very easy to do.
DN: Eric Berdahl can provide feedback.
CW: But we can talk about point (1).
(Discussion about whether we should schedule a talk with Eric and Nicolas.)

Where the spec lives:

FP: Independent from others but closely related to (2) - if spec is in one place, less likely to need to be split into pieces.
FP: We think the web is enough of a paradigm shift from native that this group needs to really control, not just have partial control. In your proposed world, with the core spec living in Khronos, there is significant pressure to continue adopting work from Khronos. We don’t think that’s necessarily the best for the Web. It’s unclear what the coordination story is. This could be taking some version of SPIR-V, and having a complete copy under this group’s ownership -- with the understanding that it is a fork. It doesn’t have to diverge wildly, but it gives the group the flexibility we will want to have as we develop WebGPU.
JG: Certainly have no problem taking a literal snapshot and including it in WebGPU’s repo. Seems fine.
CW: If we do a fork of the spec that lives in gpuweb/, wouldn’t there be the incentive to follow the evolution of the core SPIR-V spec because of the large ecosystem around it? There are many applications that will target native first and web second that we want to bring online.
FP: Have no objection in that world to someone (regardless of whether they participate in Khronos) suggesting in the gpuweb group that we adopt a change from Khronos.
JK: What if you do want to change something like Add->Multiply. A fork could just do that or you could do it using an extension. By doing it the extension way, you get to leverage the rich ecosystem around SPIR-V. There’s a large benefit of “forking” via the extension route.
FP: In this world, when changes are brought to this group to web SPIR-V, it would be valid for people to raise objections to anything that would break existing SPIR-V tools.
JK: This allows you to keep calling it SPIR-V, even if you add and remove stuff. If you change the semantics you cannot.
FP: To genuinely meet our requirements, we do not want to be bound by whether it can be called SPIR-V.
JK: You want to be able to continue using the ecosystem if in practice it continues to be SPIR-V
MM: The farther the languages diverge, the less true it is we can leverage existing things in SPIR-V.
FP: This by itself is fairly straightforward. Do you think it’s a non-starter? A possible thing?
DN: Reserving judgement on it.
FP: We want to participate in this group a lot, but will not participate in Khronos. You can see why it would make our life much simpler.
DN: I hear that.
DJ: Our goal wouldn’t be to diverge. I wanted to note one thing: I’m not sure that you can actually take a copy. You can definitely do what JK said, but I’m not sure about taking a copy.
DN: I am not a lawyer.
RC: I think if we do take a copy in WebGPU and we don’t keep changes in sync, then it’s a slippery slope. We have to take all the tools, etc. and we need to tell WebGPU developers to use our tools instead.
FP: Is it the case that the tools have an understanding of multiple SPIR-V versions?
DN: Yes, same code base.
FP: I don’t think we would rule out just having those tools understand this other SPIR-V version. In this initial cut, the “Web SPIR-V” would be SPIR-V 1.5. It would be a non-goal to diverge, but only a goal to converge to keep the tools useful. So decisions on changes would be made on a case by case basis by this group.
RC: The devil is in the details. If we do make a copy and we decide we don’t like certain OpCodes, and we tell people to use Khronos tools which emit things we don’t like, we need to fix that.
JK: There have already been official ways to deal with that and if we use them it’s fine.
FP: Even if we don’t clone the spec, we still have the same problem with restricting disallowed OpCodes, etc.
CW: David and his team have already done the work needed to investigate where we need to tighten the spec for WebGPU. The changes needed are minimal, or clarifications, or tightening. In the long term, it’s likely there wouldn’t be divergence from SPIR-V except clarification of certain OpCodes.
DJ: Another optimistic view is that the group decides that something needs to be clarified, makes the change, and it gets adopted back into SPIR-V
JK: Absolutely.
DJ: As for the tools, I’m not worried. If/when WebGPU is popular, the tools will support it.
FP: To me, if the spec is cloned into this group, the value of the spec being a single document by default suddenly becomes higher. I want to be clear that’s what our view is. If you have the spec cloned here, there’s little reason to have the environment specification to be separate from the document which describes what the OpCodes do. You could end up with a spec that has the compatibility you want, and have our goals of a unified spec.
DN: I can see it makes processing mechanics simpler.
KR: I’d like to make 100% sure that, say from a procedural standpoint we do a fork of the SPIR-V repo into the gpuweb org, that Apple will not object to discussions about upstreaming changes into the parent repo. There are questions about IP and detailed designs that I don’t understand. If we procedurally made a fork, would we be OK trying to do that?
MM: I’m also not a lawyer. So I also can’t say. But we philosophically would not have a problem with that.
DJ: I don’t think it’s a problem. We’re not going to object to moving something in WebGPU into SPIR-V.
FP: We would want to welcome any positive contributions to Web SPIR-V regardless of where they came from. If it’s coming from a group that has some bake time on the proposal, we have some better understanding from the start. There would be no blanket “We don’t want anything from SPIR-V”
RC: Tying the issues together: I don’t want to clone the repo, and then have every single commit need to be ported back and forth. The more we can easily keep them in sync, the better it will be.
FP: I think that the one-spec vs separate is a separate discussion. It sounds like where that discussion goes will be better informed by the makefile thing that John and David were talking about, producing a unified document etc. I think the biggest issue with taking any change is evaluating how it affects security and portability on the web. Native wants to expose everything the GPU can do, Web wants to make sure implementations never differ.
DN: Who represents the detailed hardware knowledge?
FP: I think it’s important for hardware vendors to participate.
CW: We can’t expect that to happen.
DN: Would like to hear offline how you think that’s going to occur.
DJ: Isn’t it the same issue with the API in general?
DN: Yes. But it’s much more fine-grained and closer to the machine.
FP: I understand it can be hard to get them to participate. But: it would be great if they had a better understanding of the Web’s requirements, for their own sake. Even if we do none of the things I’m asking for, we have to figure out what checks to do where - huge value in having the hardware vendor input on it.
KR: Do you object to taking rollups from this community group and going to Khronos forums and discussing it with the GPU vendors?
CW: In the past, the WebGL working group has pushed hardware vendors to implement robust buffer access. In the same way, we had this discussion with WebGPU and Vulkan WG on Tuesday. We asked them about zeroing compute shader memory. We could even more so bring this to hardware vendors if we decide to go that route.
KR: There will be a best-faith effort to get more participation from chip vendors. But would like to get a better idea of ...
FP: One of the reasons why it’s valuable for those discussions to be here, is that Apple is a hardware vendor too. If they could all have those discussions in the same forum, it would be valuable.
JK: You’re getting almost every vendor in the design of SPIR-V. It came through detailed discussions of what the hardware needs.
FP: So I guess you guys have thinking to do about if you’re actually happy about the fork idea. Is that an accurate assessment?
JK: Yes.
CW: For sure we need to think about it. It has a lot of implications.
FP: One implication is that it checks off one and maybe two of our three boxes.
KR: If thought needs to be given on the SPIR-V side, then we need to consider if Apple will in good faith not diverge.
FP: Don’t think there’s value in adopting SPIR-V and then refactoring the entire ecosystem. Think the ecosystem is the primary value of SPIR-V.
KR: Meant the spec tooling.
FP: [?]
KR: Previous conversations have not been crystal clear in terms of motivation. As long as we’re comfortable working together on the SPIR-V specification toolchain and the AsciiDoc source, etc., producing a single top-level doc, this seems fine.
DN: To me this is an implementation detail of the process. No problem at all about the mechanics of generating a doc. We should plan for it to change.
DJ: I don’t think that we would want to make changes to how the documentation is produced. That’s not the goal. The point is to have a single spec. Maybe you do that with the existing toolchain. Maybe you make changes to it that are useful for SPIR-V as well. This is an implementation detail of the process. It’s not an important point to me.
DN: Let me address the fork concept. This must be taken to the Khronos board, of which Apple is a part. We cannot make this decision. The arguments need to be communicated up your chain and resolved with Apple in the Khronos board.
DJ: I’ll talk to our board member about it. I doubt he knows anything yet.
FP: I feel we’ve covered my first two points. All we have left now is text.
JG: We beat the time box!
DM: From the reader perspective, there can be one document. You said that from the editor's perspective, one doc is also nice. However, we agree that we don’t want to break tools. Wouldn’t having it as a diff be a good forcing function to ensure we don’t break?
FP: Don’t think so. For me to have more opinions on this I want to see what the one-spec looks like. Then I’ll be able to express more opinions.
DN: The output or the input?
FP: Both.
JK: The input is a project like any source code. Think I understand what you want.
KR: If the AsciiDoc source is marked up in a way so that the spec is fully inlined, it would actually be trivial to produce both the unified and the diff spec from it.
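[Illustration, not from the meeting: a toy conditional-inclusion pass showing how one marked-up source could yield both a unified document and a delta. It only mimics the `ifdef::NAME[]` / `endif::[]` directive syntax from AsciiDoc and is not the actual Khronos spec build. The `render` helper, the `webgpu` and `opencl` attribute names, and the sample sentences are all hypothetical.]

```python
def render(source, defined):
    """Toy conditional-inclusion pass over AsciiDoc-style markup.

    Lines between `ifdef::NAME[]` and `endif::[]` are kept only when NAME
    is in `defined`. Nesting is supported; nothing else is interpreted.
    """
    out = []
    stack = []  # one bool per open ifdef: is that block enabled?
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("ifdef::") and stripped.endswith("[]"):
            name = stripped[len("ifdef::"):-2]
            stack.append(name in defined)
        elif stripped == "endif::[]":
            if not stack:
                raise ValueError("endif without matching ifdef")
            stack.pop()
        elif all(stack):
            out.append(line)
    if stack:
        raise ValueError("unterminated ifdef")
    return "\n".join(out)


# Placeholder text, not quoted from any specification.
SOURCE = """\
OpLoad: load through a pointer.
ifdef::webgpu[]
On the Web, an out-of-bounds load must stay inside the bound resource.
endif::[]
ifdef::opencl[]
Physical addressing rules apply.
endif::[]
"""

unified_web = render(SOURCE, {"webgpu"})
base_only = render(SOURCE, set())
# The "diff" view is whatever the WebGPU build adds on top of the base text.
base_lines = base_only.splitlines()
delta = [l for l in unified_web.splitlines() if l not in base_lines]
print(unified_web)
print(delta)
```

[Whether source marked up this way is pleasant to author is the open question FP raises at L410 and L412; the sketch only shows that both outputs can come from one input.]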
FP: Let’s see what it looks like and bikeshed from there.

[break]

Talk with Eric and Nicolas?

CW: Nicolas is here. We can timebox this. Nicolas does SwiftShader, which is a software GPU. Please discuss.
NC: We implemented Vulkan mostly from scratch in roughly 9 months’ time. I didn’t know SPIR-V before then. We have a fully conformant CPU implementation which supports all the mandatory graphics features plus some. It is fully passing conformance, and it has been a pleasure implementing it, versus many years battling issues with GLSL. Debugging SPIR-V has been very easy, for example. In general -- helicopter view -- we have a ton of products riding on the success of Vulkan. SPIR-V is a large piece. It takes a lot of lessons from LLVM.
CW: If you have questions for Nicolas. He’s the person here who has implemented a driver and who can talk about the experience of doing so. Other IHVs probably can’t.
FP: Did you use existing SPIR-V parsing code or did you roll your own?
NC: Our own, but we used SPIRV-Tools for transformations. Parsing is trivial.
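[Illustration, not from the meeting and not SwiftShader code: a minimal sketch of why splitting a SPIR-V binary into instructions takes little code. It relies on the physical layout in the SPIR-V specification: a five-word header starting with the magic number 0x07230203, then instructions whose first word packs the word count in the high 16 bits and the opcode in the low 16 bits. The sketch assumes a little-endian module and does no validation beyond framing; `parse_spirv` is a hypothetical name.]

```python
import struct

SPIRV_MAGIC = 0x07230203


def parse_spirv(blob):
    """Split a little-endian SPIR-V binary into (opcode, operands) tuples."""
    if len(blob) % 4 != 0 or len(blob) < 20:
        raise ValueError("not a whole number of words, or header missing")
    words = struct.unpack(f"<{len(blob) // 4}I", blob)
    if words[0] != SPIRV_MAGIC:
        raise ValueError("bad magic number")
    instructions = []
    i = 5  # skip header: magic, version, generator, bound, schema
    while i < len(words):
        word_count = words[i] >> 16
        opcode = words[i] & 0xFFFF
        if word_count == 0 or i + word_count > len(words):
            raise ValueError("malformed instruction")
        instructions.append((opcode, list(words[i + 1:i + word_count])))
        i += word_count
    return instructions


# Hand-assembled two-instruction module, for illustration only.
module = struct.pack(
    "<10I",
    SPIRV_MAGIC, 0x00010500, 0, 1, 0,  # header, version 1.5
    (2 << 16) | 17, 1,                 # OpCapability Shader
    (3 << 16) | 14, 0, 1,              # OpMemoryModel Logical GLSL450
)
print(parse_spirv(module))
```

[As CW and JK note just below, framing instructions is separate from validating them; a real consumer would still need a validator.]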
FP: Do you know roughly the lines of code to transform SPIR-V into …
FP: First, what do you transform into?
NC: We transform it into “Reactor” which is an abstraction level on top of LLVM which abstracts every operation as a C++ construct via operator overloading.
CW: Though no validation.
JK: But a validator exists.
NC: Including license headers etc, 7k lines of code. Also takes care of parallelization. We emit SIMD code. We operate mostly on vectors even if it looks like a scalar.
NC: This includes things like robust buffer access which would probably be easier on a GPU. We do various checks in SwiftShader. Our Reactor code looks like an emulator, but it’s actually a JIT compiler.
FP: Our existing WSL to MSL compiler which does include all validation is 25k loc. So I could imagine that if we didn’t have to validate, it might be smaller.
FP: I think the interesting thing I often hear is the comparison between SPIR-V and GLSL. What we’re trying to achieve with WSL is a text based language that has the sort of specification clarity and smallness that an IR like SPIR-V has so that a production-quality 25kSLOC compiler can exist.
NC: That seems hard to guarantee
FP: By excluding lots of things that GLSL inherits implicitly.
NC: Won’t those developers ask for those in the end?
FP: Do they ask for them in SPIR-V?
NC: SPIR-V is complete for running any complete desktop application.
FP: WSL is also complete. It just doesn’t include legacy stuff from the C spec that you wouldn’t need in a shader.
FP: Your comparison is GLSL to SPIR-V. WSL is this third thing. Simple example. In C, it’s valid to have a side-effect-free expression like a+b be its own statement. But it completely screws up the complexity of parsing. So languages like C# say “nope, can’t do that”.
NC: Not an issue at all in SPIR-V.
FP: WSL achieves simplicity by not adopting things from C.
CW: Let’s get into this later and timebox this topic.
NC: ?
FP: SPIR-V and WSL are isomorphic.
NC: App developers have their language and will not rewrite anything.
FP: In the same way they can transpile to SPIR-V, they can transpile to anything else
NC: So you're suggesting using a text-based language as an IR?
FP: Don’t think it’s fair to call it high level. It doesn’t include things that could be achieved in a higher level language compiling down to WSL.
NC: Still don’t see the benefit given the strong tooling and ecosystem around SPIR-V. We have massive desktop apps… etc.
CW: Eric has time obligations so let’s move on.
EB: Apologies for coming in and dropping out. [Intro] Hands on with the hardware acceleration work in multiple Adobe products. Want to help you help us make great products.
DJ: What Web apps using WebGL does Adobe ship?
EB: To the best of my knowledge, we don’t use WebGL yet. It is a barrier for us shipping on the web. We do have several apps we do ship, but for various reasons of what we can do and technical barriers, we end up needing to do this cloud side.
DJ: But you want to use WebGPU?
EB: Yes I would love to do client-side GPU work in a web app.
DJ: Would the barriers that stop you from using WebGL be removed in WebGPU?
EB: I don’t know enough about WebGPU to answer that definitively. I would say that one of the worst things a GPU API provides is forcing me to rewrite shaders. I have a large investment in this, and it is very expensive for us to do this. It is considered a barrier to adopting a new API unless we can reuse our shader code.
DJ: And when Metal was announced, Adobe demoed on stage.
EB: And the cost of that is one we will never pay again. The reasons we did that are not ones we could replicate in another relationship. (Apologize for opaqueness, but can’t be specific in public forum.)
EB: Adobe Premiere Rush was shipped on Android. It was the last major platform on which we shipped Rush for this very reason. We did not have a GPU technology we could address with our shader code [on Android]. Once we had a solution to that so we could recover our shader source, it became the enabler for that product to get out there and be successful.
DJ: So you write shaders in HLSL or something like your own language, and you have a bunch of tools to convert them to different platforms?
EB: Primarily our shaders come out of the classic languages. CUDA C, OpenCL C, etc. HLSL and GLSL are specific examples of languages that do not have the expressiveness to compile our shader source.
EB: The enabler for us for Vulkan on Android was that it chose a language-neutral format.
DJ: So what you're saying is that because you were able to transpile into something Vulkan accepted, you were able to run on Vulkan. It wasn’t necessarily that it was an IR.
EB: Yes. The fact that Vulkan chose an IR allowed us to recover our shader source by having a tool to compile to it.
DJ: So the tool is the enabler, not the IR.
EB: Yes, I have preferences for various IRs. But yeah, if there’s a tool that allows me to use my shaders, ...
FP: I’d like to understand the part about GLSL and HLSL. My understanding is that SPIR-V logical mode can be compiled to GLSL/HLSL.
DN: In most cases, yes.
FP: It would seem if the web was saying logical mode, then it has the same semantic power as using GLSL or HLSL.
EB: There is subtlety. I am not a SPIR-V expert so would ask them [-> DN].
DN: To the point of your [Fil’s] argument, yes. There are other factors we can get to. Eric, say I asked you to inject another tool in your flow. Secondly, the level of abstraction of the target language (compiling to), how does it affect if it’s a low or high level thing? What kind of confidence do you have or predicted engineering costs?
EB: The first question, injecting another tool: it depends on the tool and how it’s injected. For example on Android, we had one tool. Take our source and compile it to the binary of the platform. There was cost to integrating a tool but the cost is manageable. We have in the past tried to create a tool chain. Engineers will see A->B and B->C and say OK we can get A->C. This is true. But there are costs: Making toolchain reliable - see it as a single tool. Transpiling from low to high to low doesn’t work if toolchain is not very strong. Finally, debuggability. The further we get from the source to the platform, the harder it is to use the platform tools.
DJ: If you are transpiling into SPIR-V then you have to be able to map back. It seems to me a benefit to have the language be the one you’re processing. But anyway, unless you rewrite tools in JavaScript, you’re going to have a bad time debugging WASM.
EB: I’ve long had debates about this, but you don’t debug in Intel microcode. I have a basic understanding of how C++ translates into assembly, so I can make that leap. I understand the argument, but I don’t buy it completely.
DJ: With good tooling, you don’t have to decompile -- you have a good debugger. It doesn’t matter if it’s assembly or English.
EB: I don’t think that is true. The tool getting there is an important part. Saying I can chain together 5 links is sufficient to get from A to B - but there are hidden costs which may be enough that I will not ship my app on that toolchain.
DJ: If preferable, you’d prefer a one-step tool.
EB: That’s right.
DN: What’s the value of a low level representation when you're debugging and experimenting and do have a bug?
EB: I have spoken about this publicly also. One of the weird things about compiling for GPU code is that nobody really knows what the ISA is. NVIDIA, ARM, Intel, etc. Even when I’m compiling to an intermediate representation - SPIR-V, AIR, DXIL - I still don’t know what’s happening on chip. The best I can do is know what the IR does and know what it does in the machine based on the IR or the source. Knowing the IR allows us to play with the IR. When debugging on Android, we can modify the SPIR-V at the bottom level - try respelling something - and identify a bug in the driver. In a way, not different from any other compiler in the driver. But since we are looking at something small [SPIR-V] we can reason about what the bug is. In contrast we have a long history working with drivers that consume a string, and identifying the problem and the possibilities is much more difficult. I don’t have visibility into how their compiler is working or the AST or how it ultimately is translated into the ISA. Being closer to the ISA makes it easier to debug, understand finicky behavior, and work around driver bugs in a matter of hours instead of weeks.
MM: When you mentioned drivers that accept C strings, which ones are you referring to?
EB: Over the years I’ve had this experience with OpenGL, OpenCL - CUDA not so much, because NVPTX - DirectX before DXIL.
CW: Eric needs to go soon. Any questions?
FP: Valuable to have input from Eric and Nicolas.
FP: Since quite a while ago, Apple has been proposing having WebGPU ingest a text-based format. I think it’s best to understand it in terms of the goals it tries to achieve, and then the actual specifics.
FP: One goal is text: live editing in the browser which is immediately useable. Don’t need an extra tool. A human could generate that. You can “View Source”. None of these things preclude it from being a compile target.
FP: We also consider the goals of being easy to digest and validate. Consider these common with SPIR-V. We want to tick the same boxes.
FP: We want the language to be tightly spec’ed so there’s no ambiguity of the semantics.
FP: And it should be low-level. A text based language can be low-level, for an absurd example the text-based assembly version of SPIR-V.
FP: You can get these goals from a C-based language.
FP: SPIR-V has to represent the block structure, but you could have done that just as well with a block-structured language.
JK: I disagree a little bit with the last point. In SPIR-V to parse a floating point number, you load the binary bits. In a language like WSL or GLSL, to properly parse is around 1k lines of code. It’s almost impossible to make the round trip between decimal text form and the binary value.
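[Illustration, not from the meeting: a minimal sketch of the contrast JK draws between loading a float's binary bits and going through decimal text. It uses only Python's `struct` module and `%g` formatting; the helper names and the chosen bit pattern are for illustration only. It routes the text through Python's double-precision parser, so it demonstrates the care needed rather than settling how hard a from-scratch decimal parser is.]

```python
import struct


def bits_to_float(bits):
    """Binary form: reinterpret 32 bits as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def float_to_bits(value):
    """Round a Python float to single precision and return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def text_round_trip(bits, digits):
    """Print with `digits` significant digits, parse back, return the bits."""
    text = "%.*g" % (digits, bits_to_float(bits))
    return float_to_bits(float(text))


original = 0x3F800001  # the single-precision value just above 1.0
short = text_round_trip(original, 6)  # too few digits: last bit is lost
full = text_round_trip(original, 9)   # nine digits: bit pattern survives
print(hex(original), hex(short), hex(full))
```

[Here six significant digits collapse the value to 1.0, while nine digits recover the original bits. A text format therefore has to specify how many digits are written and how parsing rounds, which the binary form sidesteps.]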
JK: Also questions about the scoping. Maybe looked in the wrong place, but the spec on GitHub seemed to not talk about scoping in detail. C++ is subtle - a nested scope can do ‘int y, x = x’.
SB: It is defined in our spec.
FP: These are interesting points, but I think there’s already a very good existence proof of a low-level text-based form that’s isomorphic to a binary one: asm.js. Text based languages generally have solved both the scoping problem and the unambiguous representation problem.
EB: I actually very much like where you started about goals. I think that’s an important thing. One of the goals you stated I have mixed reactions to (anti goal for me). That is View Source. As a developer, it’s a great thing. When I publish a product, I need to have something where the user CANNOT view source -- to be as difficult as possible. For example asm.js, at some level it is just Javascript, but a user can’t really understand it.
From what I understand, WSL is a programming form or high-level language that someone can understand. If my code is being converted to that representation, it’s not changing sufficiently for me to be comfortable shipping that.
FP: Here’s an example. Say you compiled using your existing tools to SPIR-V, then compiled to WSL. You will get no more information than if you shipped SPIR-V.
EB: Then what is the value of WSL?
FP: You’re not the only [class of] customer.
EB: I understand that. But why should SPIR-V not be allowed to be consumed? Am I missing an assumption that there can be only one?
FP: I don’t think we’ve had a conversation in detail about accepting multiple forms in the browser. That’s an interesting proposal and could be considered. The observation I would make is that “View Source” is not for pretty code, but some subset of developers would find it very helpful.
EB: I don’t disagree, but there are certain parts that I would really like others to not read or find out. As long as we can get it down to a level like x86, I don’t have an issue.
FP: So if there were something equivalent to compile to SPIR-V and then compile to WSL, would that be okay?
EB: Strong maybe. There is now a second tool in my toolchain. One that I’ve qualified and know. One new one and I don’t know how they interact. If the link is strong - tools are reliable - then it would be OK.
FP: That makes perfect sense to me. It is an implicit goal of this language to allow that sort of uglification. And I should have listed it as a goal of being isomorphic to SPIR-V. I hear it would work for your use case exactly if that tool were really solid.
SB: I think there’s a false sense of security with x86. What are your feelings on JavaScript minification?
EB: There are some minifiers that do reasonably well at obfuscating the algorithm. We are shipping web apps right now so there are sufficient tools. Beyond that I can’t make comments outside NDA.
FP: I think there are pitfalls that a source language can find itself in that makes uglification very difficult. If it’s too complex beyond what’s absolutely necessary, if you start implementing a minifier, you will surely hit corner-case bugs in the compiler. The goal with WSL is to have a language that really gets to the core of what you really need to support everything a SPIR-V or GLSL or HLSL would do and very little else. So that anyone who wants to download a browser and use shaders, they can do it from the start, and if you want to get sophisticated, you can transpile from one of the other languages.
EB: The thing I am reacting to is: I try very hard to keep my words clean and distinguish language and representation. I try to say IR over IL.
CW: Need to finish up before Fil has to leave.
EB: Let me jump to the end. “Language” makes me nervous. Languages evolve and take on new meaning. I don’t want to get myself into a hole when we want to reasonably extend the language for new use-cases. The target language must change very conservatively.
FP: Thank you. WSL is trying to be IR-like.

[break]

DJ: Chairing now. Fil, you’re leaving in 30 min.
FP: There’s 2 ways we can proceed. We can talk about where WSL landed and how. Or we can have a Q&A or debate type thing where you bring up concerns and I try to address them. Pref for latter.
CW: The details of the syntax would be debated eventually, but it’s not the concern.
FP: We would love it if the discussion turned into a list of things for the CG to change about WSL. I think our position is best expressed by our goals, and not by the current existing syntax.
CW: First remark. WSL is, by goal, 1:1 with SPIR-V. Given that, what’s the distinction from SPIR-V assembly?
FP: Great question. The biggest difference is immediate human writability. You can imagine a small amount of changes to SPIR-V to do this. Like having blocks and loops, etc.
CW: How about this. We have an 8.5kb library that implements text-based SPIR-V to binary. In SPIR-V text assembly, identifiers can be names and not numbers. Your concern is that people can come to the browser, write something, and have something that does something useful.
FP: Can I make the goal more concrete? Not even in the console - favorite text editor - we’re not requiring any tooling to build out a first app with WebGPU
CW: How about having a tiny assembler which does slightly more than what SPIR-V assembly does now? With one script tag away.
FP: I think there’s value in natively accepting that form instead of pulling something down. There’s going to be different versions of it, etc.
RM: Value in being able to interact with it via browser tools. Hard to do via tools.
DJ: Would a small library of SPIR-V to WSL be sufficient?
CW: Our experience with this is that it’s hard to make this well-specified with text based languages.
FP: I hear this thought a lot from you. Two thoughts. First, have the opposite experience. Worked on Java source compiler. Worked on Java bytecode tools. Worked on JS, WASM. WSL used to have generics. We didn’t find we could implement them with the goals of having it work as an IL, so removed.
FP: Would love if your specific concerns could be rephrased in a way that makes them testable. We are claiming we are addressing those specific concerns head-on. If there’s something specific about text-based languages then we would like them enumerated so we can debate them concretely.
CW: We did produce a list of these problems a while ago. Our past experience is that parsing the text is difficult and it works in WebGL today because we have a monoculture with ANGLE shader translator. Second, hardware has slight differences that make it impossible to spec something that’s 100% consistent across them. In WSL, time and time again we see things that are just not how it works. We have to bring them to the table for WSL to fix them. At this point, we may as well look at SPIR-V and find every single small thing and port it to WSL. But why do that? If we really want a text language, we should just add a builtin way to compile to SPIR-V binary. Going through SPIR-V binary guarantees we go through those constraints that come from hardware experience.
FP: I think a lot of your past concerns with WSL may have been addressed as we aggressively stripped down the language and improved the spec. Maybe that’s an exercise worth repeating. You keep saying you’ve had bad experiences with these high level languages. I agree with you. The more high level it is, the harder it is to reason about whether it matches the semantics of what you're compiling to.
DN: Your claim is that it’s low level. My claim is that it’s still much higher level than SPIR-V. That difference to me introduces a lot of complexity.
FP: This can be logically reasoned through. It doesn’t have to be a matter of opinion. One way you can prove SPIR-V is meaningfully lower-level is that a conversion from SPIR-V to WSL could completely avoid some area of the WSL language. Can you give an example?
JK: ++. We would not emit ++.
FP: We can go through these.
JG: One of the concrete objections about why we talk about WSL as a higher level language than SPIR-V is the necessity of parsing it. That’s amongst the ways that makes it higher level. The parsing code for SPIR-V is 30 lines.
FP: I highly doubt that.
JG: I did that.
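For context on the size claim, here is a minimal sketch of what splitting a SPIR-V binary into instructions can look like. It is not JG's parser, which the minutes do not show; it assumes the module has already been loaded as host-endian 32-bit words and it performs no validation beyond basic bounds checks. The names `walkSpirv` and `demoModule` are hypothetical.

```javascript
// Minimal sketch: walk the instruction stream of a SPIR-V binary module.
// Assumption: `words` is a Uint32Array already in host endianness.
const SPIRV_MAGIC = 0x07230203;
const HEADER_WORDS = 5; // magic, version, generator, id bound, reserved schema

function walkSpirv(words) {
  if (words.length < HEADER_WORDS || words[0] !== SPIRV_MAGIC) {
    throw new Error("not a SPIR-V module");
  }
  const instructions = [];
  let i = HEADER_WORDS;
  while (i < words.length) {
    // First word of each instruction: high 16 bits = word count, low 16 bits = opcode.
    const wordCount = words[i] >>> 16;
    const opcode = words[i] & 0xffff;
    if (wordCount === 0 || i + wordCount > words.length) {
      throw new Error("malformed instruction at word " + i);
    }
    const operands = Array.from(words.subarray(i + 1, i + wordCount));
    instructions.push({ opcode, operands });
    i += wordCount;
  }
  return instructions;
}

// Hypothetical two-instruction module used only to exercise the walker.
const demoModule = new Uint32Array([
  SPIRV_MAGIC, 0x00010000, 0, 1, 0,
  (2 << 16) | 17, 1,    // OpCapability (opcode 17) Shader (1)
  (3 << 16) | 14, 0, 1, // OpMemoryModel (opcode 14) Logical (0) GLSL450 (1)
]);

console.log(walkSpirv(demoModule));
```

This only separates instructions from each other. Checking that the instructions form a valid module is a separate and much larger job.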
FP: Let’s look at the high level picture of …
...
JK: Parsing a floating point number. In SPIR-V, it’s a single line. In WSL doing it correctly is 1k lines of code.
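To illustrate the binary half of JK's comparison, a small sketch: in a binary module a 32-bit float constant is its IEEE-754 bit pattern, so reading it is a reinterpretation of one word and the round trip is exact. The helper names are hypothetical, and the text half (a correct decimal-to-binary parser) is deliberately not attempted here.

```javascript
// Reinterpret a 32-bit word as an IEEE-754 single-precision float.
function floatFromWord(word) {
  const view = new DataView(new ArrayBuffer(4));
  view.setUint32(0, word);
  return view.getFloat32(0);
}

// Inverse direction: the bit pattern of a float32 value.
function wordFromFloat(value) {
  const view = new DataView(new ArrayBuffer(4));
  view.setFloat32(0, value);
  return view.getUint32(0);
}

// Round trip through the binary form is lossless for any non-NaN bit pattern.
const bits = 0x3dcccccd; // the float32 nearest to 0.1
console.log(wordFromFloat(floatFromWord(bits)) === bits);
```

A text format has to define which decimal strings are accepted and how each one rounds to a bit pattern, which is where the extra code JK mentions comes from.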
MM: Put a strawman out there to make progress. Let’s pretend we took SPIR-V and replaced block annotations with {} and stuff like that. And make the browser accept that.
FP: This is what I call compromise. When talking about WSL we are concerned with the goals. Not attached to the language itself.
KR: What is the basis for your claim that WSL is bijective with respect to SPIR-V?
MM/FP: In the process of proving it. We designed it that way but still working on it.
JK: It doesn’t look that way. For example: a += b looks like syntactic sugar and it isn’t meant to be bijective.
FP: If you think the presence of a++ and a+=b are major impediments, we would be happy to remove them.
JK: It’s filled with all sorts of stuff that makes it look like C++ and not like SPIR-V.
CW: In a way, WSL started with generics, etc. Which devs definitely want. And then WSL was cut down to be closer to the hardware. Moving toward lower-level.
FP: Accurate
CW: On the other side, SPIR-V text assembly starts close to binary. We can go “up” from there. Going up like structured control flow is a language we’ll want to evolve eventually. How about this higher-level thing that is a builtin-module that takes in the nicer-to-use language and returns binary. And the browser ingests that.
FP: How about this instead. That hits the point of-- I want the browser to ingest something a human can reasonably write.
CW: What’s the value of that?
FP: Don’t need anything else that will make things slower.
NC: Game devs are all used to using tools to compile.
FP: Not the way the rest of the web works.
NC: Not the way the old web works.
FP: Current web. The current web involves a mix of JS and compiled languages.
…
FP: Here is what would meet our goals. If the browser accepted a text form of SPIR-V. Even requires the ugly forms of vecs etc. Imagine we added only the block structure and blocks around function definitions, and that’s the thing the browser accepts.
CW: You claim that it’s great because devs can open any editor and write a shader by hand, but these developers will want to use higher level abstractions.
FP: And they will transpile.
CW: So it only has value for someone using WebGPU for the first.. 2 hours.
MM: Is the feedback that we’re receiving that the language is now too low level?
JG: No, you’re perceiving the things Corentin is saying as things we think are bad. We do not think they’re that bad.
CW: Suggesting that it will either be too high level with the impetus to evolve (we don’t want). If it’s too low-level, you get a language that’s useful for a niche audience. I don’t believe there’s a good middle language. If you want a language people will be productive using, it will need to be more high level. Instead, let’s have the browser ingest a low level representation and provide something for simple tests, and this group can focus on building an amazing ecosystem with an amazing high level shading language on top.
SB: Conflating high/low with bin/text.
CW: Sorry, you’re right.
FP: We think that WSL-- WSL in its current form is a nice statement of where a nice middle ground exists. We would not be in support of dramatic evolution of WSL. We had generics and removed them. If someone made a WSL like language with those generics, we’d like it to transpile to WSL. Understand the fear that high level languages evolve to have more high-level-ness.
JG: Didn’t you talk earlier about having large things you wanted to explore?
FP: Not making language higher level - things that are necessary to expose interesting GPU features, or if there was something elsewhere in the browser that required changes.
FP: WSL’s +=, ++ aren’t that hard to support, but if they are sticking points, kill them.
NC: Do you think it’s valuable to tell everyone we have a very restrictive language? People will be frustrated not being able to += 1.
FP: For those of you transpiling, no frustration. For people writing from scratch, probably not that frustrating.
MM: Swift doesn’t have ++.
FP: Ruby doesn’t have ++.
KR: I don’t know how we get to the point where we have any assurance where what has been invented from scratch, from a team that may not have a lot of experience writing GPU execution languages, is actually bijective to the functionality of SPIR-V, has encompassed all the hardware constraints, and really is the enhanced block-structured version of the exposed hardware capabilities.
FP: Don’t think it’s productive to think that way. We could come back and say GPU people don’t have the expertise to make a good language.
JG: WebGL has done that for a decade. It’s frustrating for you to dismiss our experience.
CW: We’re also not designing a new language.
JG: It’s frustrating for us to recommend something that we think is great and is well supported. But we get a WSL update every few weeks and see little effort that you have looked into our proposal. I can’t continually re-explain how it’s simpler. There’s nothing to talk about. It’s the same as it was last time we discussed it. It takes max a week to write a parser for it.
FP: I don’t think the WSL parser is that hard to write.
SB: Could probably write a WSL parser in a day.
CW: On each side of the room, there’s experts on different sides. And each thinks theirs is the simplest ever. I don’t think we’ll make progress there.
JG: One group has expertise with web technology for graphics.
CW: You said it’s frustrating because SPIR-V people don’t have experience with language.
FP: Just a way to express the absurdity of doing the opposite.
CW: We don’t need to. Using SPIR-V decouples the GPU stuff and the language parts. Decoupling the language development tremendously helps to make something and ship it.
DN: Thank you Fil for being very clear, expressing your values, and coming all this way.
FP: I would like to understand how much code it takes to go from SPIR-V to MSL, including everything to make it safe and do all the bounds checking. This will reveal where the complexities are.
DN: WSL does not implement the specific concerns I have about graphics semantics. It is not comparable. You have to finish WSL first.
FP: Let’s look at the comparison. If you have specific caveats, then..
DN: You need to finish WSL first.
CW: This comparison is flawed. We’re using SPIRV-Cross which does much more than SPIR-V for WebGPU. It handles translating to all forms of Metal, etc. Tries to make the output human-readable. It is hard to come up with an estimate for what you’re asking. It’s in tens of thousands, not hundreds. But it’s hard to separate the Web requirements and other requirements.
[3:34] FP: OK. I have to go.
DJ: I think it would be interesting to know: Fil has said our requests and what we’re willing to compromise on. I would be interested to hear if anyone else has something they’re willing to compromise on.
KR: I think one of the advantages of SPIR-V is the ecosystem and the toolchain. The farther you diverge, the less value you derive. So I’m not going to come up with a position statement on which parts we can compromise on. I haven’t been thinking in that way. I perceive the advantages of being able to do live shader editing. There are absolute advantages.
KR: Eric just talked about how tools like OpenCL C compiler were problematic before they targeted an IR. They were getting things wrong like register allocation. … The transforms are really hard. Things got better with IRs like PTX. Will be better if we cut the compiler toolchain at the ingestion point, with SPIR-V.
DJ: Which pinch point?
KR: That of the browser ingesting SPIR-V.
DJ: OK. So if your pinch point is that it must ingest SPIR-V, there is no compromise.
KR: You asked about what compromises we have to offer. I have not been thinking about it that way. I don’t have a statement.
DJ: My point is not to put you on the spot, I’m just trying to work out how we get to the end. There’s a few ways: generally we’re consensus-based, if one person is unhappy we have to find a solution. One solution is that Apple is simply very unhappy.
CW: I think where we can say our position evolved from - is that we are willing to compromise by putting a whatever-to-spirv compiler in the browser as long as we still have the pinch point. The compiler lives in userspace. Before we said we would never ship glslang in chromium. I think we are willing to compromise this way.
KR: Important semantic difference. Layered APIs have mandatory polyfills. And it can’t have superpowers that the core browser API cannot do.
CW: Concretely, what we’re suggesting is that we ship spv-as.wasm or WSL.wasm or glslang.wasm as a Layered API. The way this works is that you do device.createShaderModule({ code: LayeredApi.compile(string) });
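A runnable sketch of the call shape CW describes, using mocks so it works outside a browser. Everything here is an assumption for illustration: `LayeredApi` is a placeholder that ignores its input and emits only a SPIR-V header, and `device` is a stand-in object, not a real WebGPU device.

```javascript
// Hypothetical stand-in for a Layered API compiler module. A real one would
// be an assembler or compiler; this placeholder emits only a SPIR-V header.
const LayeredApi = {
  compile(source) {
    if (typeof source !== "string") throw new TypeError("source must be a string");
    return new Uint32Array([0x07230203, 0x00010000, 0, 1, 0]);
  },
};

// Mock device: the only thing it ingests is SPIR-V words (the "pinch point").
const device = {
  createShaderModule({ code }) {
    if (!(code instanceof Uint32Array) || code[0] !== 0x07230203) {
      throw new Error("createShaderModule expects SPIR-V words");
    }
    return { wordCount: code.length };
  },
};

// Text goes through the layered compiler first; the device never sees text.
const shaderModule = device.createShaderModule({
  code: LayeredApi.compile("void main() {}"),
});
console.log(shaderModule.wordCount);
```

The point of the shape is that the text-to-binary step lives in a layer that a polyfill could replace, while the device keeps a single ingestion format.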
DJ: I am happy to hear that. I’m willing to definitely consider that as a solution. I assume that effectively what we’re doing.. would like to make sure if we do that, we’re accepting.. has to be specified. Perhaps we document whatever it ingests very well. It’s not just because it’s a polyfill it can update every week.
CW: I think better than glslang.wasm, because we agree GLSL has quirks, I suggest this be SPIR-V assembler ++. WSL was going high → low, the assembler is low → high. We can add whatever features we want and know it’s bijective.
MM: A SPIR-V assembler ++ is a good place to start. But we would need to raise it up to a point where statements would not be in SSA and variables would have full names.
JK: What’s wrong with SSA?
MM: Humans don’t do that!
JK: Functional programming
MM: Zero shader languages are like that.
JK: Doesn’t mean it’s not higher quality to write in SSA.
CW: Have you used OCaml?
MM: No.
CW: You can do let a = 1. You can’t then do let a = 2. OTOH, you can do let ref a = 1 then a := 2.
DJ: Worth exploring. WSL is our proposal for it.
MM: I think it’s important to say that SPIR-V assembly++ (in a way that humans can write) would be acceptable.
DN: Though “can write” is subjective.
JG: That’s good feedback.
CW: We want to be clear we’re 1:1 with SPIR-V. It might end up they’re the same by construction.
MM: No opinion about starting high and going down or starting low and going up.
CW: Okay, so violent agreement.
RC: OK. I guess I’ll go.
RC: From Microsoft’s perspective, I think we’re supportive of doing a SPIR-V variant. I think the goals need to be efficient compilation, execution, and performance across all the abstraction layers. The same with the API. If we find SPIR-V constructs that perform poorly, those need to be changed via the execution environment spec, etc. That’s the high order bit for us. If we find something that disadvantages Metal or D3D.
MM: We have the same concern. It is one of the major reasons we are concerned about this group having control of the direction.
RC: If we make too many changes they will diverge and it will be bad for developers. Too few and it will not work.
CW: Anecdotally, for Metal, last week UE4 posted changes to SPIRV-Cross to make it work for them. UE4 has a major focus on performance and if this was not working for them they would not have adopted it.
MM: We would have to verify. It’s a step in the right direction.
DM: Genuinely surprised about SPIR-V assembly++. I think web developers today would just like GLSL today, and not this other thing.
KN: One perspective: If we do want a Layered API, despite being optional and polyfillable, it does have the same level of reliability of any other web platform API. It has to be something we rely on and don’t change in a backward-breaking way and never regresses content. By using something simpler, we avoid the problem of making glslang changes that introduce things we didn’t intend that break later.
KR: A possible compromise: I would be supportive of adding a primitive to the shader ingestion API that would make it easier for devtools to interop. If this could be tied into e.g. SPIR-V debug info we could solve that problem.
AE: I think it would be fairly easy to have
device.createShaderModule({ source, compileFn, })
with a compile callback which devtools calls. This would make it fairly easy for DevTools or WebInspector to provide live shader editing.
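One way to picture AE's suggestion, as a sketch under stated assumptions: the module keeps the author's `source` and `compileFn`, and a tool can re-run the page's own compiler on edited source. `mockDevice`, `toyCompile`, and `devtoolsEdit` are hypothetical names invented for this example; the minutes do not define how devtools would invoke the callback.

```javascript
// Toy "compiler" standing in for whatever the page ships: it just turns the
// source text into its character codes.
function toyCompile(source) {
  return Uint8Array.from(source, (ch) => ch.charCodeAt(0));
}

// Mock device (not real WebGPU). It compiles once up front and keeps the
// callback so an inspector could recompile edited source later.
const mockDevice = {
  createShaderModule({ source, compileFn }) {
    let code = compileFn(source);
    return {
      get code() { return code; },
      // Hypothetical hook a devtools front end might call after a live edit.
      devtoolsEdit(newSource) {
        code = compileFn(newSource);
      },
    };
  },
};

const editable = mockDevice.createShaderModule({ source: "ab", compileFn: toyCompile });
editable.devtoolsEdit("abc");
console.log(editable.code.length);
```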
DN: John made a list.
JK: But if you take the list and address it, there will still be some left. Better to go up because you know every addition you have made and know every one is sound.
DJ: OK. Break.

JK:
It seems the implementation could not really be what is specified:
- scientific notation is missing: 1E-4 is not a valid literal
- the minus "-" is included in the literal, and it says tokens are maximal matches, meaning "A -5" should be tokenized as "IDENTIFIER LITERAL" instead of "IDENTIFIER MINUS LITERAL", which should then become a parsing error
So, already, I think we have an implementation-defined language, not a specified language.
Underspecified things:
- errors in dead code must cause compilation failure: this is under (or too broadly) specified: Note that HLSL is allowed to ignore some required semantics when a function is dead, leaving this more ambiguous in WSL than it might seem, if people think WSL is like HLSL
- optimizations can't change validity, but they can change results: other than showing portability is only approximate, it doesn't say "how approximate" it must be
- how many digits long can a floating-point literal be? How many of those can affect precision? Note it's ~1000 lines of code to properly parse and turn a text-based floating-point literal into binary form.
- Part of getting tokenization correct in the presence of comments is in the definition of the preprocessor. Without that, this area is currently underspecified.
- It says "white space", but not how much or what it is.
- How many tokens, one or two: abc/*/de?
- what's the character set allowed for tokens? Can I go beyond that character set inside a comment?
- I didn't see scoping rules yet: at what character point does x change scope in "int x; { int y, x = x, z; }"
- ++ is partially defined. Question: Is the following allowed, and what is the specification for it: a[a[b]++ + ++a[b]]++?
Confusing:
- Is it really intended that "int a, b" makes 'b' the same type as 'a'?
- #INF or -1.#INF are in HLSL, are they dropping them in WSL? Note this is the type of thing that might accidentally be in a parser, even if not in the language, especially if a parser has HLSL ancestry.
- variable names and type names are discussed as if being in two different name spaces; if so that's confusing and leads to other problems, if not, the spec seems written incorrectly regarding it
Missing?
- many things say they work on 'float', but seem they should work on both scalars and vectors
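The maximal-match item in JK's list can be reproduced with a toy tokenizer. This is a sketch under assumptions: the three rules below are invented to mirror JK's reading (a literal that includes its leading minus, no exponent form), and they are not taken from the WSL spec.

```javascript
// Toy token rules (assumed, not WSL's real grammar): the literal rule
// includes an optional leading minus and has no scientific notation.
const TOKEN_RULES = [
  ["IDENTIFIER", /^[A-Za-z_][A-Za-z0-9_]*/],
  ["LITERAL", /^-?[0-9]+/],
  ["MINUS", /^-/],
];

// Maximal munch: at each position, pick the rule with the longest match.
function tokenize(text) {
  const kinds = [];
  let rest = text;
  while (rest.length > 0) {
    const ws = /^\s+/.exec(rest);
    if (ws) {
      rest = rest.slice(ws[0].length);
      continue;
    }
    let best = null;
    for (const [kind, pattern] of TOKEN_RULES) {
      const m = pattern.exec(rest);
      if (m && (best === null || m[0].length > best.length)) {
        best = { kind, length: m[0].length };
      }
    }
    if (best === null) throw new Error("unexpected character: " + rest[0]);
    kinds.push(best.kind);
    rest = rest.slice(best.length);
  }
  return kinds;
}

console.log(tokenize("A -5"));  // the subtraction reading is lost
console.log(tokenize("A - 5")); // spaces change the token stream
console.log(tokenize("1E-4"));  // not a single literal under these rules
```

Under these rules "A -5" yields IDENTIFIER LITERAL, so a parser expecting a binary operator between the two would reject it, which is the mismatch JK describes between the written rules and what an implementation presumably does.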

[break]

Summary:

JG: I think there may be a novel direction to pursue. It might be viable to have a Layered API with a thin but useful translation from a more human-writable format to SPIR-V that the browser ingests.
DJ: That would definitely help and be valuable.
JD: The Layered API would be a compiler which makes an ArrayBuffer.
MM: Can we make it opaque? Being able to skip the SPIR-V step would be valuable for our platform.
- MM: I am suggesting an actual “Both” option.
- ...
- MM: It would be more performant for us to compile both directly to the platform code.
- CW: The fat red line for glslang wasm is for first compile. Later compiles are about 2-3ms.
- MM: First compile is important.
- JK: And fixable.
- MM: OK. What is the value of letting the author take the SPIR-V out of the ArrayBuffer?
- CW: The compromise we are willing to do is have the pinch point which is the ingestion of SPIR-V.
- KR: The point here is to use the well-defined, well-supported intermediate format.
- JG: As long as it behaves like that…
- DJ: The browser would accept SPIR-V. If it takes anything else in via layered API, then it has to be 100% compatible.
- KR: These layered APIs are specced to not add superpowers.
- RC: The opaque object would have a .asSPIRV()?
- MM: No, you could only pass it in to createShaderModule.
- CW: What you are arguing is that the code arg of ShaderModuleDescriptor can be ArrayBuffer or [opaque object returned by layered api].
- DJ: It doesn’t matter the syntax. It can still be LayeredAPI.compile. The point is it can be opaquely done in the browser engine. It might be an ArrayBuffer but there’s some magic to detect what the ArrayBuffer is.
- DM: If it’s opaque how is it different from just accepting both?
- DJ: Syntactically you may still have to do LayeredAPI.compile
- CW: Browsers that don’t want to have a separate thing can actually implement it as an actual WASM module with a CDN polyfill. Safari could translate directly to MSL.
- DJ: To be clear, in this world, the spec still accepts SPIR-V.
- CW: The way it addresses Chrome’s concern of not implementing a second language, is that it forces people using the other language to do LayeredAPI.compile. And it’s an implementation detail
- KN: You can’t have a polyfill that returns an opaque object. When you use the polyfill, it still does have superpowers because it’s observably different. Maybe this is too pedantic.
- DJ: I think we should investigate. If you can retrieve the source code from a shader module, we might have to like generate the SPIR-V…. not what we want to do though.
- DJ: Myles has expressed the concern. In our investigation we can try to figure out how this might go.
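DJ's remark about "some magic to detect what the ArrayBuffer is" is not spelled out in the discussion. One thing an implementation could check, shown here purely as an assumption, is the SPIR-V magic number in the first word. `sniffShaderCode` is a hypothetical helper, not a proposed API.

```javascript
// Guess what an ArrayBuffer holds by looking at its first 32-bit word.
// SPIR-V modules start with the magic number 0x07230203 in either byte order.
function sniffShaderCode(buffer) {
  if (buffer.byteLength < 4) return "unknown";
  const view = new DataView(buffer);
  if (view.getUint32(0, true) === 0x07230203) return "spirv-little-endian";
  if (view.getUint32(0, false) === 0x07230203) return "spirv-big-endian";
  return "other";
}

// Build a 4-byte buffer holding the magic number in little-endian order.
const magicOnly = new ArrayBuffer(4);
new DataView(magicOnly).setUint32(0, 0x07230203, true);
console.log(sniffShaderCode(magicOnly));
```

A check like this distinguishes SPIR-V from other payloads, but it does not address KN's objection that an opaque return value is observably different from what a polyfill can produce.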
DJ: Layered API was one area of investigation. The other was about “forking”. David/Corentin/others to investigate; “we” to bring it up with Khronos board.
DJ: Ken’s mention of investigating dev tools APIs. Extend to have compiler source or backward source maps.
KR: We all have limited engineering resources here. I think we have an opportunity to be the first to do something like an up-call in support for dev-tools debugging.
SB: … don’t understand why it has to be WASM …
KR: OK. A browser upcall into user code. ...
DJ: When I say the group should investigate, I mean someone should investigate and present to group. [Need a volunteer]
DJ: Microsoft’s agreement with the caveat that anything that disadvantages a native platform needs to be reinvestigated.
DJ: Let’s say we are done with shading languages for this week.
KR: I would like to address the WSL development team as a whole. Apologize for assuming the team has no GPU language experience. Appreciate the work they have done in the past years.
DJ: Done?
DM: Can we talk about specialization?
CW: Yes, I think so. Then good time to work on the agenda for tomorrow.

Specialization

DM: My main concern with SPIR-V directly is for developers to specialize their shaders. With a text-based format people can concatenate shaders. For SPIR-V, Vulkan supports specialization, but we don’t have it today in WebGPU. That’s one possible way. Another possible way is to have a higher-level language that’s preprocessed and we generate multiple SPIR-V modules that we send over. I think that’s a question that needs to be solved before we make a decision.
RC: Does WSL have a preprocessor?
MM: No preprocessor. We’ve been talking to some developers who say they rely on the preprocessor. So we are thinking we need to introduce one to achieve those use cases while as simple as possible.
JK: The preprocessor is huge.
MM: We’ll make a new one that is not.
CW: An alternative. We were talking with Sébastien Vandenberghe, the Babylon.js developer. He was telling us how they don’t use the GLSL preprocessor; they do it themselves. If you want something good enough, you can do string-pasting and replacing. They implement the preprocessor themselves.
MM: I think there’s a path forward if we keep it super simple. We really only need things like #ifdef and #else, I don’t know what the solution is, but I can say this is a use case people are looking for. And I think it’s possible to make a preprocessor that doesn’t suck.
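To make the "super simple" option concrete, here is a sketch of a line-based preprocessor that handles only #ifdef, #else and #endif, in the spirit of the string-pasting approach CW attributes to Babylon.js. It is an illustration written for these minutes, not Babylon.js code and not a proposed specification. The name `preprocess` and its `defines` parameter (a Set of names) are assumptions.

```javascript
// Keep or drop whole lines based on #ifdef / #else / #endif.
// `defines` is a Set of defined names. Nesting is supported; macros are not.
function preprocess(source, defines) {
  const out = [];
  const stack = []; // one entry per open #ifdef
  let active = true;
  for (const line of source.split("\n")) {
    const trimmed = line.trim();
    const ifdef = /^#ifdef\s+(\w+)$/.exec(trimmed);
    if (ifdef) {
      const condition = defines.has(ifdef[1]);
      stack.push({ parentActive: active, condition });
      active = active && condition;
    } else if (trimmed === "#else") {
      if (stack.length === 0) throw new Error("#else without #ifdef");
      const top = stack[stack.length - 1];
      active = top.parentActive && !top.condition;
    } else if (trimmed === "#endif") {
      if (stack.length === 0) throw new Error("#endif without #ifdef");
      active = stack.pop().parentActive;
    } else if (active) {
      out.push(line);
    }
  }
  if (stack.length !== 0) throw new Error("unterminated #ifdef");
  return out.join("\n");
}

const shaderTemplate = "a\n#ifdef X\nb\n#else\nc\n#endif\nd";
console.log(preprocess(shaderTemplate, new Set(["X"])));
```

Even this small version has to decide questions such as whether directives may be indented or followed by comments, which is the kind of rule JK notes a specification would need to pin down.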
JK: You have to specify a separate tokenization step, many rules. Comments. Comments are underspecified in WSL because it does not have a preprocessor. Things like whether it introduces a space.
MM: Comments are in the lexer.
MM: Don’t think it’s worth discussing it much at the moment. We can agree the problem exists. One of the ways is a preprocessor. If it’s not the best, we can have a better one.
CW: The preprocessor is an ugly but proven solution to the problem. Specialization constants help solve part of the same problem. Vulkan and Metal have them. But they cannot change the interface of the shader module. It’s something we have been looking at / have ideas about fixing in a nice way.
MM: We are not married to any solution.
DM: Example of where a preprocessor is not sufficient. Length of a fixed-size array. Ideally you want to change just the already-built ShaderModule and not go through the whole pipeline of recompiling everything etc. Preprocessor really has to stick through to the end.
CW: Basically, like you said, we all know it’s a problem. Preprocessor works. In the SPIR-V world it’s an iffy solution like what DM said. It prevents reusing intermediate results. We hope to have something better before WebGPU v1.
KR: I’ve talked with the Three.js and Babylon.js developers and looked at the shaders the engines produce. The ability to make decisions on how to snip the shaders together is complex and not covered by either specialization constants OR a preprocessor. (example). That’s not something we can solve with a preprocessor.
JK: I think the stuff we are looking at in SPIR-V has to be included in that conversation. We are looking at some viable solutions. One thing is an if/endif that completely skips vendor specific code in between. Allows changing interface. Another is type parameterization.
DM: Can you clarify if type specialization changes the interface points?
JK: It changes the interface within a set of rules that make it tractable.
CW: Takeaways: Specialization is important, including for interface. We need a story for WebGPU V1 via a simple preprocessor or via the work that’s being investigated currently.
KR: Another concern. Some of the directions for future investigation involve the production of an external module to translate languages, for some browsers to natively ingest SPIR-V and another format, and some browsers to not do that? Not sure how that works. The point is if we don’t collectively focus on making one ingestion format work well in all the browsers, then we’re going to have divergent implementation. The more we add and diverge, the more bugs there will be. There will be a point where one browser has a SPIR-V ingestion bug and another has a WSL ingestion bug and you can’t work around it unless you do user agent detection and have a bugfix for each. I hope we would collaborate on working on one of these working really well.
DJ: What were the proposals?
KR: The “Both” proposal.
DJ: You’re saying we shouldn’t accept Both we should only accept one?
KR: Sorry. If the ingestion were one format for all browsers with additional complexities for specialization (hypothetically), then you have the opportunity to rigorously test it. If some browsers drive toward ingesting natively more than one format or an “other” format, then we’re increasing the likelihood of divergent implementations where each one has a bug on the other ingestion format. And users can’t get around it.
MM: What is the reason you can work around bugs in SPIR-V but not WSL?
KR: Say Safari only ingests WSL, and internally advertised support for SPIR-V is actually a translation. If we natively ingest multiple languages at the API level, especially with complex areas, some things will be less than well tested, because we’re not all focused on testing the same thing. It’s a danger I foresee by not cooperating on one ingestion format.
DJ: I think Myles’s question was why a developer can’t work around one implementation’s SPIR-V code or WSL code…?
CW: I think it was that Ken was stressing that choosing two ingestion paths, there’s the risk that implementations will focus more on one than the other.
MM: If we focus on just one, the language could be SPIR-V or something else. Would you be willing to consider the pinch point isn’t SPIR-V or is SPIR-V++? What modifications would need to be done in order for that to be acceptable?
SB: There would be nothing we could say to convince you to accept SPIR-V text?
CW: Depends. SPIR-V Assembly? maybe...
MM: I think the way to define text SPIR-V would be a language -- which this group can discuss -- whether it has ++ or is an SSA, etc. A language this group would own that would not be a binary format.
RC: We’re already talking about something that still has the WebGPU SPIR-V execution environment. So it’s still different.
KR: Not substantially different. Validation that’s a superset of the current SPIR-V validation.
MM: RC is right that today it’s possible to trick a GLSL-producing compiler to produce something invalid for the web. They’re going to diverge and they will get farther and farther apart.
CW: And same point as before. Language is still SPIR-V. Just a flavor. Saam, the way we have phrased our concern before, is that a language that evolves is a no go. Becomes complex over time and causes browser incompatibilities. Suggestion was what if WSL is so low level it does not have this problem. We think that’s the language that’s only useful for 2 hours at the beginning. At the same time abandons the ecosystem. Need to keep it.
SB: First of all I don’t completely buy the argument that… ??
JG: I think Saam said: Isn’t it still a benefit to our (2 hour) language over binary spir-v? And if it’s a benefit why are we dismissing it. Accurate characterization?
SB: That is one characterization. I don’t actually buy this is the only benefit. Let’s just say for the sake of argument, why would being able to edit it in its text format not also be good? What we heard from Adobe was that it was beneficial to edit the shader very manually.
JG: There is a natural benefit there if you have an easy to write language. I don’t think it’s sufficiently compelling in itself. Flip side, a binary ingestion format, dev tools can still expose text representation.
SB: I think we all agree text SPIR-V is isomorphic to binary SPIR-V. [unclear]
JG: So the tail end of what I heard is if it’s a benefit, why wouldn’t we just use a text SPIR-V.
SB: [clarified] So why not just use text SPIR-V for the benefits of text?
JG: Part of that question is a matter of what you think the tradeoffs are for what you’re expecting to primarily ingest. Consider the opinion that the primary way that SPIR-V will be delivered is in a binary format. In that case, even though it’s isomorphic to the text format, exclusively requiring the text format would mean that all of the binary producers that we already have an ecosystem for would need to add a fixup step to convert to the text format, and then the browser would turn that again back into a binary format.
JD: For example, if you wanted to use something like Smolv for compression, you’d have to decompress, convert to a text format, all for it to go back to a binary format.
JG: I don’t think the numbers you reported are representative of truth.
SB: Can you convince us with numbers?
JG: I’m not interested in doing that now.
JD: Can we at least agree that a text format of a float is inherently more complex than the binary?
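[Editor's illustration of JD's point, not part of the discussion. A minimal Python sketch, assuming a 32-bit little-endian IEEE 754 float: the binary form is a fixed four-byte word, while a text form has to be parsed and admits many spellings of the same value. The helper names are hypothetical.]

```python
import struct


def decode_binary_f32(word: bytes) -> float:
    """Decode a 32-bit little-endian IEEE 754 float: a fixed-size reinterpretation."""
    return struct.unpack("<f", word)[0]


def decode_text_f32(token: str) -> float:
    """Parse a decimal literal, then round the result to 32-bit precision."""
    as_double = float(token)  # decimal parsing: signs, digits, exponents
    return struct.unpack("<f", struct.pack("<f", as_double))[0]


# One value has exactly one four-byte binary encoding...
binary_word = struct.pack("<f", 1.5)

# ...but many valid text spellings that a text parser must all accept.
spellings = ["1.5", "1.50", "15e-1", "0.15E1", "+1.5"]
decoded = [decode_text_f32(s) for s in spellings]
```

[The sketch says nothing about overall compile time; as CW notes further down, parsing cost is small next to pipeline creation.]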
JK: ?
CW: Assuming you’re talking about the performance numbers in the blog post. It’s great that you did the performance work to make that fast. As we said, the slowness in glslang is due to first compile.
SB: That’s not the comparison I was trying to bring up. I’m just saying: can you show me parsing SPIR-V is faster.
CW: I personally can’t, but it doesn’t matter. Because the time for parsing is dwarfed by the optimizations done when creating a render pipeline. That’s why I think the argument of performance of parsing is not important.
SB: So you should not use it as an argument against text SPIR-V.
JG: Let me clarify. One of the problems is that no one person has a monolithic voice for the SPIR-V or not SPIR-V side. Part of the problem is that advantages that some people might see, I might disagree with even though I agree on the overall concepts. What I mean is that there isn’t a unified front. It can get a little confusing when it feels like some of the requests/expectations are contradictory.
SB: Got it. Ignoring speed, other arguments?
CW: The main argument I heard from JG is that, even if there are multiple formats, what we expect is that web developers using GLSL will use glslang to produce binary SPIR-V. Same for HLSL users, etc. And we expect most people to be using binary SPIR-V.
SB: My question is if.. why can’t they use textual SPIR-V.
CW: If the pinch point is textual SPIR-V, they will have to produce it. But doesn’t add value in our opinion. Adding the extra step is unnecessary.
SB: That seems like an assumption on the tools. They don’t have to.
JG: It is what they do now.
CW: It is an assumption based on what we expect people to do and what they are doing now.
SB: So if we do have a compiler that does that, does it matter?
CW: Matters because it adds a useless step.
JG: There’s a larger existing binary SPIR-V ecosystem than a text-based one.
MM: I don’t understand the ecosystem argument. The languages will diverge and the ecosystem will either stop working or it will change to support the Web. I believe you that it would have this extra stage. In the long term may not be true. The cost is trivial compared to allowing someone to write a shader with no dependencies.
CW: We weight these differently, and being able to write a shader with no dependencies is of very small weight for us.
SB: Even if it’s zero, why is the binary format better?
JG: In an alternate universe where Vulkan ingested a text-based SPIR-V, we would not be suggesting binary-based SPIR-V.
KR: The advantage is around the reuse of the ecosystem and reuse of the tools. We agreed that repurposing the SPIR-V tools without substantial modification is actually a goal of this effort. There’s no text ecosystem. We would be artificially creating one if we said the browser would ingest text SPIR-V. That’s not where the advantage comes from in the developer tooling world. I think we should be focusing on getting some better high level language useable with DevTools, etc and have a single small entry into WebGPU.
SB: I actually think it would be very helpful. On our team, we used to have an assembler for WebAssembly and we switched to a text format for all of our tests.
CW: That’s nice and we strongly agree it’s useful for testing. But it’s not necessary to ingest the text format. … We can still use it in tests.
MM: Being human writeable and having an ecosystem are not at odds?
JK: The binary format is the one that’s tested, specified, and has lots of tools around it.
MM: GLSL as accepted in browsers also has an ecosystem and also is interoperable.
JK: Right, and SPIR-V’s ecosystem is built around the binary form. It just seems more rational to use that one.
MM: Being binary is not a destination.
JG: Can I add some color here? I actually want to go off and think about a text-based SPIR-V, how would that fit, what would it mean. We haven’t really talked about it before. I would like to think about it. Much more disposed to that than to other proposals. I think binary SPIR-V is still probably the way to go, but my feelings are much less strong. We can fully explore this relatively new but interesting but contentious point. I think we are making progress. I don’t want people to feel frustrated by not making progress, but we are.
RC: I don’t have too strong of an opinion of binary vs text. Need to acknowledge that we can’t use existing already compiled SPIR-V shaders.
JK: Right, tools would adapt and compilers rerun.
MM: I don’t think there’s a huge repository of compiled SPIR-V where the original source is no longer available.
JK: Correct. But the point is the existing ecosystem is strong around the binary form.
MM: It’s just interesting that the argument about an ecosystem is not suggesting GLSL.
CW: It is. GLSL is an important part of that ecosystem. There is an existing GLSL-to-SPIR-V compiler, glslang. But as we know GLSL has its quirks and is not something any of us want to ingest at the browser level.
MM: But we do.
CW: Because it’s a monoculture.
SB: Why is that bad?
CW: Monoculture makes the standard weaker.
JG: Have had huge benefits from having our own WebGL implementation. Really wish we had the resources to have our own shader translator.
KR: The benefits are in crystallizing the specification. You can have a spec and a CTS but you run the risk of the CTS testing quirks of the implementation and not the spec. There are many times Jeff found problems in the tests that tested Chrome and not the spec. Having a second implementation has uncovered a lot of issues in the specification itself.
DJ: I agree that multiple implementations are great. The suggestion was that effectively ANGLE is the specification for Web GLSL. There’s no way that Jeff if he made a new impl he would uncover a bug that would require ANGLE to break existing content. The fact that existing content would break would..
JG: This has been a problem in the past.
DJ: Yes it’s terrible but has provided stability and interoperability.
JG: The place we have ended up in with stability, interoperability, and particularly portability: the more portability issues we can find early on, the faster the implementations get better, even though you sometimes still have to make backward-incompatible changes.
DJ: I agree that that does help the development.. I think Myles was making the point, if we’re talking about ecosystem, why aren’t we talking about the GLSL that ANGLE produces which is effectively.. done?
JG: We have 95% of the tools to turn Web GLSL into SPIR-V. That’s one of the benefits that SPIR-V brings. It’s not that I don’t support GLSL, but that the tools will support it.
MM: Why are you waiting for people to build tools for Web GLSL when we already accept it?
CW: The tool already exists. glslang WASM works today.
JG: Have to translate bind groups, etc. Picking one way is throwing a dart.
MM: The fact that WebGPU’s binding model matches SPIR-V’s is a happy accident.
CW: It is because it matches Vulkan, which is the most constraining of the APIs, and SPIR-V was defined to work with Vulkan.
JG: We got there by convergent evolution. We looked at the APIs and looked at the form we need to support to make it work on all platforms. I don’t think it’s an accident. I think it’s convergent evolution.
MM: This whole argument doesn’t make sense because it’s demonstrable that GLSL works in Vulkan.
JG: That’s why I said 95% of GLSL to SPIR-V.
MM: But we don’t need to convert to SPIR-V!
JG: But we can’t just ingest WebGLSL into WebGPU without making some serious choices. The binding model is different.
MM: But GLSL already has a binding model that matches.
JG: GLSL 450 for Vulkan is different from what you would write in Web GLSL. It wouldn’t be compilable.
CW: One example is that in GLSL 300 es, you only have combined textures and samplers. In GLSL for Vulkan/WebGPU, you have to pass them in separately and combine them.
MM: Right, we have to change GLSL in order to make it work in WebGPU. We also have to change SPIR-V. That’s what the execution environment is.
JG: If we were to pick one or both, I think we should just do SPIR-V.
MM: I think we should pick GLSL and have SPIR-V be a tooling problem.
JK: The execution environment spec doesn’t make the same level of changes to the language as we would need to make to GLSL to match the binding model.
KN: We keep talking about Web GLSL. There’s only one implementation in ANGLE. We can’t take it out and make it work.
CW: There’s a single GLSL to Vulkan impl that is glslang. ANGLE uses this. This is the same one we distribute.
MM: Don’t understand.
KN: The ANGLE shader translator compiles GLSL for OpenGL ES into HLSL, GLSL for Desktop, and GLSL for OpenGL ES. It does not compile into GLSL for Vulkan, HLSL for D3D12, or MSL. This doesn’t help us.
SB: ?
KN: There are currently a dozen in-use GLSL implementations.
SB: Something about GLSL being reimplemented.
KN: Then it wouldn’t be WebGLSL and it wouldn’t be well specified.
SB: Why couldn’t it be well specified?
CW: It’s still a single implementation.
MM:
KR: Implementing a GLSL compiler multiple times has been done. It’s been well trod in the graphics industry. It led to AMD, Intel, NVIDIA, Apple, … writing GLSL implementations.
MM: Constraints of the web are different.
KR: Great effort was expended writing conformance tests, but the reality is that implementing the large language spec was complex enough that there were a variety of bugs in the various implementations. And the industry has spent time narrowing down, fixing, working around them. Many of us have the experience of trying this and we have seen what has happened in the Vulkan ecosystem by choosing a lower-level representation. It is provably simpler and has led to great benefits in the entire industry.
CW: To add to that, we had a discussion with the Vulkan working group on Tuesday. Here is a quote from Tobias Hector (AMD, formerly Imagination, former chair of OpenGL ES). When developing Vulkan, there was resistance to giving up their compiler frontends because they thought they could compete on it.
- <quote> … SPIR-V eliminated whole classes of bugs.
CW: Understand we think we have higher standards than the native specs. But they saw huge benefits from using SPIR-V.
DM: On ecosystem: needs clarification. The benefit of SPIR-V is in separating the execution environment from the language. The tools can operate on it and work automatically. We get that for free. Unless they need specific bits, they work independently of extension-specifics. This is something we wouldn’t get with GLSL. It’s very easy to build things on this foundation.
RC: Don’t know if I agree with that. I think the tools will need to change to take our flavor into account. They will have to not do certain OpCodes. It’s really the tools ecosystem.
JK: True. If you add an extension, but once it’s added to a list in the optimizer, you benefit immediately. I think we need a softer approach on how intrusive the execution environment is. It might disallow things, but the environment you’re targeting is part of how you use the tools. The work necessary on the toolchain is proportional to the deviation in the execution environment. Much of it is reusable and leveraged.
RH: I want to comment specifically on modifying the tools for the environment spec because I have already done a lot of it. The code is relatively generic. You have code that says “if webgpu, add this check”. There is very little of rewriting large amounts of code. The code for webgpu is very well-contained. The vast majority of the code is reused, already well tested. It is why as a single engineer I am able to implement this. If I had to do this from scratch, it would be multiple engineering years. It’s as simple as looking at a value, if it’s a certain value and we’re in webgpu, we reject it.
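[Editor's illustration of the pattern RH describes, not code from any real tool. A minimal Python sketch, assuming a validator that runs shared checks for every target and adds a small environment-gated check for WebGPU. All names here, including the placeholder opcode, are hypothetical; the minutes do not say which instructions the WebGPU environment rejects.]

```python
from dataclasses import dataclass
from enum import Enum, auto


class TargetEnv(Enum):
    VULKAN = auto()
    WEBGPU = auto()


@dataclass(frozen=True)
class Instruction:
    opcode: str


# Hypothetical deny-list; the entry is a placeholder, not a real opcode.
WEBGPU_DISALLOWED_OPCODES = frozenset({"OpHypotheticalDisallowed"})


def validate_generic(inst: Instruction) -> list[str]:
    """Checks shared by every environment (the reused, already-tested part)."""
    if not inst.opcode.startswith("Op"):
        return [f"malformed opcode name: {inst.opcode!r}"]
    return []


def validate(module: list[Instruction], env: TargetEnv) -> list[str]:
    """Run the generic checks, plus the WebGPU-only check when targeting WebGPU."""
    errors: list[str] = []
    for inst in module:
        errors.extend(validate_generic(inst))
        # "If webgpu, add this check": look at a value and reject it.
        if env is TargetEnv.WEBGPU and inst.opcode in WEBGPU_DISALLOWED_OPCODES:
            errors.append(f"{inst.opcode} is not allowed in the WebGPU environment")
    return errors


module = [Instruction("OpLoad"), Instruction("OpHypotheticalDisallowed")]
vulkan_errors = validate(module, TargetEnv.VULKAN)
webgpu_errors = validate(module, TargetEnv.WEBGPU)
```

[In this sketch the same module passes for the Vulkan target and yields one error for the WebGPU target, so the environment-specific code stays confined to a single branch.]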
DM: Composable.
JK: I’d like to go through an example of an execution environment removing an instruction. That needs to be backed up by a concrete example.
DJ: I think we are talked out. Is there anything else?
JG: ✋
JG: No.

[tomorrow]

DM: Can we start earlier?
Everyone: No.
DM: I’m leaving at 6pm.
CW: We’ll make sure to move earlier.
