Minutes 2020 12 14

GPU Web 2020-12-14

Chair: Corentin

Scribe: Ken and Austin

Location: Google Meet

Tentative agenda

PR burndown
- Add validation steps for createTexture and createTextureView. #1237
- Associate command encoding with a queue #1278
- Add minmax-sampling feature #1283
- Make the default value of addressMode{U,V,W} align with WebGL #1297
Multi-queue synchronization #1169 #1298
Brainstorm the list of things we want to discuss for the v1.
Agenda for next meeting

Attendance

Apple
- Dean Jackson
Google
- Austin Eng
- Brandon Jones
- Corentin Wallez
- Dan Sinclair
- James Darpinian
- Ken Russell
- Shrek Shao
Intel
- Narifumi Iwamoto
Kings Distributed Systems
- Hamada Gasmallah
Microsoft
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
Dominic Cerisano
Timo de Kort

Administrative

Which meetings do we want to cancel for end of year holidays?
- Cancel 12/28.
- 12/21?
  - DJ: I think we should cancel one of the ones on either side.
- Google has 1/4/2021 off.
- DM / KN: 12/21 works for us.
- Cancelling 12/28 and 1/4/2021. Cancel the WGSL meetings the days afterwards too.

PR burndown

Add validation steps for createTexture and createTextureView. #1237
- JG: no updates unfortunately. Can provide some for next meeting.
Associate command encoding with a queue #1278
- Prep for multi-queue.
- DM: turned out to be a little controversial. We want some sort of change for this for the MVP. Changes how people create CommandEncoders. Conflicts:
  - Does a queue create the encoder, or a device?
  - Interacting with queue families in any way? May need to block on whether we need queue families.
- KN: arguments for putting it on the queue? Right now it’s a field in teh CommandEncoderDescriptor.
- DM: right now in WebGPU we try to put methods on child objects.
- KN: not clear right now whether it’ll be a child object or not. Possible that if we add queue families, we won’t want it to be a child object. So createEncoder shouldn’t be on the queue. If we make that decision now then we can just add this member later when we add multi-queue.
- DM: so you argue we don’t need to make any changes now.
- KN: yes, if we can agree where to keep the method.
- DM: SG.
- JG: sounds basically fine. If we had queue.createEncoder and later wanted queue families, lot of things wouldn’t need changes to make that work. When you make it from a queue, it’s valid for that queue or any one in its family. Works both ways. Nicer from API architecture to hang things off child objects.
- KN: agree it wouldn’t be much of a problem. If we have queue families, though, wouldn’t really be on a child object.
- JG: if it were hanging off the queue, you wouldn’t need to mix the device in there at all, because device is a parent of the queue. No strong feelings, but slight preference for Queue.create().
- DM: can have discussion for queue families. For now I agree with Kai.
- KN: OK, can call this part of MQ discussion.
- CW: one more argument for keeping creation on device. Most users won’t use multi-queue. Simpler to say I create a command encoder and use it. Mental model, it’s a child of device.
- JG: not healthy to give them that mental model if it isn’t real. Viable to hang it off the queue, everything makes sense. Has nice benefits. Feeding into someone’s misconceptions about what’s a child of what is unhealthy in the long term. People get confused.
- CW: ok. This is deferred.
Add minmax-sampling feature #1283
- DM: we have minmax-blending. This is different: for sampling. Lets you get min or max of texels in sampling operation. REduces bandwidth. Trying to build hierarchical bounds map of depth. Many apps do this. Much back-and-forth with Kai and Corentin (thanks for feedback).
- DM: end result: just a few formats we can open up to this. R8 unorm, R8 snorm, and depth formats.
- DM: with this way, and no ability to discover more formats used with minmax, seems limited use. Fine if we postpone until after MVP.
- DM: would also be nice if we could ask Unity, do you need this feature at all?
- KN: long ago we talked about more fine grained texture capabilities. Rafael said this was necessary for D3D - can’t expose “you can use this feature with this format, and this feature with this other format” - being able to query features for different formats needed to know what we can allow for these depth formats, etc. Not sure the best way to do this. Ideal - as little variation as possible between machines.
- CW: don’t know how to do that, aside from per-format caps query.
- CW: this is relatively niche feature - Metal doesn’t have it for example - maybe we can postpone after V1. When someone yells at us, we can reconsider.
- DM: also, technically an extension. If extensions can be rolled out without group consensus if they aren’t named, then we can let users try it out.
  - JG: theoretically allowed to do that, but have agreed we won’t do that, similar to web API experimentation.
  - CW: otherwise it’s like ‘webkit-rounded-corner’ properties from 2012. Can experiment behind flags. Good taste to see with the group before shipping without a flag.
  - DM: we’ll experiment with it in native land for a while.
- CW: consensus: don’t touch this for a while.
Make the default value of addressMode{U,V,W} in sampler descriptor align with WebGL #1297
- More discussion in https://github.com/gpuweb/gpuweb/issues/1296
- JG: Jiawei did comment on the issue. Found when they were porting WebGL to WebGPU. If you don’t see texture wrap modes in WebGL, you might use WebGPU’s defaults, which are different (clamp-to-edge vs. repeat). No defaults in some of the APIs, though if you 0-init your data structures, you get certain behavior.
  - Most textures not designed to be sampled out of bounds anyway.
- CW: if you have 3D model and UV coords in model, won’t you want clamping so you don’t get pixel bleeding?
  - JG: no. For UV mapping all samples are in 0..1 bound.
  - DM: you get anisotropic filtering (...) sounds like CLAMP is still preferred.
  - CW: think so. If we default to REPEAT, people will probably have weird issues at border of UV mapping. If we do clamp-to-edge and they wanted repeat, it’ll be clear what happened.
  - JG: relatively clear, anyway.
  - CW: it wasn’t in Jiawei’s case because the texture was for caustics, so hard to see what the texture was.
  - KN: too bad we don’t have clamp to border color. Seems like most useful choice.
- DM: do we know why clamp-to-border is not an option?
  - JG: macOS before 10.12 doesn’t support it.
  - KN: before 10.12 is not a strong restriction.
  - CW: clamp-to-border-color is not on iOS at all.
  - DM: sure about iOS? Says it’s in iOS 14.
  - DJ: if it’s supported in iOS 14 we can assume it’s always there. We won’t port WebGPU to earlier iOS releases.
  - CW: not sure we want to drop macOS 10.11.
  - JG: we should reevaluate, now that iOS has it.
  - CW: Chrome still has a lot of users on 10.11.
  - JG: we’re going to want to ship WebGPU on those older drivers?
  - CW: need to talk more internally about this.

Brainstorm the list of things we want to discuss for v1 on the API side.

Web platform interop (HTMLImageElement, Canvas, HTMLVideoElement, WebCodec) *
*
*
*
*
Specialization constants
- CW: how important? Something we want?
- DM: gating it would mean people won’t need to recreate their shader modules if they change something. Need to write less code when porting from Vulkan or Metal, because they have specialization constants. Seems like a win. Doesn’t require much from us to support it.
- CW: with DXIL support do we know whether it’ll be useful for D3D12 too? If browser emitted DXIL directly, would it be possible to have some form of spec constants?
- RC: Under the covers, seems like we would have to do what the web developer would do if they don’t have it.
- CW: maybe we can prioritize things. Want to finish things, so focus on adapter query, powerPreference, etc.
- JG: fall under “fingerprinting and what we can expose about things”. If fingerprinting wasn’t a concern we could expose adapters, limits, and then it’s clear what you need to do. The main tricky part to adapter query stuff. How much extra work do we need to do to build the API?
- CW: basically polishing up various things.
Texture format reinterpretation
- CW: I think the group agreed on this; we just haven’t put it in the spec yet
Sparse texture support *
*
*
Bindless
- CW: we have a list of features internally to consider. Bindless, mesh shaders, raytracing, memory aliasing.
Web Worker support
- DM: other gaps in the spec w.r.t this?
- KN: it’s underspecified, but I don’t think we’re missing anything huge except big open question of getting things between threads synchronously.
- CW: more constraints to add buffer mapping. Can’t have ArrayBuffer open in multiple threads, because can’t detach in different worker.
- KN: have to make our ArrayBuffers non-transferable.
- CW: little polish to be done. Want to send any objects across threads except encoders. Little more semantics to specify. More of ECMAScript question than WebGPU spec thing.
- KN: don’t want to worry about sync transfer thing right now.
JG: should probably just have a total list of features and decide V1/not, and figure out whether everyone has agreement on that prioritization.
- CW: do you think we should do something like in Khronos/Vulkan, a spreadsheet where people can express interest in features individually?
- JG: it’s a good idea because we have a bunch of things to ask about and a lot of opinions. Ultimately just need a list.
- CW: I can start making a spreadsheet with what we have, and try to do Khronos’ thing of per-company columns for Want/DoNotWant in V1/MVP/Ever.
- CW: on our side we strongly want good DOM interop.
RenderPipelineDescriptor rename / reorder..?
- DM: work-in-progress in PR already. Might as well bring to conclusion.
Device creation / loss (Kai had some stuff there) *
*
Adapter query like perf caveat, best for a given convasContext, allowComputeOnly? *
*
*
Refine multi-worker semantics
1ShaderModule = 1MTLLibrary
MQ

Multi-queue synchronization #1169 #1298

CW: last time: can we afford to do MQ in MVP or not? I made a list of topics in #1298 to describe all the topics that would need to be resolved. Not sure we need to go through the list here. Biggest topics:
- How we handle concurrent access to resources. #1169 has a solution for exclusive ownership on queues, but not concurrent.
- How do we discover request/get the queues on the system, on adapter and device creation? Lot of small stuff.
DM: thing I’m missing most - great list BTW - is what things can be bolted on top of what we ship today, and what things require rearchitecture. E.g., queue priorities - can be bolted on top. Concurrent access - unsure whether it can be put on later. If we filter out the list of required things for MQ, I bet there’ll be just a few things on there.
- CW: not sure. Is the question, would you have liked for each item a notion of how much breaking change it would require from the current state of the API?
- KR: DM could you help Corentin add that information?
- DM: yes. Can work with CW to make a table. Discovery mechanism is a blocker. I said I’d do a PR for that for the next meeting.
- CW: that’s one. I think concurrent access can be very structuring. Risky to move forward without it. It touches the synchronization, which has been a very touchy point up to now.
- DM: is that going to be more difficult than saying that your destination queue can be undefined, so you’re transitioning into read-only access for all queues?
- CW: part of it. Transitioning from exclusive ownership, need to properly wait. Difficulty is other side. From concurrent access to exclusive ownership. Means that queue receiving exclusive ownership, in backing APIs has to wait on every queue that’s used resource concurrently. #1169 has no way to express that currently. Going into concurrent access = easy, going out of it = hard.
- DM: I see.
- CW: that’s why this one is a bit scary. Should figure out to do with buffer mapping too.
- CW: pipelines have to declare what queue family they’ll be used with. Resources starting with exclusive ownership on different queue. These are bolt-ons.
- DM: OK, so 3 things are the major blockers.
- CW: yes.
CW: I still feel that the way WebGPU is in the spec today can be extended to MQ, as long as in enough descriptors we’re willing to say optional queue / queue family that defaults to default queue/family, and when you want to use MQ, you can say something there. Pretty sure WebGPU today can be extended that way. As Jeff said, maybe not the nicest or most idiomatic - can’t say Queue.createCommandEncoder - but can say Device.createCommandEncoder(queue). Backward compatible.
DM: might be some nuances with how fields are named/placed. If we have other queues, will we want separate dictionary where all the queues are, but now stuck with it on the device?
CW: we could name default queue “Queue”. 2 levels of WebGPU - I don’t care about MQ, all I care about is Device.queue. And, I’m an expert developer, want MQ - do all the additional things. Almost two different development levels. Not feature levels. Still pretty disjoint. Could call the default queue “queue”.
JG: I like that.
CW: 90% of WebGPU developers - not 90% of importance to users who use the web sites - would just use Device.queue.
- RC: I agree with that.
CW: if we go that route, there’s MQ level 0 usage of the API - single-queue usage of the API. Today’s API is that. Can be extended in nice-ish way into MQ API. This is why I think we shouldn’t do MQ now.
DM: I think I agree with the fact that we can add it later. Whether we should be doing it now is more difficult, more a question of resources.
CW: ties back to previous item. Prioritization, etc. of what this group does for V1.
CW: not sure we can do much more on this topic today.

Agenda for next meeting

CW will do brainstorm sheet tomorrow and send to the group. Can put items up until Friday, vote between Friday-Monday, see what it looks like on Monday.
DM: check on #1293. Has API change, some things moved into rasterizer state from render pipeline state. If group OK with that, can proceed with editors, see what words need to be written.
- CW: PR is large. What things got moved?
- DM: we had sample count, sample mask, and alpha-to-coverage-enabled in RenderPipelineDescriptor. Now in RasterizationStateDescriptor.
- CW: think we can do PR review right now. Any concern?
- KN: seems fine. Don’t have a great mental model of the pipeline.
- DM: trying to describe how it works. :) Reordering fields in the way we spec the pipeline. Vertex state, vertex stage, primitive topology, …, color output.
- CW: sounds good. Unless concern now, can be resolved in editors’ meeting.
- CW: let’s assume we do that for this PR. If agenda is just the brainstorm, how about we do the same request for WGSL? things we want to consider for MVP / V1. One meeting where we do brainstorm for WebGPU API, and shading language. Then don’t take time out of WGSL meeting./
- DM: sounds excellent, esp. if we might cancel WGSL meeting.
- CW: let’s work with Dan Sinclair to work on this spreadsheet tomorrow.
KN: opened the one about majorPerformanceCaveat. Wasn’t enough for this meeting. Should be pretty simple. Take a look, maybe can merge immediately next meeting.
KN: Dean, do we have approval on the adapter limits?
- DJ: I think so. MM didn’t comment?
- KN: nope.
- DJ: that was the agreement.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minutes 2020 12 14

GPU Web 2020-12-14

Tentative agenda

Attendance

Administrative

PR burndown

Brainstorm the list of things we want to discuss for v1 on the API side.

Multi-queue synchronization #1169 #1298

Agenda for next meeting

Clone this wiki locally