GPU Web 2020 07 06

Chair: Corentin

Scribe: Ken

Location: Google Meet

Tentative agenda

Index format being part of the pipeline #767
PR burndown
- Add maxAnisotropy property #864
- Add time query on GPUCommandBuffer #870
- Re-introduce createReady*Pipeline and remove GPUErrorFilter "none" #871
- Add aspect back to GPUTextureCopyView #873
- Make minBufferBindingSize optional #874
- Expand security considerations, split into sub-sections #898
- Require mapped buffer sizes aligned to 4 #899
- Depth clamping extension #900
Agenda for next meeting

Attendance

Apple
- Dean Jackson
- Justin Fan
- Myles C. Maxfield
Google
- Austin Eng
- Brandon Jones
- Corentin Wallez
- Idan Raiter
- James Darpinian
- Kai Ninomiya
- Ken Russell
Microsoft
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
Kings Distributed Systems (KDS)
- Domenic Cerisano
Matijs Toonen

TODO other topics

Administrative Items

Still waiting for feedback on WG formation
Way to structure complex topics: argdown (?)
- DM blogpost
- Was on hacker news front page!

Index format being part of the pipeline #767

CW: D3D12 needs primitive topology on pipeline. Metal doesn’t have this. Big constraint on engine developers. Used only for tri strips, not very common.
- Proposal Tibor made: index format on pipeline would be optional, part of SetIndexBuffer. If you draw with pipeline using strip topology, then the pipeline needs to have the index format set, and needs to match format of index buffer.
- Makes life for people using strips a little worse - have to duplicate information - but makes life for everyone else easier.
DM: makes sense
KN: think this was something we discussed a while ago. Even then I think we were going to make a change to allow the format to not have to be specified ahead of time when unnecessary. Seems fine to me.
MM: counter-proposal: always have primitive restart enabled all the time, and don’t have the sentinel be configurable.
KN: the sentinel isn’t configurable, it’s always enabled, but we need to know ahead of time whether it’s 16- or 32-bit.
MM: ok, PSO creation has to know whether sentinel is 0xffff or 0xfffffff on D3D12, and PSO creation has to be possible before the index buffer is set?
CW: yes
MM: ok
CW: it’s a constraint. At least the proposal put forward makes it less bad in most cases.
MM: if you don’t use strips then D3D sets it to anything? And if you put 0xffff in index buffer then it doesn’t matter because it’s not a strip?
CW: correct
RC: didn’t weigh in, but wondering whether it’s the case (will research offline) that we pass in the one with the most f’s, and if it’s 16-bit, then we only use the first part of the 0xffff.
KN: we researched this earlier.
DM: did experiments / examples to test D3D. Bug on AMD about treating it one way, NVIDIA another way. Concluded that way it’s specified is the way it’s supposed to go, so can’t always use all 1’s.
RC: thanks for the reminder.
KN: there’s pipelines.md which describes this but may not go into specifics. I trust our previous decision.
RC: I can double-check this with D3D12 folks, sure.
CW: imagining that you need distinct 0xffff or 0xffffffff values for 16- and 32-bit indices, then would people feel OK with solution proposed?
KR: from the WebGL side where we had to deal with this stuff, I think the proposal is good.
RC: where’s the proposal?
CW: #12 down in the thread, with one thumbs-up.
RC: seems fine to me. Only people who use tri strips are penalized.
CW: one additional validation check per draw.
CW: who can volunteer to write the spec? Jasper? Tibor? :)
CW: follow-up question: where should the index format be in SetIndexBuffer? At the beginning or the end?
- Editors to figure this out.

PR burndown

Add maxAnisotropy property #864

DM: discussed this recently, not decided whether worth exposing a limit. Made a change that doesn’t use a limit. Users can select maxAnisotropy, and we’ll use it how we want. Usually uses native API. In few cases it won’t. Outside range of 1..16, we’ll clamp. Not an extension, just part of sampler functionality in the PR.
CW: sounds good. Let’s move forward with it.

Add time query on GPUCommandBuffer #870

KN: not sure whether this is ready to go yet, but unless concerns about the general shape of the API, we can go forward.
MM: what happens if command buffer is an error?
KN: we’ll have to spec that. Good point.
MM: it’s a Promise because it’ll be resolved after CB completes?
KN: yes.
MM: so rather than putting value on CB, 0 until done, it’s a Promise?
KN: yes.
MM: you can put two handlers on a Promise, so that’s fine. Error handling is there.
CW: do we have to spec order in which Promises retire? Buffer mapping, fences, etc. All of them are linked to the GPU timeline.
KN: think it’ll come out naturally as we write the spec text. Here’s the order in which things should happen, and we should have tests around it.

Re-introduce createReady*Pipeline and remove GPUErrorFilter "none" #871

KN: did we discuss this already? Did we come to a conclusion?
MM: I didn’t like that this wasn’t the only way to do it, but you all convinced me.
KN: forgot to write resolution on PR. Will recheck minutes, if it was resolved, we’ll call it resolved.

Add aspect back to GPUTextureCopyView #873

CW: D3D12 has different aspects for depth and stencil of combined depth/stencil views.
CW: all good with everyone?
DM: soft-blocking is gone now?
KN: yes, talked about that one in depth.
CW: we agreed at VF2F to only support those copies that are trivial to do, for now.
KN: wanted to revise this slightly. In some cases it’ll do both depth/stencil copies, but will be split to do those separately - simpler for now. Can change it later.
CW: requiring copies to be separate - we do copies of multiple array layers now. Probably simpler to remove it for now.
KN: comparable to making copies of multiple mip levels at the same time, which I don’t know whether is explicitly allowed / disallowed. Not really critical.
DM: how/when will we be able to copy the stencil?
KN: should be able to copy to/from the stencil aspect of Depth24/Stencil8.
DM: at some point in the future? And what’s the layout of the buffer that’ll be copied to/from?
KN: for depth/stencil it’s tightly packed. Reasonable interpretation.
DM: if in Vk you copy from buffer into D24S8, you can get it so 32 bits correspond to depth/stencil? But if only stencil, treats buffer as tightly packed?
KN: yes. There’s a quote on the issue from the Vulkan spec.
DM: ok, great. So we can allow those stencil copies today?
KN: think so, yes.
CW: just need aspect member to be able to select copies from stencil.
DM: so, deferring copies to/from depth, but allowing stencil here?
KN: allowing D32F too. For now that’s the only case it’d be used - stencil aspect of D24S8.
DM: OK.
CW: even if it’s the only copy that uses it right now, we should still have it.
KN: think we will have concrete tightly packed formats that’ll use it.
MM: how is this implemented on Metal?
KN: think it behaves similarly, but don’t remember off the top of my head.
MM: copy command in MetalBlitCommandEncoder doesn’t take this option.
AE: it takes a flag about depth-only or stencil-only.
- https://github.com/gpuweb/gpuweb/issues/652#issuecomment-604747883
KN: it’s in a comment in the issue.
DM: and D3D also expects tightly packed?
KN: don’t know offhand.
RC: Kai put together a good summary. Said we wouldn’t expose the “plus-ness” by allowing you to read back Plus formats to the CPU.
DM: that was mainly talking about depth. Now I’m curious about stencil copies. Think we can unlock those with this PR. That’s why I’m checking whether you can do all of these on all 3 APIs. D24S8, copying to/from it with stencil only, do you expect one byte per 4-byte texel?
RC: not sure.
MM: the Metal blit option is only when you copy between buffers & textures, not between textures. So copying from depth component of 1 texture to stencil component of another won’t work?
KN: didn’t realize that.
MM: might not work anyway because formats are different.
CW: is there a way to create MtlTextureView with only one aspect?
MM: don’t think so. There’s an aspect mask, don’t know how that could be implemented.
CW: we don’t implement this one so don’t know if it works.
MM: my strategy is to look at the Dawn sources to see how you did it :) so if unimplemented, maybe we have to remove the field?
KN: Discussion about NewTextureViewWithPixelFormat. Appears you can create X24S8 from D24S8. Can’t change the size. Not sure about copying from X24S8 - does it pack?
MM: I can open a new issue about this.
KN: it’s tricky, will need full investigation. Should we leave #873 for now? Not sure if we can use it for anything yet?
DM: maybe we can put it on the BufferCopyView? Then it’ll only affect buffer-to-texture copies?
MM: not sure if it’s the right place, but strategy could work.
CW: keeping it on GpuTextureCopyView but make it default to “all”?
KN: yes, then add extra validation that Myles was talking about.
MM: have to consider all possible cases.
KN: I have rules in mind for them.
RC: which D3D12 thing should I investigate? There’s nothing allowing >8 stencil bits.
KN: copying out of stencil aspect of D/S format to/from buffer, is it tightly packed, or do you get interleaved data?
DP: think it might be driver-defined. Have to query the layout. Can’t make assumptions. Seeing if there are any validation rules.
KN: that’d be surprising but let’s investigate.
DM: I think we should wait to merge this.
AE: I thought in D3D12 they were always separate sub-resources and always different planes, so they’d always be packed.
RC: I’d bet that you’d get only the 8 bits.
DP: think the way it’s spec’ed is can be implemented differently by different hardware. Have to get copyable footprints to figure out the layout.
KN: on Metal don’t know how this works either.
MM: I can take an AI to do the Metal investigation.
DM: think more investigation is needed and that we can not merge this now.
CW: agreed.

Make minBufferBindingSize optional #874

MM: think we decided to make this optional, looked in spec and it’s not. Two pieces:
- Way this works: in bind group creation / layout, you say min buffer size. Is buffer big enough for layout?
- When try to use layout at draw call time, layout has to be big enough for shader attached.
- Think only one of these comparisons is handled by existing spec text.
DM: was spec’ed as optional already. Subject of incomplete spec that we didn’t have all clauses in place yet.
CW: not sure I understand. Problem was, in spec, always required minBufferSize? If 0, could never use any shader that uses the buffer?
MM: if you set 0 for the second check, it’s different than not enabling the checking. We can discuss this next week and I can take a week to remember what’s going on.
KN: minBUffersize is optional, defaults to spec’d value of 0, means we have to do checking later.
CW: think spec text for buffer binding validation says that effective size of buffer must be more than the minBufferSize, be it 0 or something more. Was the concern that the spec wasn’t covering minBufferSize not being set?
MM: think we should discuss this next week.

Expand security considerations, split into sub-sections #898

CW: this is a great PR from DM and JG. Vastly expands security considerations of the spec. Think what this group should discuss: we should all laud the effort to improve this part of the spec. Where in the spec should it live? In the spec? Appendix? Alongside it? Don’t think we have to discuss the contents in this meeting.
KN: think it’s fine where it is. Don’t think we should have a side document for it.
DM: we probably will have a few of those big topics right where they’re mentioned, it’ll split the API descriptions. Makes sense to move them into appendices. Rasterization rules, this one, etc. Do think it should be in the document, maybe at the end.
KN: fine with me.
CW: maybe we can leave this mostly to spec editors then.

Require mapped buffer sizes aligned to 4 #899

CW: this is to be equivalent to validation rules if you had buffer aligned to 4, did mapAsync, and it resolved immediately.
DM: we could support non-aligned sizes. Think it’s too much unnecessary work. Easier to say size needs to be aligned. We issue copy from staging area into resource, and the copy needs to be aligned.
CW: OK, pretty non-controversial, can just take it in.

Depth clamping extension #900

DM: introduces extension letting you spec rasterization mode, so primitives outside clipping range are clamped to viewport depth min/max, instead of being clipped.
DM: allows you to render into shadow maps. If near/far planes aren’t around all occluders, they still get into the shadow map. Otherwise you’re lighting something that shouldn’t be lit.
DM: this is used well, needed, but not supported everywhere.
RC: where’s it not supported?
DM: some Android devices on Vulkan.
CW: Qualcomm devices, too many to ignore.
MM: is this an extension in Vulkan?
CW: it’s an enableable feature, built into the spec.
CW: features that aren’t supported on low/medium-end Android devices today, but which will be supported everywhere going forward, are great candidates for WebGPU feature levels in the future.
MM: why’s this an extension rather than feature Vulkan-style?
DM: same thing. Extension in Vk can bring new features. At VF2F Kai suggested to make everything a feature in WebGPU.
KN: renaming extensions to features in the spec.
CW: GpuExtensionName would be GpuFeatureName. Can have “extensions”, documents describing 1 or more WebGPU features outside the main spec. Ray tracing extension, etc.
MM: question was, why did you opt to require developers to explicitly enable this, rather than the device telling the author that depth clamp is supported or not?
CW: part of the getAdapter and opt-in to additional functionality. If you do nothing you get the base WebGPU device, if it works with that will work everywhere, and you have to opt-in to new features.
MM: OK.
CW: in this we talked about depth bias clamp?
DM: it’s already in spec, nothing to be done about it.

Agenda for next meeting

DM: more copies of depth formats?
MM: probably good idea.
DM: any opposition to splitting spec into files like Kai wanted? Waiting for something on that.
KN: were waiting for make online to support it. Haven’t heard anything about that.
MM: kind of prefer having one file.
KN: maybe not worth worrying about since we don’t want to do it without supporting make online. Not sure anyone feels deeply enough about it.
CW: easy to do later if we want.
DM: ok.
CW: Your Topic Here.
DC: Kai and I were talking on Matrix last week about extension mechanism. Request adapter, gives you extensions on system, you can select among them. That’s specially relevant to multiple GPUs on a system. Started touching on security issues, fingerprinting.
- CW: you get a list of extensions per adapter.
- DC: way it’s designed now is you get a list of all - iterate through all cards, get set of all extensions.
- JG: that’s not the expected behavior
- KN: 1 adapter == 1 piece of hardware. Each one has features exposed by that hardware. In the spec there isn’t a way to get the list of hardware adapters. You can only request one, and get an adapter for it. Was going to make new proposal on that API surface but haven’t done it yet
- DC: could we talk about that during the next meeting?
- KN: would like to make proposal before that and don’t think I’ll have time before next meeting.
- DC: ok.
- CW: DC you could put forth proposal. If you get it done before 48 hours business time before the meeting, can discuss it. Will discuss details on Matrix chat.
CW: adapter rework proposal?
CW: smaller issues.
CW: robust access of vertex and index data (again). Someone will have to volunteer to make summary discussion (Idan?)
MM: minBufferBindingSize optional thing?
- CW: OK.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GPU Web 2020 07 06

Tentative agenda

Attendance

TODO other topics

Administrative Items

Index format being part of the pipeline #767

PR burndown

Add maxAnisotropy property #864

Add time query on GPUCommandBuffer #870

Re-introduce createReady*Pipeline and remove GPUErrorFilter "none" #871

Add aspect back to GPUTextureCopyView #873

Make minBufferBindingSize optional #874

Expand security considerations, split into sub-sections #898

Require mapped buffer sizes aligned to 4 #899

Depth clamping extension #900

Agenda for next meeting

Clone this wiki locally