GPU Web 2020 05 18

Chair: Corentin

Scribe: Austin / Ken

Location: Google Meet

Tentative agenda

Read-only depth-stencil pass #746
Issues about copy commands within same resource #783
Texture views cause unnecessary memory bandwidth use #744
Add Queue/writeTexture method #761
0-sized buffer #546 (we didn't resolve last time)
Add minimumBufferSize to BGL entry #678
Agenda for next meeting

Attendance

Apple
- Dean Jackson
- Justin Fan
- Myles C. Maxfield
Google
- Austin Eng
- Brandon Jones
- Corentin Wallez
- James Darpinian
- Kai Ninomiya
- Ken Russell
Microsoft
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
Matijs Toonen
Mehmet Oguz Derin
Timo de Kort

Administration Items

CW: W3C staff is still doing small updates. Please review them, but I don’t expect your legal teams to need to look at it. Very minor.
CW: Haven’t set up the doodle for the virtual F2F.

Read-only depth-stencil pass #746

DM: Main observation is that some attachments are not mutable. You can have a depth stencil attachment with DepthWrite off. It should be totally safe for the shader to read from it at the same time. This is supported since D3D11. D3D12 has flags to specify things are read only (RESOURCE_STATES). Apparently also supported in Vulkan and Metal as far as my investigation shows (if the entire texture is immutable - both depth/stencil aspects). Exposing this to the user would allow them to both bind it as a an immutable attachment and use it in the shader. This would affect validation. Current proposal is that if either flag is mutable, then the entire subresource is considered mutable. If you want readonly stencil and read-write depth, it’s still considered mutable.
MM: What’s the purpose of attaching a depth buffer but disabling depth writes?
DM: For depth testing. Don’t execute fragment shader on fragments behind something else. This is what people do in transparent rendering.
JG: basic sorting, so you don't do too much overdraw, so you don't waste time in expensive pixel shaders because you've already done occlusion.
DM: More preactive example: game rendered in two phases where you first write to depth opaque geometry, and then you write the transparent geometry without writing from depth. However, it may read depth. If it’s water it would want to know the distance to the opaque geometry behind it to compute caustics, etc.
MM: Okay, that makes sense.
KR: Note: related WebGL bug http://crbug.com/763695 - was not feasible to do this in the OpenGL family of APIs because rendering feedback loops could not be defined in terms of whether depth writes were enabled.
CW: seems everyone agrees with overall idea. Two dimensions:
- 1. Is there a single flag for depth/stencil being read-only?
- 1. Is compatibility of pipeline implicit, depth write has to be disabled and that's ... ? Or is pipeline compiled only to be used with render passes that set this flag to true? "readOnlyDepthAttachment=true"? I think this choice is bad so unless someone thinks that should be supported let's not spend much time on it.
KN: think should leave 2 flags in, more expressive.
RC: 2 flags look good to me. I was asked to comment on them on another issue, #514. (read-only depth/stencil usage)
DM: my questions were about case where 1 component is mutable and the other not. Don't think we should explore immediately. Can assume both are mutable or not.
JG: Room for being able to want to mutate one and not the other, and I think we can probably do it?
DM: The use case for that is you would use a stencil bit to mark the water off and then read depth. I’ve used this in production.
CW: "Uncharted" presentation by George Jimenez also did this by marking transparents with stencil so they could use TAA.
JG: Sounds like what we want. My only thought is that it feels like there might be room for these to be store flags instead? Because what does a store flag mean if it’s not writable?
DM: you can discard something while it's still readable...maybe not.
CW: technically, yes, can read-only from it and discard at end of pass. Either it's clear color, or something you loaded from memory before.
DM: think Jeff's right, think there's an opportunity to feed it into store flags. Semantic of store flags is different than here.
KN: isn't this really a third store flag? "always-discard-dont-write"?
CW: concern putting this in store flags is it's different. That's what happens at very end of render pass.
JG: true, but sets up a bigger state space for it to be bigger variable. Anyway, we can move forward with this.
CW: let's move forward with separate flags. Doubly separate, i.e., load/store, depth/stencil.
KN: This is kind of like depthWriteEnable and stencilWriteEnable, but not exactly the same because it imposes validation restrictions?
CW: On SetPipeline, if the render pass is depthReadOnly, then the depthWriteEnable must be false.
MM: will that happen on every setState / setPipeline? Or, if you set wrong pipeline and don't use it? Guess it's a larger point.
DM: we have a larger issue for when validation happens. Consensus was, whatever you use is going into validation, regardless of whether GPU uses it or not. Every setPipeline call would be validated against that.
MM: if you set these flags, you can have same texture bound to depth texture and sampled texture? And if not, can't?
DM: correct. It'd make it immutable usage, which is incompatible with other uses.
MM: and we need these flags at renderpass creation time?
CW: essentially. Theoretically could deduce from usage inside render pass that texture is never written into by depth/stencil. Seems more implicit magic, brittle.
DM: can't introduce any state transitions in Vulkan renderpass. Can't transition depth from writable to read-only inside renderpass.
CW: Right, I forgot about that.
KN: So would this basically just be validated against depthWriteEnable and stencilWriteMask, or are there other ways to have a pipeline that doesn’t write into stencil (like detecting in the shader).
KR: Depth and stencil updates are usually implicit in the shader, so can't scan the shaders' sources to see whether they're done.

Issues about copy commands within same resource #783

CW: started adding GPU-based validation in Dawn. Jiawei found that buffer-to-buffer copies from one buffer to itself, even different regions, is not allowed in D3D. Validation layers suggest that copy goes through temporary buffer instead. Similar to restriction about copying texture sub-resource from one to another, even if regions are disjoint.
CW: Q: do we add these restrictions to WebGPU, or force impls on D3D12 to go through staging copy?
DM: think it's safe to disallow that now.
JG: do we know what state is for Metal / Vulkan?
CW: at least for buffers, it's allowed.
MM: for now we can proceed. Offline I'll figure out whether it's supported on Metal or not.
CW: what do you all think, same way we have meta-issue for TC39 discussion topics, things that are missed flexibility in WebGPU because of one target API or another, to be discussed after V1? i.e., most painful parts?
CW: seems general agreement to do this.
CW: on same issue, Jiawei points out that on Metal only one sub-resource can be copied (one array layer)... and on D3D12 Z must be 1. Right now the text for copy allows copying multiple array layers. Do we want to revisit this?
JG: think this is different enough, overhead of polyfilling more flexible behavior isn't too bad. If had to copy through buffer for memmoves, that is imp't perf consideration. If just for loop over the layers we want to issue copies for, probably not so bad.
KN: that's exactly what I think.
JG: philosophically, if no change to app's behavior if we told them we'd do this one at a time, we should allow it for convenience.
RC: fine to me too.
MM: anyone know what perf overhead actually is?
KN: doesn't really matter. Either app does for-loop or we do. If we do it the overhead will be lower than in the app.
MM: nobody knows of any algos where, if impl can do this, it would work one way, and if not, they'd do it another way?
JG: seems like a super thin needle to thread to find a case like that.
MM: OK. I'm on board with this.
CW: great, resolved.

Texture views cause unnecessary memory bandwidth use #744

MM: Let’s not discuss. Defer to next week.
MM: one thing: I'd like for impls to not have to supply this flag under any situation. Trying to find all situations where the flag would be necessary, then going to supply big PR to spec to forbid those.
KN: spec doesn't say you can't (or can) reinterpret texture view format, though the entry point is there. At some point in past, I’m pretty sure we discussed that to allow reinterpreting format, you'd need a flag, so that's what needs to happen here.
CW: like to go further...Vk has something where at creation of texture you supply list of formats you'd like texture to be compatible with. Allows impl to know exactly what compression bits to disable, for example. For WebGPU, we'd know that on Metal we only need to set pixel format flag like sRGB / not sRGB. D3D12, same thing. Very explicit thing, to set all formats you want texture to be compatible with. Lets impl choose the best flags.
MM: maybe shouldn't discuss this until we have something concrete, but: is that too specific / low-level? Instead, list classes? Say, 3 different classes, can enable subset of those devices. Want texture compatible with Class A (rgba8unorm, rgba8unorm-srgb, ...), Class B might be something else. Maybe discuss at later time?
CW: OK.

Add Queue/writeTexture method #761

DM: We’ve heard requests for this from wgpu users. They want a good way to upload texture data. I’ve wanted this for a while. The nice icing on the cake is that we don’t need to require the bytesPerRow alignment that copyBufferToTexture has. Users acn just decode a whole .png somewhere and pass it to us. We’ll take care of the bytesPerRow, and staging buffers. CW pointed out though that ImageBitmap may serve a similar goal.
MM: that's my feedback as well. Need coherent story for all different ways you can get data into texture. Maybe the split should be along ArrayBuffer and ImageBitmap. But the split should be intentional.
CW: In a sense we already have copyBufferToTexture, but this would be a convenient / easier / one-less-copy-sometimes. Could be in the list of post-MVP features to consider if we get feedback that copyImageBitmapToTexture is insufficient.
MM: is there an easy way to get an ArrayBuffer into ImageBitmap?
KR: You can create an ImageData that wraps the ArrayBuffer, then make an ImageBitmap from that. One line of chained JS calls.
JG: if one-stop shop that's appealing. ImageData is always RGBA8.
KN: also makes copy to make ImageData object?
MM: was wondering whether it would be as performant as what we're talking about.
KN / KR: Probably 1 extra copy in the ArrayBuffer->ImageData->ImageBitmap path.
MM: What’s the existing call for uploading an ImageBitmap into a GPUTexture?
KN: createImageBitmap, and then copyImageBitmapToTexture.
MM: could we unify these and have it take either ImageBitmap or ArrayBuffer?
CW: there's difference - ImageBitmap already has width / height / layout. Buffer doesn't. Need to specify rowPitch, ImageHeight, etc.
MM: so would be dictionary of ArrayBuffer and bunch of other fields, or two entry points. Two entry points makes sense to me.
KN: 2 entry points makes sense to me too.
MM: seems pretty good given discussion already. I like this, seems like good design.
JG: I think this only works for RGBA8 style types though. Would have to change ImageBitmaps, or for MVP you can only upload RGBA8 textures directly.
CW: I think we’ll have to change ImageBitmaps at some point anyway. More and more, you’re going to be able to have float16 ImageBitmaps, and you’re going to want to know what’s inside.
JG: think it'll be hard to standardize set of formats of ImageBitmap that'll cover all bases for texture uploads. People will want strange things like RGB9_E5, compressed textures, etc. Compressed textures won't be standardized in ImageBitmap.
MM: Sounds like yet another argument to accept ArrayBuffer.
JG: yes.
CW: slight concerns that writeTexture is there as convenience, so people don't have to care about rowPitch / ImageHeight, but for most cases where it'd be used for convenience, people just want to use WebGPU and put things on the screen. They use the ImageBitmap path. For more advanced apps they can probably deal with the complexities. I don't know if we need writeTexture or don't need it.
MM: makes sense, but add'l complexity as pointed out by DM is pretty low now. There's certainly some benefit - is impl burden worth it? I think the burden is low enough that it is worth it.
JG: if you can share validation, yes. Need to validate both copyBufferToTexture path and writeTexture path.
DM: the validation's already in the PR. Shared large amount of validation with copyBufferToTexture, not copy-pasted.
JG: On the implementation side, I mean. As long as there isn’t so much extra we have to care about or carve out in addition. In WebGL we have a lot of combinations and cases where we ignore flags, etc.
CW: I think this one is fine. To echo what MM said, the implementation cost should be really low if you use the same mechanism as writeBuffer, and then the validation is super close to copyBufferToTexture. Today, the browser could probably just createBufferMapped and then do a copy.
JG: my feedback was just that it's not as easy to de-duplicate this validation code as you expect. It's possible though.
RC: another issue suggested to change createImageBitmap to say it's "WebGPU compatible". Don't think we can get away with avoiding all data conversions between one and the other. Esp. when ImageBitmap comes from canvas, and format is impl-dependent. Can probably do it for image decodes. "When you make this, please make it like this so it's a simple copy for WebGPU".
JG: we definitely have slow paths for certain operations. Swizzles, premultiplies, etc.
RC: I think we should have writeTexture. Not arguing against. Will probably need some data conversion stuff.
DM: format conversions for writeTexture?
CW: I think RC meant copyImageBitmapToTexture.
MM: Think we’ll have to do those conversions regardless of if we accept this or not.
CW: another thing we need to discuss - exact validation for copyImageBitmapToTexture. This is something Shaobo has been running into. Can you convert RGB9_E5, etc.?
MM: Broad agreement that copying from an ImageBitmap is hard, but writeTexture is not that.
CW: yes. Seems there's general agreement on this proposal?
JG: thought earlier we said wait and see.
CW: I don't mind either way.
MM: think it's better to put it in now. Already have it for buffers. Adding for textures isn't that hard.
JG: if they're doing it from canvases it'll be ImageBitmaps.
DM: also want WASM users to be able to do this too. Don't know how they'll access ImageBitmap.
JG: same way as mapping. Jump out via JS trampolines.
KN: think this will be used by WASM where they have their own decoders. Also probably with Basis, where you transcode into a buffer. Think it'll be used more there than in JS without your own image decoders.
CW: because you're forced to have data in an ArrayBuffer.
KR: Assuming we encourage people to use Basis universal, then it’ll be the same in JS in that they’ll get an ArrayBuffer.
JG: if they're using a library, the library can provide this functionality via ArrayBuffer.
KN: I'd imagine they'd use the Basis library and not a Basis/WebGPU library because the add'l amount of code is small. Maybe would want a small lib doing format selection, etc.
JG: Feels like it’s a relatively small amount if you’re already using a library. If you phrase it in terms of specializing for ergonomics for people who aren’t using a library..
DM: can you clarify? Creating Buffer, putting data in it, etc., as a workaround?
DM: WriteTexture is faster than creating a buffer with data. It’s a linear staging allocator.
JG: what do you mean?
DM: createBuffer with data, don't know when buffer will be released.
JG: we differ on how createBuffer's buffer works. Shmem, or actual buffer.
CW: When you say CreateBufferWithData, do you mean mapAtCreation, or CreateBuffer then writeBuffer.
JG: Either.
...discussion...
DM: think you convinced me today that writeTexture is valuable. Whole team, Myles in particular. I wasn't convinced before today's call.
CW: do we move forward with it?
JG: yes. My objections were basically the same as writeBuffer.

0-sized buffer #546 (we didn't resolve last time)

CW: on the call, people were roughly happy with 0-sized buffers, but we didn't reach consensus.
CW: if everyone's still fine with this topic, we can just say we have consensus and move forward with it.
RC: think we deferred this until we figured out how we'd bind these, and verifying bindings were correct.
CW: think it's independent. Bindings have to give an offset and size upon BindGroup creation. size+offset is validated against size of buffer. 0-sized buffers will prevent any BG to be created from it except size=offset=0. Mostly independent.
MM: There’s two discussion: one is a zero-sized buffer. One is zero-sized binding. What’s the relationship between them. I thought we decided on the second one of allowing zero-sized bindings.
CW: I thought we agreed on 0-sized buffers but not bindings.
DM: I thought we agreed on both.
RC: Issue 686, binding to a zero-sized range of a buffer is still open.
CW: yes.
KN: that's the one for 0-sized bindings.
CW: that one's blocked.
KN: on minBufferSize.
MM: if you can't have 0-sized bindings what's the point of creating 0-sized buffers? Copy 0 bytes into it?
CW: bind as index / vertex buffer, create as indirect and never make any calls against it.
JG: gives you some flexibility rather than having to have branching in your API code. Seems most functional to allow it if it isn't a big deal to handle.
RC: But we would need to stuff internally. Because D3D12 has some APIs that don’t allow zero-sized buffers.
MM: any time any wants to create a 0-sized buffer, you could return a 1-sized singleton buffer. Validation code would check you're not accessing the singleton.
KN / CW: Exactly our intent.
CW: Dawn will probably have a small buffer hanging around for zero-initialization. We can probably just point to that one.
RC: then you need to keep around code, how big is this buffer really?
KN: buffer object always knows this. Just changing what D3D/Metal buffer is underneath it.
RC: what's use case for 0-sized binding?
JG: N lights as input for shader, and a room with 0 lights on.
CW: that's more complicated. Gets back to minBufferSize discussion. On minBufferSize would like it to be static size of FBO + 1 element of variable-sized array. That means you would never have 0 size. Or, 0 size would give validation error with minBufferSize.
CW: do we want 0 sized buffers or do we still need to discuss them.
JG: I'd like to have them and if they're onerous, remove them.
MM: agreed.

Add minimumBufferSize to BGL entry #678

Agenda for next meeting

Texture views cause unnecessary memory bandwidth use #744
- MM: I'd like to see any writeup of API proposal for supplying list of supported texture formats.
minBufferSize, again
Any more depth/stencil stuff? (no inputs)
Compilation result / source map PRs? Jeff updated them.
- KN: hope not; we agreed on what they should look like. In editors' hands.
CW: if more topics, add them to main project board, Needs-Discussion. Sort in project to be at the top. I'll use this to assemble agenda.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GPU Web 2020 05 18

Tentative agenda

Attendance

Administration Items

Read-only depth-stencil pass #746

Issues about copy commands within same resource #783

Texture views cause unnecessary memory bandwidth use #744

Add Queue/writeTexture method #761

0-sized buffer #546 (we didn't resolve last time)

Add minimumBufferSize to BGL entry #678

Agenda for next meeting

Clone this wiki locally