
Minutes 2021 01 11


GPU Web 2021-01-11

Chair: Corentin

Scribe: Ken

Location: Google Meet

Tentative agenda

  • PR burndown
    • Remove GPUDevice.{adapter,features,limits} #1309 (or should we instead add more reflection for creation info?)
    • Rename GPUDeviceDescriptor features/limits to nonGuaranteed* #1313
    • Pluralize GPU.requestAdapters #1314
    • Add GPUDevice.destroy() #1316
    • Rename GPUVertexFormat members to match GPUTextureFormat #1322
    • Change the name of CreateReadyPipeline to CreatePipelineAsync #1336
  • Query API: Maximum limit on query count #1324
  • Query API: What is the GPU timestamp unit on Metal? #1325 (really asks whether the timestamps should be converted to ns on the GPU or by the application)
  • Refactor GPURenderPipelineDescriptor API #936 and #1307
  • Only allow 2d textures for copyImageBitmapToTexture for now #1335
  • Agenda for next meeting

Attendance

  • Apple
    • Dean Jackson
    • Myles C. Maxfield
  • Google
    • Austin Eng
    • Brandon Jones
    • Ben Clayton
    • Corentin Wallez
    • Dan Sinclair
    • Kai Ninomiya
    • Ken Russell
    • Shrek Shao
  • Kings Distributed Systems
    • Hamada Gasmallah
  • Microsoft
    • Damyan Pepper
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
  • Mehmet Oguz Derin

Administrivia

  • DJ: Chair for WGSL meeting tomorrow.
  • CW: Cancel next WebGPU meeting for MLK day.
  • CW: TAG review stuff.

PR burndown

  • Remove GPUDevice.{adapter,features,limits} #1309 (or should we instead add more reflection for creation info?)
    • KN: Discussing with DM, figured out that it made the spec more complicated to specify limits and features on the GPUDevice. DM pointed out that it's reflection that we don't usually have. Question: should we keep reflection on GPUDevice?
    • MM: one immediate reaction - if you ask for a GPU with some attribute and there isn’t one with that attribute - can you get a GPU without that attribute?
    • KN: at creation time you get exactly what you asked for no more no less. Or lost device.
    • MM: so there’s no new information here. I think I agree with Jeff that it’s just convenient.
    • KN: couple things to inform this. Middleware would really like to know the creation parameters of every object. They can do so by hooking the create function and attaching the descriptor to the resulting object (see the wrapping sketch after this list). Do we really want to force middleware to do that, or let them be more modular? Second, almost everything else in the web platform is reflectable. Doesn't mean we should do this, but it's an indication of matching the rest of the platform. Should perhaps consider keeping the descriptors around. Don't think it's that bad. Has some content-side memory cost, but pretty small. Maybe wouldn't do this for really complicated stuff like render pipelines, bind groups, etc. Would be easy for textures, buffers, etc.
    • MM: let’s start with those.
    • BJ: counterpoint: some older Web APIs do reflect, but my experience as feedback from the TAG that newer APIs do not reflect creation parameters, as much as possible. Reason: anyone who does care about that can patch it on to the object at creation time, or hook the creation function. I fall in the middle - I do like seeing some of these things on the device, etc.
    • MM: what do native APIs do?
    • JG: it's a mix. D3D likes to provide reflection; Vk no; OpenGL used to like reflection but not much recently. Metal's mixed internally. From my work debugging things, it saves me a number of steps when working on e.g. D3D and I can reflect the parameters of a texture object, etc. A couple of years ago I would have said don't worry about reflection, but more recently, with more debugging, I've liked having the reflection information.
    • KN: advantage over WebGL - we don’t have to duplicate state tracking on the client.
    • CW: possible - maybe - to reflect asynchronously? So we don’t store on the client side, only the GPU process side?
    • MM: does that make anybody’s life easier?
    • KN: I think so - if you’re writing middleware you can request the async thing and store it, rather than hooking the creation functions.
    • MM: if the goal's for middleware to store these params, … (scribe missed the rest)
    • JG: main advantage for the middleware I’ve written - knowing the parent object. Have a texture? What’s the device.
    • KN: wanted to include that too.
    • JG: downside is that that’s a strong reference. Just holding onto a textureview somewhere can keep the device alive. Might not be what you want.
    • CW: topic of reflection is something we want to revisit. Decision on GPUDevice probably tied to the rest of the conversation.
    • MM: (1) do we want parent pointers? (2) what do the native APIs do?
    • KN: this doesn’t relate to native APIs. We’re not talking about getting it from the native API, just what the user passed in.
    • MM: that’s fair.
  • Rename GPUDeviceDescriptor features/limits to nonGuaranteed* #1313
    • KN: if you turn these on, you lose device compatibility.
    • JG: I sort of like “extended”.
    • KN: thought it wasn’t in your face enough. Length doesn’t matter - you type this once in your entire application.
    • JG: feels a little too in my face. I know it’s not portable. Can we make it a little more polite? “optional” is the wrong connotation. I like “extended” because feels like you have the “base” thing, and hardware can optionally extend it more.
    • KN: “extended” isn’t obvious to someone for the first time that it’s not a browser feature but hardware feature.
    • MM: when I see “extended” I think of the PAE support in processors.
    • JG: maybe wait and see if people have other ideas?
    • MM: extremely weak preference for accepting this pull request.
    • KN: had agreement from Corentin, etc.
    • JG: ok.
    • CW: not polite but gets the point across.
    • KN: merging later.
  • Pluralize GPU.requestAdapters #1314
    • KN: does anybody want to discuss this for more than 5 minutes?
    • MM: we probably do.
    • CW: agenda for next meeting as standalone item then.
  • Add GPUDevice.destroy() #1316
    • KN: would immediately destroy the device, make it invalid, and clean up all associated resources. Something WebGL doesn't have except in the form of a testing-only extension. Think it's something that would be quite useful, esp. if a canvas goes off-screen and you want to tear down the thing that was rendering to it to save resources. Today, garbage collectors don't understand GPU memory pressure. Can in theory be fixed, but relies on heuristics for big chunks of resources. (See the destroy()/lost sketch after this list.)
    • JG: strong preference for destroy().
    • MM: as long as well defined, in favor. We have destroy methods for buffers & textures too? Also destroy() on query heaps? Maybe spray destroy() methods around?
    • KN: right now we only have destroy() on buffers and textures because they represent variable amounts of GPU memory.
    • CW: questions about e.g. destroy() on pipelines. Have to check object references held by command buffers to make sure they're not destroyed. Makes sense for resources but maybe not for pipelines.
    • BJ: does that imply that even when a pipeline’s GC’d there’s still a representation for it somewhere?
    • KN: yes, if there’s a reference under the hood.
    • CW: can also be true for textures - bind groups, pipeline chain, etc.
    • BJ: what’s the motivation for not firing device lost?
    • KN: things you’d want to do upon device lost wouldn’t be the same as things you do when destroying it explicitly.
    • JG: I agree with that, but want to make it as easy as possible for people to handle context lost. Would like to have one concept. Your thing is lost, and if you want to revive it, do it a different way.
    • CW: if destroy() doesn’t invoke the lost callback, we’ll probably add “loseForTesting()”.
    • MM: can destroy() take an argument?
    • KN: it could. Was thinking .lost could tell you whether it was lost due to destroy() or not.
    • BJ: I like that.
    • KN: happy to change that.
    • JG: think that can be added later; inclined to just take this.
  • Rename GPUVertexFormat members to match GPUTextureFormat #1322
    • KN: the names we have for GPUVertexFormat come from Metal. Completely different naming than GPUTextureFormat, even though you end up getting the values in the shader as comparable data types. Vertex formats don't declare “this channel is red”, etc. - you just have 2, 4 channels, etc. But you get them as a vector in the shader and can do .r, .g, etc. The rename makes it consistent with the other naming; there's overlap between names, but that doesn't matter. Overall, this naming's clearer about what the format is.
    • MM: I disagree with your last statement. Vertex formats aren’t colors, so naming them .rgba is silly.
    • KN: a lot of vertex data is frequently colors and texture data frequently is not colors.
    • JG: sometimes not true, but not enough ambiguity to make it hard for people reading the code. A lot hearkens back to vectors with .xyzw. Arbitrary data, we just call them colors.
    • CW: difference: in textures we expect channels to be different sizes. Vertex formats are more like plain data, just the hardware accesses it. Slight preference for keeping the names as is. Less semantic overloading - “this one’s a color, this one isn’t.”
    • KN: part of the motivation is not just that it's inconsistent - this is also the only place where we use char, short, etc. in the spec. It's pretty inconsistent with places where we say 8snorm, 16uint, etc.
    • CW: what about float16x2, etc.?
    • KN: suggested that at some point. I like it a little less.
    • BJ: for textures I’d prefer to discard rgb as well for the reasons you mentioned. But formats also describe compressed textures, etc. Here, we have a syntax how these things are read in the shader - WGSL - brackets, etc. Not too many places where WGSL syntax is reflected. Intentional? Could you just say this is a vec4f32? Just reflect that across the board?
    • MM: agree with the philosophy of what you’re saying. We’re still debating typing in WGSL. Might even be multiple ways to spell the same type. Maybe should be unified.
    • DM: also in D3D it’s the same type. Different values with the same type - close, maybe better. Only Metal has different vertex/pixel format types.
    • CW: hitting the time box. Brandon made an interesting argument, should we wait until WGSL type name discussions are done?
    • MM: sounds good. Add something to WGSL prioritization, figure out what the type names are?
    • CW: sure.
    • DS: it’s news to me that the WGSL type names were still up for discussion.
    • MM: I filed two items about this. One was deferred. Can point you to the issue.
    • DS: have to look back at the issues again.
  • Change the name of CreateReadyPipeline to CreatePipelineAsync #1336
    • CW: Jiawei was implementing it and thought it would be better to name it “Async”. User feedback. Might be worth talking about again. We have MapAsync, so naming this *Async makes it clearer. (Usage sketch after this list.)
    • KN: when making the proposal for CreateReady, I explicitly didn't use “Async” naming because it would imply CreatePipeline is synchronous. I don't like that implication. To a dev looking at the API, it'd be reasonable to assume shader compilation will be done in createRenderPipeline.
    • CW: From POV of developer - it sort of is happening synchronously.
    • KN: but it’s not sync. Wouldn’t want to do it from worker.
    • BJ: people might get the wrong impression that this call’s faster than it is, without understanding the costs farther down the line.
    • KN: I don’t object to a new name, but *Async isn’t the right term.
    • BJ: I don’t think that Async is wrong.
    • MM: Deferred seems like a good word. But we’d have to sprinkle “Deferred” everywhere.
    • KN: or swallow the inconsistency.
    • DM: I like the PR.
    • MM: what’s your reasoning?
    • DM: we have 2 async methods, from the POV of where you’re getting the control flow, and this is fairly consistent.
    • JG: I sympathize with wanting better names but don’t think we have it yet.
    • CW: when making async methods, do we have a naming convention? Fetch is an async operation, for example. Different verb conveying it's async? RequestPipeline?
    • JG: it's not just an async version of the sync-and-therefore-bad createComputePipeline. await createComputePipeline(...)? It's doing a different thing, not just an async version. Want to await it actually happening rather than have it patched in a weird way.
    • MM: I think DM’s convinced me. I buy DM’s argument - the code the author writes is different. One is straight-line, the other isn’t.
    • KN: it could be synchronous, but there’s a strong expectation that it’s not. Apps should design assuming it’s not blocking the JS thread.
    • DM: if group isn’t convergent we can leave it to the Editors. :)
    • MM: in general if standards group can’t agree on a change then it’s not made.
    • KN: I feel I’m the only one against it.
    • CW: I don’t have a strong opinion.
    • JG: don’t worry Kai, I’m not convinced of it either.
    • CW: can iterate on it more offline.
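
A minimal sketch of the wrapping approach KN describes above, assuming middleware that is willing to patch the device; the `textureDescriptors` map and `instrumentDevice` helper are hypothetical names, not proposed API:

```typescript
// Hypothetical middleware shim: hook the create function and attach the
// descriptor, since GPUTexture itself exposes no creation parameters.
const textureDescriptors = new WeakMap<GPUTexture, GPUTextureDescriptor>();

function instrumentDevice(device: GPUDevice): void {
  const originalCreateTexture = device.createTexture.bind(device);
  device.createTexture = (desc: GPUTextureDescriptor) => {
    const texture = originalCreateTexture(desc);
    textureDescriptors.set(texture, desc); // keep creation params around
    return texture;
  };
}
```

Built-in reflection would let independently written middleware skip this kind of hooking, which is the modularity point KN raises.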
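
A sketch of how an explicit destroy() might look from the app side; the `reason` field distinguishing an intentional destroy from a real device loss is the option KN floated, not settled API, and `recreateDeviceAndResources` is a hypothetical recovery path:

```typescript
declare const device: GPUDevice;
declare function recreateDeviceAndResources(): void; // hypothetical

// Hypothetical shape: .lost resolves either way but reports why, so the
// app can skip recovery when it destroyed the device on purpose.
device.lost.then((info) => {
  if (info.reason !== "destroyed") {
    recreateDeviceAndResources();
  }
});

// Tear down eagerly, e.g. when the canvas goes off-screen, instead of
// waiting for a GC that doesn't understand GPU memory pressure.
device.destroy();
```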
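
For reference, the difference in author code that DM's argument rests on, assuming the renamed method from the PR (a sketch in module context; names not final):

```typescript
declare const device: GPUDevice;
declare const descriptor: GPURenderPipelineDescriptor;

// Returns a handle immediately; compilation may still be in flight,
// so a later draw can stall waiting on it.
const pipeline = device.createRenderPipeline(descriptor);

// The proposed rename of createReadyRenderPipeline: resolves once the
// pipeline is actually ready, so straight-line code can await it.
const readyPipeline = await device.createRenderPipelineAsync(descriptor);
```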

Query API: Maximum limit on query count #1324

  • CW: Metal uses counter buffers for timestamps and pipeline stats. Should have a limit so we don't go over Metal's max size. Also, should it apply to occlusion or just to timestamps and pipeline stats?
  • MM: because limit’s so high I don’t think people will hit it, so don’t have an opinion.
  • CW: does 8000 occlusion queries sound like enough for everyone?
  • MM: don’t expect any dev will hit that.
  • KN: that’s 8000 slots in the query set you can write to.
  • DM: to relax that, would we hypothetically have backends managing multiple query sets behind the WebGPU query set?
  • KR: WebGL implementations typically have to do this.
  • JG: only if you’re virtualizing contexts.
  • MM: only worth the complexity if we really need to.
  • JG: what happens if you exceed this limit?
  • CW: this would be a validation error in that case (see the sketch below).
  • MM: this proposal says, if you create a query set larger than 8000, you get an error.
  • DM: let’s have it. Probably also limit for buffer sizes, but different topic.
  • CW: do we see future query types we’d like to see more than 8000?
  • MM: can do that later. If we add a new query type we can make it not subject to this restriction. Can say 8000’s enough for everybody, and adjust later.
  • CW: there’s a bunch of raytracing queries like acceleration structure compacted size, etc., but still don’t think we’ll need this if/when we get to raytracing.
  • CW: ok, 8192.
  • MM: blanket limit of 8192 on every type?
  • CW: yes.
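
A sketch of the agreed behavior against createQuerySet, with 8192 as the blanket limit settled on above:

```typescript
declare const device: GPUDevice;

// 8192 slots: within the agreed blanket limit, so this is valid.
const querySet = device.createQuerySet({ type: "occlusion", count: 8192 });

// One slot past the limit: a validation error rather than a larger set.
const tooMany = device.createQuerySet({ type: "occlusion", count: 8193 });
```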

Query API: What is the GPU timestamp unit on Metal? #1325 (really asks whether the timestamps should be converted to ns on the GPU or by the application)

  • Past discussion
  • CW: should queries give you uint64 ticks instead?
  • MM: thought we already decided this. If the platform API gives you ticks, you (the browser) have to run a compute shader converting to ns.
  • DM: same argument applies. More work for impls. Why do this if they just have to multiply by a number?
  • MM: portability.
  • DM: anything can happen between queues. Memory can starve, etc. The numbers aren’t portable.
  • MM: I can imagine an app saying “if my frame time is greater than x”, do something.
  • DM: there they’d use the command buffer execution time in ms, not this.
  • CW: expectation - mainly for devs doing per-frame profiling. Bit less concerning. Execution time should be what people use.
  • KN: even cmdbuf exec time isn't useful for throttling. Can submit multiple cmdbufs, but have to track them all in the end. Can look at total rAF frame time and learn probably the same info.
  • CW: seems that we can ask the dev to multiply by a scalar (sketched below).
  • MM: what’s the argument for exposing ticks? Clear argument against.
  • CW: less work - don’t have to run compute shader. You’re using timestamps to measure perf, and we’re injecting compute shaders in the middle of your frame. Also, Metal doesn’t expose conversion from ticks to ns. We could have a mode like that. You can still see your ticks going down as you do optimizations.
  • MM: I’d like to discuss this with the Metal team.
  • CW: sounds good.
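
For reference, the application-side conversion being weighed is a single multiply per timestamp; a sketch, where `nsPerTick` is a hypothetical scale factor the implementation would have to expose somehow:

```typescript
// Hypothetical: two timestamps resolved from a query set into a buffer
// the app has mapped and read back as 64-bit values.
declare const resolved: BigUint64Array;
declare const nsPerTick: number; // would have to come from the implementation

const elapsedTicks = resolved[1] - resolved[0];
// One multiply converts ticks to nanoseconds, instead of the browser
// injecting a compute shader into the middle of the frame to do it.
const elapsedNs = Number(elapsedTicks) * nsPerTick;
```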

Refactor GPURenderPipelineDescriptor API #936 and #1307

  • DM: two proposals - refactoring pipeline descriptor to have some guidelines.
  • DM: 1: per-logical-stage structures. You describe how the pipeline processes data a stage at a time. E.g., depth bias is a property of the rasterization state; in the aspect-based proposal, it's part of the depth-stencil state.
  • DM: 2: everything related to depth is one structure, color processing another, primitive processing a third. An aspect-based structure.
  • DM: trying to pick one.
  • MM: I like the aspect-based one. :)
  • BJ: I vote for the aspect-based one too.
  • KN: when DM and I discussed this, we wanted to maintain pipeline ordering. Know what happens when. But DM did such a good job with the aspect-based one that I prefer it too.
  • CW: slight preference for aspect based too.
  • CW: anyone strongly against aspect based? Or can we iterate offline on the aspect-based one?
  • DM: I'll make a PR for the aspect-based one. (Rough sketch below.)
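
A rough sketch of what the aspect-based grouping could look like; field names here are illustrative only, and DM's PR will define the actual shape:

```typescript
declare const shaderModule: GPUShaderModule;

const descriptor = {
  vertex: { module: shaderModule, entryPoint: "vs_main" },
  // Primitive processing grouped into one structure...
  primitive: { topology: "triangle-list", cullMode: "back" },
  // ...everything depth/stencil-related in another (note depth bias
  // living here, per the aspect-based proposal)...
  depthStencil: {
    format: "depth24plus",
    depthCompare: "less",
    depthWriteEnabled: true,
    depthBias: 0,
  },
  // ...and color processing in a third.
  fragment: {
    module: shaderModule,
    entryPoint: "fs_main",
    targets: [{ format: "bgra8unorm" }],
  },
};
```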

Only allow 2d textures for copyImageBitmapToTexture for now #1335

  • MM: aren’t ImageBitmaps 2D?
  • KN: yes. The issues are 1D and 3D destinations. Could theoretically copy into a 1D texture, or into slices of a 3D texture. This doesn't remove copying an ImageBitmap into layers of a 2D array texture (see the sketch below).
  • MM: this is a good restriction.
  • KN: I agree.
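
What remains allowed under the restriction, sketched roughly against the API as it stood at the time (member names such as `depth` in the copy size were still in flux):

```typescript
declare const device: GPUDevice;
declare const queue: GPUQueue;
declare const imageBitmap: ImageBitmap;

// The destination must be a "2d" texture, but it may have array layers;
// only 1D and 3D destinations are ruled out.
const texture = device.createTexture({
  size: { width: imageBitmap.width, height: imageBitmap.height, depth: 4 },
  format: "rgba8unorm",
  usage: GPUTextureUsage.COPY_DST,
});

queue.copyImageBitmapToTexture(
  { imageBitmap },
  { texture, origin: { x: 0, y: 0, z: 2 } }, // z selects the array layer
  { width: imageBitmap.width, height: imageBitmap.height, depth: 1 },
);
```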

Brandon - clustered forward lighting demo

Agenda for next meeting

  • Pluralize GPU.requestAdapters #1314
  • Renderpipeline refactor?
  • Timestamp vs. ticks.