
Minutes 2019 10 07


GPU Web 2019-10-07

Chair: Dean

Scribe: Ken

Location: Google Meet

TL;DR

  • Admin
    • Ping your Advisory Committee rep
    • The WG will be slightly more than a rubber stamp for the CG - it’s also the input channel for reviews from other groups.
  • Spec workflow
    • Lots of issues open for each feature - investigation, issue, spec PR, test PR
      • Use variant of CSS workflow: have a top-level issue for everything, GH Project columns to track its progress.
    • Should spec editors hold a separate VC?
      • Yes, to coordinate and figure out the workflow.
    • How to track requested changes and also merge spec PRs more quickly?
      • Maybe use Bikeshed’s wpt test linking as part of workflow.
      • Editors to discuss.
  • add back setSubData? #418
    • No agreement yet
      • JG would like for apps to roll their own.
      • KN/AE need to think more about setSubData alternatives, e.g. adjusting createBufferMapped (or replacing it with something like createUploadBuffer).
    • How about buffer views? #156. No one has continued thinking about this.
    • Command buffer inline uploads?
      • Attaches data to the command buffer lifetime, which isn’t fully specified.
      • See #154
    • Is it bad if buffer.destroy() is essential?
      • Mixed opinions:
        • It’s very unidiomatic for JS.
      • But:
        • Requires significant effort to inform GC of GPU memory pressure.
        • We can help people stay on the good path by issuing warnings on GC.
  • setVertexBuffers default offsets / array performance #421
    • Perf results show singular setVB is much faster than plural setVBs in both Chromium and WebKit.
    • Resolved: switch to singular setVertexBuffer.
    • New idea: a prebaked object containing all the vertex buffers, pipeline state, bind groups. One call to bind. (Like GL VAOs but better.) Helps both performance and convenience.
  • GPUShaderTransformFunction #454
    • Allows shader editing in dev tools, from any source language.
    • MM: We should aim for full debugging. Ultimate goal should be the best the platforms can support, not just shader editing. We should figure out what that could look like.
      • JG: Would like to be able to have something in MVP, even if incremental. Lots of people debug shaders by edit-run.
      • KN: Unclear whether any advanced debugging support is possible at all. Making “transform” an extension would give us the freedom to replace it later.

Tentative agenda

  • Add GPUBuffer.setSubData() back #418
  • How to add default offsets to setVertexBuffers? #421
  • Add GPUShaderTransformFunction to GPUShaderModuleDescriptor #454
  • Myles’ investigation.
  • PR burndown
  • Agenda for next meeting

Attendance

  • Apple
    • Dean Jackson
    • Myles C. Maxfield
  • Google
    • Austin Eng
    • Kai Ninomiya
    • Ken Russell
    • Shrek Shao
    • Ryan Harrisson
  • Intel
    • Yunchao He
  • Microsoft
    • Damyan Pepper
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
  • Joshua Groves
  • Mehmet Oguz Derin
  • Timo de Kort

Admin Items

  • Reminder to everyone to ping their Advisory Committee (AC) rep
  • One of the W3C’s reps read our feedback - we will not always be a rubber-stamping group; sometimes there’ll be feedback from other groups and the W3C membership.
    • DM: what feedback will that be?
    • DJ: this just means - when you start publishing something on W3C standards track there are checkpoints along the way. Call for review - other groups can formally review the spec. Example: accessibility folks might say, please clarify this section on accessibility. Or WebAssembly group might make comments about some things. We’d take those back to the CG and see if we need to make any changes. It’s a formal step. Not reviewing everything we do, but the spec as a whole, at checkpoints.
    • DM: we have to go through these steps every time we publish a spec update?
    • DJ: no. Only when we publish a recommendation. We could also make this a living standard, like HTML. But this applies when we want to publish, e.g., a version 1.0 or 1.2 - major versions.
  • Meeting location?
    • RC: working on it. Need couple more weeks for final signoff.

Status Updates

  • Google
    • Working on emscripten bindings
    • Working on common C headers with Dzmitry
    • Updating Dawn to match this
    • Windows is coming along! Rafael working on this.
    • Feasibility of forking SPIR-V spec is being investigated. Updates will be given to this CG as we learn more.
  • Mozilla
    • Landed IDL update. Landing impl into Gecko.
    • Have to establish IPC, error handling, etc.
  • Apple
    • Justin started writing tests; they haven’t been submitted yet
  • Microsoft
    • Refactoring some error handling.
    • Intel folks have been working on memory management in Dawn’s D3D backend, and on residency management as well.

Workflow

  • DM: the fact that the CTS and spec live in separate repos means there are a bunch of GitHub issues for each task: investigation, then a concrete issue, then a PR for the spec, then a PR for the test, etc. All for the same feature.
    • The model in GitHub Projects doesn’t work like that.
    • Ideal: a bunch of GitHub issues attached to the one thing that moves forward. Can we do that?
  • DJ: would you prefer one issue managing everything? One issue with sub-tasks?
    • DM: don’t yet know the best way to do it. Ideally there’d be a construct in GitHub Projects that lets multiple projects be included in one, so an item would move through the tabs. At minimum we can approximate it by hand - create items in the projects and link them.
    • DJ: would be nice, but you can’t create parent/child relationships between GitHub issues in various projects. Would be nice to have “show me all spec updates that don’t have CTS yet”. One option is to merge all the projects, but that only puts everything in one project; it doesn’t make the issues any easier to follow.
    • MM: CSS does the following: use labels like “needs-testcase” and “needs-edits”, and only close the issue when all the labels go away. Useful idea?
    • DJ: so you don’t say “Closes #XYZ”, you just remove the label if you do something that fixes the label?
    • MM: yes.
    • KN: think that would work. Could use the project columns for this.
    • DM: like this idea. Issue that moves across the tabs. PRs won’t be in the tabs. Linked to the issues through Github’s linking abilities.
    • DJ: good idea? Who could set up the labels?
    • DM: I’ll do that.
  • DM: second workflow question: who will work on what? If we set up GitHub Projects the way we discussed previously, that sounds good. The editors could have our own little call sometime to coordinate what we’re going to write. Any concern about such a call, or should all editors’ issues be discussed on the main call?
    • DJ: think it’s fine for Editors to have their own coordination.
    • KN: think it’d be best for Editors to have their own call. Don’t have a good workflow yet. Would be good to have time to discuss one without taking time from this group.
    • DM: OK, I’ll organize something.
    • DJ: maybe some sort of summary sent to this group with major changes?
    • DM: understood.
    • KN: maybe first thing we produce is what workflow should be.
    • MM: whatever’s chosen, should be able to be checked in to the repo.
    • DM: what about a wiki page?
    • MM: that’s fine.
    • KN: we’ll definitely put up a wiki page when it’s ready.
  • DM: third issue: when we review pull requests, we may have minor or controversial notes that hold up the whole pull request. How about merging the meat of the pull request and then doing smaller follow-ups? GitHub’s not great for this, but Bikeshed has really nice inline issue tracking. Would be nice to have an example.
    • KN: good idea. Looked into this a while ago. Think it would be good. It’s designed so that if you have some spec text working this way, you can point to the test that covers that behavior.
    • MM: we should avoid having two issue trackers - one inline in spec and the other in Github.
    • KN: it’s more for cross-referencing.
    • DM: so it won’t address the need I mentioned?
    • KN: could use it in our workflow - should discuss in our other call.
    • DJ: would be good if we could land changes we agree on, and mark up that some things haven’t been decided, so that a person reading the spec can understand what’s been decided and what might change in a follow-up commit.

Add GPUBuffer.setSubData() back #418

  • MM: was this last discussed a long time ago? Should we review?
  • DJ: summary was, it was in, we took it out.
  • MM: the reason I think it’s worth reevaluating is that the API is easier to use wrong than right. Consider what 3D devs write: they need to record a frame - a bunch of drawing commands - and upload uniforms. They’d like to upload into a uniform buffer, but then they have to have a Promise that’s resolved. You should resolve the Promise ahead of time, but that involves maintaining a queue yourself, lifetimes, etc. I imagine first-time developers will allocate new buffers every frame and forget to destroy them. Think we should add something easier, maybe not as optimized as what we have today, so that first-time developers don’t write terrible software. Adding setSubData is straightforward and matches some other APIs.
  • JG: we’re in an in-between stage. Thing missing: map things without having Promises resolved, which is one of the issues. I’d personally rather iterate in that direction than put setSubData back. setSubData’s an anti-pattern; easy to start initially, but going in wrong direction. If impls backfill perf work to make apps fast, there’ll be an arms race in sophisticated setSubData impls. No clean way to solve that other than heuristics. Don’t want us to paint the API into a corner where we have to optimize heuristics for a wide variety of code.
  • MM: what would you suggest instead?
  • JG: think we should make it easy to make one-shot buffers. createBufferMapped, upload to it, copy from it, destroy it. Would also serve the purpose of getting people off the ground. Then they can write their own arena-based allocator.
  • MM: are you suggesting more than what we’ve got with destroy() and createBufferMapped?
  • JG: possibly. Haven’t written it out yet. If destroy() fails when buffer’s in flight, harder to do.
  • MM: don’t think those are destroy()’s semantics
  • JG: think you’re right. Have to look again.
  • MM: a collection of calls you have to make at the right time is still not enough. createBufferMapped, do work, destroy - devs may not call destroy() because they didn’t read that part of the spec. Then they’ll allocate memory indefinitely.
  • DM: indefinitely? Won’t GC kick in?
  • MM: we’ll allocate until GC - generally bad practice.
  • DM: what timeframe are we talking about? Day, few frames, …?
  • KN: depends. These objects do have memory cost. But they won’t necessarily be able to expose memory cost to the GC directly. There’s an issue where the GC doesn’t understand what GPU memory is. Chrome artificially inflates memory cost of WebGL objects. Causes lots of problems. Would need separate memory budgets, don’t have that in current GCs. In practice it’s hard for all browsers to collect these objects promptly.
  • MM: on macOS / iOS, GPU memory is pinned pages, kernel has no flexibility to move them. The cost is more than just the mem cost; perf goes down too.
  • DM: use case trying to solve: user encodes command buffer, start render pass, a draw call needs some uniform data, and you’re saying we want to avoid them creating a buffer at this stage?
  • MM: yes. 1 of 2 things will happen with the current API. a) a new developer types in the code “mapBufferWrite.then(...)” and the rest of their code’s inside the then block. Then they miss their present; extra frame latency. b) what you said; allocate a new buffer with createBufferMapped, write into it, then do their draw calls. Buffer hanging around.
  • DM: if we’re in the render pass we can’t do much with memory; can’t do transfers. Users have to think about this up front and have memory available. It’s not a buffer created per call; a few buffers per frame, because there’s a buffer per render pass, not per call.
  • MM: are you talking about cmdbufs or uniform bufs?
  • DM: uniform bufs. You can’t do transfers inside render pass. Today we force users to think ahead of time what data they’ll need throughout the pass.
  • MM: for data that changes every frame, it’s common practice in Metal to make those buffers shared.
  • JG: shared buffers are mapped between CPU and GPU?
  • MM: yes.
  • JG: would it be interesting to see how complicated setSubData would be if it were written in JS on top of the current API? [A rough sketch follows at the end of this section.]
  • MM: so pool of mapped buffers?
  • JG: yes.
  • MM: I think so.
  • KN: thought we had one of those? In one of the Markdown documents, think Jeff Gilbert wrote it.
  • JG: it’s old at this point then.
  • MM: next step is to talk about audience again then.
  • DM: if we have multiple draw calls and want them to have uniform buffers updated - how would setSubData work / make it better?
  • MM: pool of previously-mapped buffers would be held by impl. Since impl doesn’t have to wait on Promise and can use persistently mapped buffers, setSubData could issue copy by itself - actually, wait, can’t do that immediately. If dest buffer is shared buffer then could just do a memcpy.
  • DM: so users would do it before every draw call? setSubData with the data the draw call needs?
  • MM: yes, when they need to upload some uniforms.
  • DJ: this is what most WebGPU demos do today.
  • JG: want to make distinction between what they do today and what they might do in the future. Until recently Dawn didn’t support anything but setSubData. If we say, setSubData or a more complex approach, they’ll want setSubData. But once Maps, Unity, etc. pick up the API, there won’t be a lot of setSubData calls, because they’ll be slow, or we’ll have to optimize them a lot.
  • MM: you’re right Maps won’t want setSubData. Space for both APIs. Out of all the sites out there, many will be using big engines.
  • JG: then we have to talk about playing heuristic games, picking winners and losers between demos and apps on the web.
  • MM: I think that’s a better outcome rather than having all those apps have terrible memory access patterns.
  • JG: would rather give them a library which has known performance characteristics than bake this into the impl. They can statically link setSubData for what they want, modify it, etc. Or manage their data explicitly, which is what we want them to do.
  • RC: how would Jeff’s proposed one-shot buffers work?
  • JG: similar to createBufferMapped. Have to have a good story for creating, encoding in command buffer, and destroying them. Agree with Myles that we don’t want this to be impossible to get right. Trickier before we nailed down destroy() semantics; think it’s easier now.
  • KN: think we need to spend time on this. Austin and I just discussed - think we can adjust how createBufferMapped works to be much more of a solution to this problem. Easier to use, avoiding copies, etc. First thought was - maybe we can replace createBufferMapped with createUploadBuffer - just a CPU-side allocation that exists in shared memory, or host-local-device-visible memory, used only for uploading. Think that lets you get rid of up to 2 copies. (1) if app is decoding, can decode directly into this. (2) on some impls, should be possible to copy from CPU-side memory into GPU resource. Can almost do this today with createBufferMapped. Would like us to all have more time to think about this.
  • MM: that’s fair. Any solution to this problem would be great - not married to setSubData, but think current design needs to be improved.
  • KN: I’m hopeful we can improve things.
  • RC: thought there was going to be views into buffers- any more thought in that direction? Should we work on that sooner rather than later?
  • KN: don’t think anyone’s thought about this any more, but we should.
  • JG: we should work on that. In some APIs there’s a concept of command buffer inline uploads. Any thought about this?
  • DM: the proposal I had was similar to this. Counter-argument was we don’t want data associated with command buffers, since their lifetime isn’t completely specified.
  • JG: any link to that part of the proposal?
  • DM: Josh links to it at end of #154.
  • DM: I like what Kai’s thinking about - making createBufferMapped more efficient. Can we just say in the spec to delete resources at the end of the frame?
  • KN: all resources that can be destroyed early should be. Think we can issue warnings if we see something get GC’d without being explicitly destroyed. Tell devs they should have called destroy().
  • MM: I don’t think that’s idiomatic JS. Bad to say, you have to explicitly release this object when you’re done. Ability to release them is good, requiring it is not.
  • KN: I think we should. WebGL’s always done this.
  • JG: can we leave this for further thought for now?
  • DJ: let’s leave this open but give it a slightly different title? Idea is, we’ll come back with ideas. Josh Groves is on the call, can he provide input?
  • JGr: not really, everything’s already been said.
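
A minimal sketch of what JG’s suggestion could look like - a setSubData-style helper written in user code on top of the draft API discussed above (createBufferMapped, copyBufferToBuffer, explicit destroy()). Names and signatures follow the 2019 drafts and may not match the final API; the helper, `staging`, and `dst` are hypothetical, and a real library would pool staging buffers rather than allocate one per call.

```ts
// Hypothetical user-level helper, not part of the API: upload `data` into
// `dst` at `dstOffset` via a one-shot staging buffer.
function setSubData(device: GPUDevice, dst: GPUBuffer,
                    dstOffset: number, data: Uint8Array): void {
  // Created pre-mapped, so no Promise has to resolve before we can write.
  const [staging, mapping] = device.createBufferMapped({
    size: data.byteLength,
    usage: GPUBufferUsage.COPY_SRC,
  });
  new Uint8Array(mapping).set(data);
  staging.unmap();

  // Record and submit the GPU-side copy into the destination buffer.
  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(staging, 0, dst, dstOffset, data.byteLength);
  device.getQueue().submit([encoder.finish()]);

  // Explicit destroy, per the discussion above: relying on GC alone would
  // keep the staging memory alive for an unpredictable amount of time.
  staging.destroy();
}
```

A pooled, persistently mapped variant (as MM describes) would avoid the per-call allocation and submit; keeping that logic in a library with known performance characteristics, rather than in the browser, is exactly JG’s point above.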

How to add default offsets to setVertexBuffers? #421

  • AE: we talked about this at the F2F. Tested perf in Chromium, but we have a multi-process architecture and that only measures client-side time, so I also implemented setVertexBuffer in WebKit. There the singular version is about 10x faster than the plural version due to array allocations.
  • DJ: that’s pretty compelling results.
  • MM: OK, you have convinced me.
  • AE: think we should remove the array version - ergonomically poor with parallel arrays, and it’s slower.
  • JG: think we all agree to move forward with setVertexBuffer?
  • DJ: we all agree we’ll add the singular version that’s faster.
  • DJ: then do we consider this a replacement? Remove the plural version? AE says yes, and I think it’s a good idea. [A before/after sketch follows at the end of this section.]
  • DM: could we follow up with a prebaked object that holds a bunch of vertex buffers, pipeline state, etc., and that you can bind with one call?
  • KN: I think that’s a good idea.
  • JG: VAOs are back.
  • MM: is this convenience or performance?
  • DM: perf would be a goal but convenience is useful too.
  • DJ: I like the idea, please do.
  • JG: can we assign someone to #421?
  • AE: I’ll take it.
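
To illustrate the resolution (method names per the 2019 drafts; `pass`, `positionBuffer`, and `normalBuffer` are placeholder variables), the plural call allocates parallel arrays on every invocation, while the singular form does not:

```ts
// Before: plural form - two array allocations per call.
pass.setVertexBuffers(0, [positionBuffer, normalBuffer], [0, 0]);

// After: singular form, as resolved above - no per-call allocations.
pass.setVertexBuffer(0, positionBuffer, 0);
pass.setVertexBuffer(1, normalBuffer, 0);
```

DM’s prebaked “bind everything with one call” object would be a possible follow-up layered on top of this.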

Add GPUShaderTransformFunction to GPUShaderModuleDescriptor #454

  • AE: thought it would be a neat idea to include a user-level transform function the browser can invoke. Anyone can compile from any language to whatever format we pick, and the browser can recompile the shader if it’s edited. Robin had an interesting proposal about an async source map for it, so you could ship compiled code in a production environment, but DevTools would fetch the source map and you could change the shader at runtime. [A hypothetical sketch of the shape follows at the end of this section.]
  • DJ: I haven’t read enough of the issue yet. Like the idea. magcius (Jasper) has some comments.
  • DM: too bad Jasper’s not on the call, but what prevents us from doing this today?
  • KR: mentioned the need for the browser to know original source.
  • KN: think it’s the question about letting app developers roll their own developer tools. They can do this today. But the transform’s useful for the primary reason.
  • JG: Myles points out that this doesn’t provide full debugging. I think it’s super valuable outside debugging / printing, which are much harder problems for GPU shaders to solve. Wonder if we could convince you that this transform is a step in the right direction, and we can discuss break / printing in the future.
  • KN: it’s a huge project and I think we should make incremental steps.
  • MM: I’m all for incremental steps but don’t think we should make single steps without vision for where we’re going. Agree it’s a hard project, but in all of the shader debuggers out there, which support breaking? Printing expressions? Expose features that could be exposed in Browser DevTools before we make code changes? Have to know where we’re going.
  • JG: I’d like this to be in the MVP. In my experience doing graphics programming, I don’t use debuggers - iterating on shader source code is what I do. I know a lot of people use debuggers, so not dismissing it.
  • MM: I’m not saying this is useless but seems likely we’ll have to revert this patch inside some larger subsystem later, would like to know what that larger subsystem looks like.
  • KN: I have doubts that we can do full shader debugging in the browser. It would mean hooking into platform tools that use proprietary APIs in the drivers - no public APIs in the tools either - so on most platforms it’s not possible to do shader debugging on the GPU. Some platforms might have extensions that only work on software impls of WebGPU. Or we could build on top of some other public APIs? Worth investigating, but so complicated that we can’t realistically have a solid vision of where we’re going in the near future. Also it won’t be platform independent and I don’t think it can go in the spec. Robin had some really good ideas of how this API should look. One of those ideas is that this could be in an extension: if the extension’s available then we use these new fields. Gives us leeway to take this out later. Shouldn’t break any applications; it’s just a developer tool.
  • DJ: I agree. We extend the core spec, allow these values in the dictionary, and then the actual implementation is an extension. I’d like to see that as a pull request and then we can talk about it.
  • KR: not sure about the value of the fields being in the core spec but the impl being in an extension?
  • KN: also not sure about that.
  • DJ: let’s discuss this more.
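
A hypothetical sketch of the shape being discussed - none of these names are settled in #454, the new field could live behind an extension as KN suggests, and `compileMyLangToSpirv` stands in for whatever toolchain the application actually uses:

```ts
// Illustrative only: the descriptor carries a callback that the browser's
// developer tools can invoke with edited source to recompile the module.
const module = device.createShaderModule({
  code: compiledSpirv,  // the shipped, already-compiled shader
  transform: (editedSource: string) => compileMyLangToSpirv(editedSource),
});
```

Robin’s source-map variant would instead let DevTools fetch the original source asynchronously, so production pages don’t have to ship it.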

Myles’ investigation.

PR burndown

Agenda for next meeting
