
Minutes 2019-10-21


GPU Web 2019-10-21

Chair: Corentin

Scribe: Ken, Austin, Kai

Location: Google Meet

TL;DR

  • Error codes #464
    • There is appetite to make the CTS better via error codes but worry that it will overly constrain implementations. Some alternative mechanisms to error codes could be introduced.
    • Could go either way so we’d like to have an example to discuss more.
  • setSubData and friends
    • Concerns all around about createBufferMapped being harder to optimize than a setSubData
    • Discussion that setSubData could be a polyfill and better APIs could make the polyfill more efficient.
    • Defer until we have a proposal for buffer views
  • Advanced feature investigations: will discuss some next meeting.
  • Workflow: we now have project tracking!

Tentative agenda

  • Error codes
  • setSubData and friends
  • Advanced feature investigations
  • Workflow
  • PR burndown
  • Agenda for next meeting

Attendance

  • Apple
    • Dean Jackson
    • Justin Fan
    • Myles C. Maxfield
  • Google
    • Austin Eng
    • Corentin Wallez
    • James Darpinian
    • Kai Ninomiya
    • Ken Russell
    • Ryan Harrison
  • Microsoft
    • Damyan Pepper
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
  • Mehmet Oguz Derin
  • Timo de Kort

Error codes #464

  • CW: either have single error code, or very detailed ones. Not clear if worth potential impl complexity. Discussed alternatives: having them be optional, could opt in to them being pedantic. Or, should error codes be surfaced to the app at all? Or use web reporting API?
  • CW: talking with Francois and others, web reporting API is not enough to give developers what they need. Adds lot of friction to using WebGPU correctly.
  • KN: you’d only want to do that if you’re collecting telemetry. I agree it’s probably too complicated.
  • DM: Jeff and I talked about this recently. We agree there’s cost to implementing / maintaining the errors, but benefit from maintaining them. Maybe app can opt into more detailed error codes.
  • CW: do you have an example of additional processing?
  • JG: in WebGL: have had to figure out whether we can get away with generating a different error code. Or a different optimization you can do if you can skip generating a certain error.
  • CW: so, if detailed error codes, would like the CTS to say you can use either the detailed one, or the catch-all one?
  • JG: correct.
  • KR: pushing a bit based on WebGL experience for detailed errors - to enumerate all the corner cases.
  • MM: made this point last week: don’t think it makes sense to add any feature to a web-facing API just for the conformance suite.
  • KR: why not? test harnesses sometimes need privileged APIs to verify things. Witness the wpt harness, reftests, etc.
  • JG: maybe an extension we have, but just not web facing. Maybe a way to turn it on in the CI builds, but not exposed to general web sites. Don’t have a strong opinion on that.
  • DM: Don’t quite understand. If we have this functionality, maintain the code, and implement it, why don’t we give it out and hand out error codes? The only cost I see is the cost of maintaining. But if we’re going to do it we may as well expose it.
  • MM: alternative is, don’t pay that cost, don’t have interoperable error codes, and have an API which just says “you messed up”. We are in control of the test suite. If we want to make sure it covers all the corner cases we can do that without a new web API.
  • DM: what about a higher level library wanting its own test suite, and they want to see our error codes to make their test suite better?
  • JG: I don’t want to commit to that. Don’t want to worry about breaking user-land content when we change the situations under which errors are generated.
  • MM: Also, if we make this an extension, we shouldn’t make it required. Because the benefit of a required extension is that a website can opt into not using it. That isn’t true here. The cost isn’t runtime, it’s on the developers. If we have to maintain this, there’s no reason to make an extension that is required to be present.
  • CW: what about the following: in CI scenarios, the browser / CTS can ask for an extension meaning “give me the error code at the beginning of the error string”. The error code may or may not be present, or may be a generic code. Browsers that want to be pedantic can do this, and browsers that don’t can still use it to make sure the test is correct: a different error code means the test itself is wrong. The aim is to get good testing of the test suite by running it on multiple browsers. Or is it too much work to specify? (A rough sketch of what the CTS side could look like follows at the end of this section.)
  • KN: still on the fence. I agree with all the arguments to not include it. At the same time, we can’t assume we’re infallible when it comes to writing tests. It’s ideal to verify that the tests are actually doing what we want. Having some way to do that would improve the quality of the suite.
  • KR: Microsoft? Thoughts on HRESULTs and sub-error codes. Was there an advantage in those details?
  • RC: D3D is special in that it has a validation layer you can subscribe to. Sits between API and real runtime. Does validation the API doesn’t do. Including rewriting shader to make sure you’re doing things correctly. Don’t think we should go that far. Agree that frameworks shouldn’t take dependencies on certain error codes. Maybe some sort of option / command line switch, where these error codes appear in error strings? Have those perform as well as they can?
  • KN: from the perspective of not exposing unnecessary details to apps, I like that, but it doesn’t help with the impl complexity. There’d also need to be an agreed-upon format or way to do it.
  • RC: Does WPT have something similar? Where it detects a special mode?
  • KN: Don’t know. One thing they can and sometimes do is they have baselines of what text is expected. I think you can test what text should come out of the console -- and you can have a baseline for your browser devtools. It does mean however that you have to review every baseline.
  • RC: then as part of running the tests - does baseline have to be the same among all impls?
  • KN: no, they’re browser-specific. Lot of web platform tests you can’t expect to render the same in all browsers. Ex., font antialiasing. Image tests typically work this way. Think you can have per-browser text tests for what comes out in the console.
  • MM: the WebGPU tests have to pass on a machine that doesn’t support WebGPU otherwise they’ll get a lot of failures.
  • KN: that’s the way they’re written now. They use the testharness.js feature of wpt and output yes or no. For browser-based baselines we could check the warning messages. Internal infra to each browser.
  • CW: seems overall that we should be able to find some other mechanism to test the validation errors.
  • KR: Would really advocate having logic to assert on a condition. The text baselines are a nightmare and error-prone. If there’s ever a structural change that changes many of the baselines, it’s going to be extremely difficult to get it right. This is a code coverage issue. If we don’t know whether we are testing all error code paths, we’re in the dark. This allows automated tests which prevent regressions in critical errors.
  • JG: I appreciate that feedback, think it’s interesting with different feedback coming out of our experience with WebGL. Thought it was useful to have INVALID_OPERATION for most of the generic errors. Appreciate the desire to do better this time, but I don’t think it’s better.
  • CW: is it better to say that error codes were more of a hindrance than a help for your impl? Because you were the second impl and the tests were written against the first one?
  • JG: in WebGL 2.0 in particular there were many cases where we checked the “wrong” error code in the tests.
  • RC: I also had that experience, but found it easy to convince people to use e.g. INVALID_VALUE instead of INVALID_OPERATION. Sometimes two errors that could come up in one test.
  • JG: think some of this comes from some of those battles I lost. INVALID_ENUM instead of INVALID_OPERATION for example. Now have to hold on to some error checking code though it doesn’t help anyone. We give you useful errors in the text already. In WebGPU we’re talking about giving the general app code those error strings and I think that’s the main benefit.
  • CW: maybe we need someone to come up with a suggestion that’s focused on the CTS and doesn’t add error codes.
  • JG: I’d like to see an example of a corner of the API where people put together detailed error codes. Might be useful exercise.
  • KN: one thing we get out of not tightly spec’ing these: becomes flexible, we can change it more easily.
  • MM: one method for reaching consensus: the other sides can be asked if they could agree to the other proposal?
  • CW: I think on Chrome’s side we’d be okay. We’ve been doing a lot of the validation tests so far. We do test on Safari. I think we’d be okay not having error codes.
  • DM: thought the conclusion was, we’ll have error codes, just not part of the API?
  • CW: wasn’t agreed upon.
  • JG: we haven’t agreed on that yet. Would be tolerable to have error codes, but I would prefer not because the cost/benefit doesn’t work out in our favor. Not saying that nobody should generate any error codes. If they’re useful I think we can make sure they’re possible to validate.
  • MM: before we close: in this conversation I’ve been describing how the impl cost of these is high. On the other hand, would like to make the point, if the group decides on fine-grained error codes, we’re willing to implement them.
  • CW: ok, every side is willing to accept the other one. :)
  • KN: I think we should put together a proposal for making these optional and not part of the spec, maybe a separate document if we want the CTS to give better coverage in our browser.
  • CW: onus is on Google since we’re the one who suggested this. Also provide sample pull request to see what it would look like in the spec and CTS.
  • CW: noted: either way is fine for everyone.
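
A rough, non-normative sketch of CW’s CI idea above. The code-prefix convention (“CODE: message”) and the helper name expectValidationError are illustrative, not agreed API; the only thing every implementation would have to satisfy is that some validation error is generated.

```ts
// Hypothetical CTS helper, assuming only pushErrorScope/popErrorScope.
// The "code at the beginning of the error string" convention is the opt-in
// idea discussed above, not part of the API.
async function expectValidationError(
  device: GPUDevice,
  fn: () => void,
  expectedCode?: string  // only meaningful if the browser opted into pedantic codes
): Promise<void> {
  device.pushErrorScope('validation');
  fn();
  const error = await device.popErrorScope();

  // Every implementation must at least report *a* validation error here.
  if (error === null) {
    throw new Error('expected a validation error, but none was generated');
  }

  // Pedantic mode: if the message carries a code prefix, check it. A missing
  // or generic code is not a failure; a *different* specific code suggests
  // the test itself is wrong.
  if (expectedCode !== undefined && /^[A-Z_]+:/.test(error.message)) {
    const code = error.message.split(':', 1)[0];
    if (code !== expectedCode && code !== 'GENERIC') {
      throw new Error(`test expected code ${expectedCode} but got ${code}`);
    }
  }
}
```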

setSubData and friends

  • #418 setSubData
  • JG: since last time: had a proposal for how you might have an encoder.copyDataToBuffer function / polyfill. Could decant it onto the prototype if you want. Fairly straightforward. Kai pointed out that destroy works differently than the proposal assumes, so it makes most sense on the Queue. Nice to put it on the encoder, but you can make it scoped to a single function if it’s on the queue. That’s my proposal: here’s an easy, beginner-friendly way. When developers pull in matrix helper functions, they pull these in too. (A naive sketch of such a polyfill follows at the end of this section.)
  • CW: so Queue.setSubData or similar?
  • JG: yes, but as polyfillable option. Not something in our API, but something easy to drop in to our implementations.
  • RC: would this have same perf as if we did it ourselves?
  • JG: there are some things we can do as browser implementers that could yield better performance if we guess the right thing the user’s trying to do when they call setSubData.
  • CW: in our impl it would be hard to optimize createBufferMapped in this way. The mapped data is allocated in shared memory, hard to optimize out.
  • JG: that’s the same thing you’d be doing in setSubData.
  • CW: createBufferMapped can be unmapped whenever. setSubData is always retired in same order. Hard to do chunked buffer when you don’t know how long it will live. Risk fragmentation.
  • KN: with setSubData can do it inline, in the command stream.
  • MM: on iOS, buffers can be persistently mapped under the hood but not part of WebGPU API. At least on iOS a copyDataToBuffer that’s native could be a memcpy, no allocations, and the polyfill would need to allocate.
  • KN: depends on which polyfill you look at. First one, yes, does allocations. Second one, could be a memcpy CPU-side.
  • CW: that said, having setSubData on the queue could make more sense overall.
  • JG: to be clear, I’m not proposing the form the API would take in our impls. This is the form I suggest for our polyfill. Depends on buffers’ lifetimes. Enqueuing the copy doesn’t keep a strong ref to buffers. That’s why it’s on the queue in this polyfill. If you want a more sophisticated version, I wrote one, linked from Kai’s proposal. Chunking during uploads, upload pools, etc. Developers can implement this on their own and I think they should do that.
  • JG: one point MM made: could persistently map, then write in directly. Would like to see some counterexample written down - how a polyfill falls short. My proposal here fulfills the original pull request, but I’m receiving feedback that it doesn’t.
  • CW: The naive polyfill showed: could have very bad performance.
  • JG: That’s okay.
  • DM: best case that MM mentioned will only happen when GPU isn’t using that range of memory. Would be covered by buffer views. Why do we need a special API?
  • MM: do we have buffer views yet?
  • DM: no, but that was the plan we made.
  • JG: DM’s right, this does get better when you have sub-buffer-range resource tracking, and can map part of a buffer.
  • MM: I’m happy to defer this topic until we discuss buffer ranges.
  • JG: OK.
  • CW: to understand: we’re suggesting to defer this topic because one concern is that this polyfill has worse performance, and buffer views might fix it?
  • DM: no. setSubData was never the optimal path. Not sure why we’re discussing this.
  • JG: fear of mine is that we’d be forced to continue optimizing setSubData and guess the users’ intent. This is a big divergence from the OpenGL-style APIs and I’m concerned that we’ll be forced to maintain and continue to optimize this.
  • KN: I think that setSubData is actually very performant for small amounts of data. Small memcpys are cheap and we can do them inline (in the command stream). Easy to handle lifetimes. Would be nice if we could optimize that kind of case with its replacement.
  • JG: one thing: this is similar to what does exist in some APIs, where you can enqueue small buffer updates inline in command buffers. Maybe should revisit that.
  • KN: good idea.
  • MM: think the semantics of what Jeff just described are the same as that original proposal. If we put a size restriction on this it would match the use case.
  • KN: I think it puts it in the cmdbuf.
  • JG: think we’re iterating toward that solution. Need lifetime guarantees.
  • KN: there’s significant value to putting it in the cmdbuf. Lets you do updates in the middle of things.
  • JG: can look at size restriction.
  • MM: is “small” a few bytes or enough to upload MVP matrices?
  • JG: matrices.
  • KN: would like to seriously consider that.
  • CW: more work offline.
  • JG: I’ll take that AI.

Advanced feature investigations

Workflow

  • DM: wrote down the original version of the wiki page. Tried to describe the way we’d track progress on issues across the different stages of the dev pipeline, reflected in the GitHub project. Ended up with design, specification, testing: 3 steps outside of the … one. Rules about advancing them.
  • DM: main thing: assignee. Once an issue tracking something is transferred to next column / stage, would deassign person. Then could see all issues in this category that are unassigned. Once someone starts working on that stage they’d assign themselves.
  • DM: Kai and Justin updated the doc too.
  • KN: we didn’t change much. Pointed out that it’s possible for tracker to see what’s assigned / not. Easy to see overall project view. Find ones without assignee. I think in some cases people would stay owner for issue through at least 2-3 steps of it. Case-by-case, figure out whether to unassign somebody. Also, took bunch of issues, put them in tracker. No big changes.
  • CW: I assigned myself to destroy semantics. Think this is great to do better project management. Put this project in more permanent fashion, in the README?
  • DM: OK.
  • CW: looks amazing, thank you.
  • DM: still experimental, going to see if it works out. Looks lively as issues have been populated.

PR burndown

  • #411 - should close.
  • setSubData PR: should skip.
  • #433 - make sure popErrorScope reports error.
    • AE: think it’s fine if it’s operation error. Unless trying to distinguish, errors don’t matter that much. Should be consistent.
    • KN: should we update this PR? Spec should say OperationError. Also it shouldn’t both throw and reject. Could close this and make another one.
    • CW: everyone OK?
    • JG: yes
    • RC: yes
    • CW: ok, great.
  • Uint32Array overload for bind groups
    • AE: have to rename something. Didn’t see the comment.
  • GPUShaderTransform - needs its own discussion
  • #469 - renaming of mental model for vertex input.
    • KN: only one API change - one member became required. Rest is member renames, non-user-facing dictionary renames, and spec text.
    • KN: byteOffset attribute is no longer optional. I think this is important to the memory model.
    • JG: vertex structure member descriptor - structure member is something we talked about. Did we conclude that discussion?
    • KN: we didn’t conclude it. If you don’t like it please comment.
    • JG: I sort of like VertexInputDescriptor, if you invert all of these. It’s talking about what the input to a variable is in the shader, which isn’t necessarily a structure.
    • MM: can we use language that’s common to existing APIs? Most call them attributes.
    • KN: the question becomes: what is an attribute? A place where data is fed into the shader, or the strided data in the buffer? The way I named it here, attribute is the thing in the shader: shader location. Attribute != data layout. (See the illustration after this list.)
    • MM: right.
    • JG: I think this is OK. I like vertex attribute better than vertex structure member.
    • KN: I think with the text that explains what that means it’s fine.
    • JG: I think it’s imp’t because they’re not really structure members.
    • CW: they can alias, too.
    • JG: in GLSL, they’re just inputs.
    • KN: right, they’re not structures in the shader, they’re structures in concept.
    • MM: it’s common to set up an array of float4s, no structures involved.
    • JG: it’s structures conceptually, not in the code, and that distinction I don’t like.
    • KN: I will update the PR.
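
As a reading aid for the #469 naming discussion above, a single interleaved vertex buffer feeding two shader inputs might be described roughly as follows. The field names track the shape being discussed (byteStride, byteOffset, shaderLocation, attributes) and are not final spellings.

```ts
// Illustrative only: one vertex buffer, two attributes. Each attribute ties a
// shader input (shaderLocation) to a byteOffset and format within the strided
// buffer data -- the attribute-vs-data-layout distinction described above.
const vertexInput = {
  vertexBuffers: [{
    byteStride: 5 * 4,        // float3 position + float2 uv, tightly packed
    stepMode: 'vertex',
    attributes: [
      { shaderLocation: 0, byteOffset: 0,     format: 'float3' }, // position
      { shaderLocation: 1, byteOffset: 3 * 4, format: 'float2' }, // uv
    ],
  }],
};
```

No structure type appears in the shader or in this descriptor; the “structure” exists only conceptually, in how the bytes are laid out in the buffer.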

Agenda for next meeting

  • MM: what about the PR getting rid of vertex fetch?
    • CW: #411. Think we should close it.
    • JG: I sort of like it.
    • KN: was up to DM, he still liked it.
    • JG: discuss it next week?
    • MM: if we’re going to discuss it I can make a Metal example.
    • KN: if you have feedback from hardware people that would be helpful.
    • MM: I’ll do my best.
  • CW: #469 + #411 vertex input stuff.
  • KN: buffer views?
    • MM: should do it sooner rather than later. Need someone to champion it.
    • CW: I can do that. Discussed at length with DM, Markus back when we talked about buffer mapping initially.
    • KN there’s also an issue with a lot of discussion on it.
    • CW: I can make an explainer. Don’t want to write spec for it yet.
  • CW: setSubData and friends? Or rename it immediate data upload?
  • MM: I’d like to prod JG to make progress on multi-queue.
    • JG: noted.
  • DM: should we revive the push constants discussion?
    • JG: I will be looking at that as an AI for immediate data uploads.
    • MM: excited to see what you come up with.
  • CW: there are a lot of AIs here, not sure how many will be done by Thursday.
  • JG: could we get an AI from Myles to pick 1-2 of his investigations and promise to look at them next week?
    • MM: image blocks. Contentious, but hard to argue with 40% perf improvement.
    • JG: programmable blending?
    • MM: I have some ideas, could bring them up during the call.