Minutes 2021 01 25

GPU Web 2021-01-25

Chair: Corentin

Scribe: Ken

Location: Google Meet

Tentative agenda

Administrivia
Pluralize GPU.requestAdapters #1314
Query API: What is the GPU timestamp unit on Metal? #1325 (really asks whether the timestamps should be converted to ns on the GPU or by the application)
Rename GPUVertexFormat members to match GPUTextureFormat #1322
Agenda for next meeting

Attendance

Apple
- Dean Jackson
- Myles C. Maxfield
Google
- Austin Eng
- Brandon Jones
- Corentin Wallez
- Kai Ninomiya
- Ken Russell
- Shrek Shao
Intel
- Yunchao He
Kings Distributed Systems
- Hamada Gasmallah
Microsoft
- Damyan Pepper
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
Andrew Varga
Michael Shannon
Mehmet Oguz Derin
Timo de Kort

Administrivia

Another VF2F?
- Yes, but (MM:) with two weeks notice
- JG: Plan for 3 days, and cancel the third if we don’t need it?
- CW: OK, let’s see what people say in WGSL meeting
Preparing TAG review
- CW: issue 1321: list of explainers we think the TAG will be interested in. Lot of quirks in WebGPU, lot of aspects that could use explainers.
- If you feel comfortable writing one of these explainers, please assign it to yourself.
- MM: we’re going to have 18 different explainers?
- KN: think it’s easier to write them as separate documents to begin. Might end up being one for most of the stuff, one for buffer mapping, etc.
- MM: are we sure the TAG’s going to go into that much detail?
- KN: don’t envision these going into that much detail. Think they’ll mostly be short. Also, if we look at amount of detail per unit API surface, think it’s a reasonable target. Don’t want them to be long.
- BJ: my experience is TAG is willing to read over the whole explainer. They may not deeply grok the technical terms you refer to, but they pay close attention to the API surface it implies, and how it interacts with the backends. Would expect reasonably detailed review of any material we produce.
- KN: agree. It’s on us to figure out the right level of detail.
- JG: is this to get ahead of any recommendations they have, and why we decided to not do them?
- KN: that’s one thing I want to make sure we cover.
- CW: we can discuss more about grouping, etc. Explainers you think should be grouped together. Discuss on #1321, group them here. Should also have a section in the WebGPU spec, motivation for WebGPU. Not explainer, but additions to the spec.
- KN: privacy one should also be in the spec.
CW: our web standards folks said: when you want to ship a web standard, take what you have, cut it in half, and then ship that. Usually the target is too big. Maybe WebGPU is already too big to ship as is, and we should try to limit the scope.
- JG: think the normal web API things don’t apply to us. We’re writing a HW accel graphics driver for the web. Different from a payments API for example.
- MM: rather than talking about cutting the API in half, let’s talk about “which half”. Need a proposal. Not just “let’s start deleting things”.
- CW: agree. Maybe multithreading. WebGPU will be the first to be doing so many multithreaded things. Also has to work with Wasm. Requires going to Ecma. Not useful if you can only shallow-clone objects across workers. Shared object tables? etc. Like WebGL introduced typed array buffers, WebGPU may be introducing multithreading to JS.
- KN: agree. Think MT is hard. Can solve most app use cases with minimal application of MT. Maybe just transferable buffers/textures? Solves bunch of peoples’ use cases. Start with nothing, then address particular imp’t use cases, then do fully general thing. Pretty confident that we won’t break anything while trying to do that.
- CW: that’s one example of a feature we’d propose to delete from WebGPU V1 to propose both as a standard and implementation.
- MM: think that’s reasonable. Start from use cases, going from there. Legit design process. For this example of cutting MT, the use case we hear that’s most imp’t: not populating buffers / arrays, but recording drawing commands on other threads. If we had just that, it’d probably be enough.
- KN: right, forgot about that case. Think we can reduce down what we need to support in the API for just those two things.
- CW: maybe we should have an agenda item, what we could remove from what we have.
- JG: don’t want to pre-compromise before going into negotiation. They’re not going to just say “this is great”. They’ll have recommendations.
- KN: don’t see this as a compromise for the TAG, but for us.
CW: The big render pipeline descriptor refactor has landed. Can nitpick on it.
- MM: I thought every person said that what we had today was better.
- KN: not the case. One was the old, two were new, and we agreed on one of the new proposals.
- CW: this was 2 weeks ago.
- MM: I’ll go read the minutes.
- KN: we’d like people to look at what’s there now and make sure they’re happy with it. Fairly stable now, don’t have to change it again.
  - https://github.com/gpuweb/gpuweb/pull/1352** **

Pluralize GPU.requestAdapters #1314

KN: instead of returning 1 adapter, returns array of adapters. Doesn’t change anything else. Nothing changed related to requestAdapter options.
KN: rationale: API surface is more expressive. Allows more UA behaviors than before. UAs can make their own decisions about what to expose. API’s flexible enough that differences in behaviors between browsers won’t cause issues. Most hardware only has 1 adapter anyway. Each app has to work in case where there’s only 1 adapter.
MM: worried about: browser A receives request, with criteria, and returns list of all adapters that match the list. Say N adapters. Website is developed using that browser. JS that the developer writes iterates over the list, assumes it can pick “best” one. Then that code’s run on a browser that doesn’t return list of all matching devices. Suddenly the wrong thing happens. I understand that the new proposal adds flexibility. Worried about: don’t want web developers to write code assuming they can get a list of many / most / all devices on the system and pull out the one they want. Want them to make a request of the browser, get a response back, and use that directly.
KN: can you be more specific about how it’ll break?
JG: example: app expects to be able to iterate through list and find low-power GPU by asking for “give me the adapters you have”. Then run on browser that only returns best match. Since you didn’t make a good request we give you a random adapter. App doesn’t do what you want anymore. Code only works in case where browser gives you all adapters.
MM: philosophically pluralizing this function isn’t a problem by itself. TO make it plural, it’d have to take N requests and return N adapters. Instead of calling the function in a loop, you’d just pass all requests in at the same time. A middle ground.
KN: I want to claim that this PR already addresses that concern. It says, the value of powerPreference may influence the order of adapters chosen, but can’t remove GPUs from the returned list. It can’t filter based on that.
JG: solves the problem by saying that a browser can’t not give you all of the devices.
KN: right, all of the devices it could give you. No matter how many times you call requestAdapters, it should return the same set - unless for example you unplug a GPU. If it wants to limit a page to the integrated GPU it should do so all the time. Or see what the powerPreference was for the first call, and only return GPUs matching that. (I don’t think that’s a good idea, just saying it’s possible.)
CW: so - browser decides list of adapters it wants to return to the page. Upon requestAdapters, can only order the result.
KN: right. powerPreference changes the order. If we had “allowSoftware” or similar, that could change whether item’s in the list.
CW: how does it solve the browser interop issue? WebCompat issue?
KN: think it solves Jeff’s example. Premise: we’re not concerned about code written on multi-GPU system not working on single-GPU system. If you test on hardware similar to 95% of the internet, onus is on you for that. Hard to write code that does the wrong thing. For Jeff’s case: app doesn’t use powerPreference, wants to look at whole list of adapters & use their own heuristics - we don’t let that be a problem because we say the set won’t change. powerPreference only affects the order. Even if you don’t pass in powerPreference, it doesn’t matter that you passed in the wrong powerPreference.
MM: seems non-forward looking. Surely we’ll add filtering options?
KN: yes - we have filtering options - think that we only add options that add things to the list, and not things that remove them.
CW: compatibleWithXrDevice seems it’d remove things from the list?
DM: or being able to present in general.
CW: or compute-only.
KN: everything should be able to present. Xr compat would be a filter option. How would the app implement that themselves? If they try to do it themselves - don’t think it’s possible with Xr compat.
RC: for Xr compat, seems we should have API on XR side saying which to use.
KN: yes, think so.
RC: for filtering - will have analogous to “failIfMajorPerformanceCaveat” - that one can filter as well.
KN: yes.
RC: that even can filter out all adapters down to 0 available.
CW: sounds like on one side, we want to expose all adapters, based on context, for dev flexibility - but want this to not break when the browser changes its behaviors. Any other way to bridge this?
CW: why not have both? requestAdapter, as today - leave it. requestAdapters - your page needs to be in a higher state of trust. Secure context + add’l trust. Unless you jump through many hoops, your page has to support requestAdapter. If you jump through many hoops you could use requestAdapters.
MM: sounds not terrible. If going in this direction, imp’t for us to define the hoops. Impl can’t just define 0 hoops.
CW: permission API for example.
JG: think we should have this in our minds. Too delicate APIs, don’t want to expose to drive-by or ads, put them on a list. Apps that want much closer to native parity in functionality, even if it’s a little dangerous. There are sites users do trust that much.
MM: willing to entertain the idea. How about: keep what we have - singular option - and come up with secure context notion, and the hoops you’d have to jump through.
JG: I put up a SecureContext PR.
KN: it’s good - but there should be more hoops.
KN: thoughts:
- 1. if we have both, maybe requestAdapter - give it info so it can select for you, but requestAdapters would take less info. Remove powerPreference, never add filtering options, only add options that put more things in the list. Prevents issues like Jeff’s example. For anything where you want filtering, put metadata on the adapter.
- 1. maybe if we have both requestAdapter & plural, maybe we don’t need singular, but just navigator.gpu.requestDevice. Requires a lot more selection. Need list of features & limits. OK with UA selecting adapter based on that list? And the app being stuck with initial features / limits?
MM: kind of like that idea. Like the idea of taking the set of adapters exposed to the page - intermixing creation of that set with requests that the site makes. “This is the most imp’t, please give me this set of criteria”. Next set of criteria - next imp’t. “You have a budget of 2 devices - that’s all you get”. Then give no more new stuff. Intermixing requests with creation of set of devices sounds powerful.
BJ: value of flipping pattern? Pages will always need to be able to handle standard requestAdapters. Maybe it can take an adapter ID? Not a serial number, not persistent page to page. Have to query every time. But, without that ID - picks based off criteria. Otherwise, have function that enumerates them (without giving you adapters themselves), gives you more restricted info. Not full data dump an Adapter’d give you. Then you could budget out requesting 2 separate adapters, for example. Can’t get full fingerprintable data dump in that case. Maybe not worth it, should give them the Adapter anyway? Not sure.
MM: are you saying we’d have 2 sets of features, one to select the device, and then tell me more about what I’ve got?
BJ: a little bit. Is that two-tier system of information beneficial to anybody? Could give descriptors of the adapter. id = blah (randomized for session), and these properties. Then you pass that to requestAdapter. Enumeration function could take in some filters. Basically just an informational dictionary / interface. Only really makes sense if we feel there’s a good reason to parcel out the bits of information / feature set the people need to look at to decide what they want. Not sure if it’s useful.
MM: think we have a design like that already in adapter / device distinction.
BJ: you’re probably right.
CW: think Brandon’s proposal has two-tier fingerprint distinction. :D First, get very general, high-performance, can-present type information. Then, decide if you want to use this one. Now you get the limits/extensions. Finer grained fingerprinting in second step.
MM: worth considering. Coming up with what’s in the small & big sets is hard. Authors will complain. easy for this group to slide back into one big set.
KN: agree with that. Re: global requestDevice: could have something where “I require these features / limits”, “I would like these features / limits” - we could pick adapter / device based on that info. If you need more than what’s expressed by that, could use pluralized version.
JG: don’t think anyone’s made preferential chooser API that works.
MM: when third-party devs write code iterating the list, behavior of that code is “prefer these things, prefer not these things, prefer this order”.
JG: when you combine preferences like “I want RGB but only with depth/stencil”, hard to write.
CW: also have worries - “I want these things but don’t require them” - will work on your dev machine, and on machine B the things you want are required, and your app breaks. Think it’s imp’t to have strict opt-in. Too much of portability concern.
CW: good progress on this discussion. Let’s move on.

Query API: What is the GPU timestamp unit on Metal? #1325 (really asks whether the timestamps should be converted to ns on the GPU or by the application)

CW: Metal introduced timestamp queries recently, and there’s no specific unit to them. You’re supposed to put timestamps in places in your buffer. At beginning / end of buffer, get CPU times (command buffer gives you these at start/end), GPU times, and compute linear regression to get your timestamps. This is on non-Apple GPUs.
- MM: you’re supposed to write that code everywhere. It happens that on Apple GPUs the number is nanoseconds always.
- MM: also: when you associate CPU and GPU timestamps, they don’t have to bookend your query. Just supposed to do this periodically during lifetime of application.
- KN: is that because your precision won’t be that good? Or because frequencies of 2 chips are changing w.r.t each other?
- MM: the latter, to my understanding.
- KN: that’s rough.
- MM: yes, that’s why you can’t just do this at startup.
- KN: that’s rough, and I’m surprised the frequency doesn’t change faster than that.
- MM: yes. Provide the facade that the numbers we expose to the developer are in a well-known unit. In Metal, browser will add these query timestamp calls throughout lifetime of app, and do little compute shader thing to convert the values. Running the compute shader thing has to be a hard requirement - have to add quantization / rounding for timing attacks. Has to always be possible to run the compute shader thing.
CW: we discussed this being a developer-onyl API. Don’t need quanization?
DM: if you start to quantize as badly as other APIs on the web, won’t be useful at all. Just use executionTime property on cmdbuf itself.
MM: if options are quantize, or not have this API, that’s an OK choice and we can choose between them. I’d prefer to quantize but could also agree to remove it.
CW: do you mean not have this API in the spec, or in WebKit?
MM: in the spec.
CW: so quantization should be defined in the spec?
MM: no. Spec says it gives you a number with a unit, presumably ns, and impl can add noise to that if it thinks it should.
CW: how about: Hao implemented the compute shader, it’s an existence proof that it can translate the values. How about, when this extension’s enabled, device says “this is how you convert between this unit and ns” - can be 1.0. When browser has to do intervention / linear regression, could be 1, but on other platforms could be 83.3 for example.
MM: info that’s exposed, would be on device?
CW: or adapter.
MM: and a single JS Number?
CW: yes
MM: and devs would consult that number periodically thru lifetime of app?
CW: no. Constant for adapter. Multiply result they get from resolve every time. Suggesting we ignore the variable clock rates we can see from Metal, browser papers over it with the compute shader. So for Metal the constant should be 1. But on other systems that have the constant defined by the APIs we target, could just return that. Don’t think the browser should offer this to drive-by pages - “highly privileged”. OK if there’s an add’l compute shader on Metal and not the other APis.
MM: an app wanting to know whether it’s hitting its frame budget should be using other timestamps?
CW: yes, looking at execution time of buffer.
KN: or requestAnimationFrame performance.
CW: when we discussed timestamp queries - it’s high-precision timer, we do a lot of stuff to not expose fine-grained timers. To get GPU exec time, get from command buffer - coarser grained, can be quantized easily.
MM: if this is a dev API why are we talking about quantization?
CW: maybe we can not have quantization on queries.
MM: why discussing dev-only API? Part fo DevTools, not standardized.
CW: so people can use DevTools extension in a portable way. Flame graphs in their app in a portable way. Check box in Firefox devtools which enables that. Similar for Safari, Chrome. Were concerns about DevTools being observable though.
Discussion about in-engine profiling, built using timestamp queries. Would have to enable in browsers’ DevTools, but would be portable.
MM: think we discussed this months ago. Similar answer to then: don’t think we can excise dev features from main API. Natural to come up with DevTools API inside web inspector. For this object, give me an associated object that has all the details. Not part of the main API. Then open DevTools, browser starts gathering associated objects. Such a design doesn’t need to change web-facing WebGPU API. That’s how it works for 2D canvas for example.
KN: so would writeTimestamp be in core, but would be 0 unless DevTools enabled?
MM: why wouldn’t DevTools just add those?
KN: app-specific where you put them.
JG: don’t think it’s viable to auto-add. Commands looking like this, etc., would gather timestamps.
MM: would do it with breakpoints. The “action” of the breakpoint would be to start a timing query
JG: not excited about that personally, but see how it could be exciting.
DM: something implied - if we have compute shader, doing lot of precision timings becomes queued up on what GPU’s doing, switching between compute / transfer / graphics - it’s a concern for converting these values. Should go back & think about dev-only part of spec in a separate doc.
CW: there’s an issue in Needs-Discussion - should we have dev-only features? Can put this at end of next week’s agenda.

Agenda for next meeting

Rename GPUVertexFormat members to match GPUTextureFormat #1322
Require [SecureContext] for using WebGPU. #1363
Developer-only features?
Half of the meeting for WGSL?
- RC: can see.
- JG: we overflowed this meeting…
- KN: we should discuss the Needs-Discuss stuff here.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Minutes 2021 01 25

GPU Web 2021-01-25

Tentative agenda

Attendance

Administrivia

Pluralize GPU.requestAdapters #1314

Query API: What is the GPU timestamp unit on Metal? #1325 (really asks whether the timestamps should be converted to ns on the GPU or by the application)

Agenda for next meeting

Clone this wiki locally