GPU Web 2022 03 02

GPU Web 2022-03-02 (APAC)

Chair: KG

Scribe: KN/KG

Location: Google Meet

Previous meeting: Agenda / Minutes for GPU Web meeting 2022-02-23

Tentative agenda

(Intel) Expose optional texture capabilities in WebGPU #2630
Discuss the WebGPU Adapter Identifier explainer before reaching out to the PING.
Limit for the maximum buffer size #1371
[Process] Extensions directory in gpuweb/gpuweb #2301
[Process] Requirements for additional functionality #2424
Streaming implementations and indirect draws/dispatches #2189
Cannot upload/download to UMA storage buffer without an unnecessary copy and unnecessary memory use #2388

Attendance

WIP, it is the list of all the people invited to the meeting. In bold the people that have been seen in the meeting:

Apple
- Dean Jackson
- Myles C. Maxfield
- Robin Morisset
Google
- Austin Eng
- Ben Clayton
- Brandon Jones
- Corentin Wallez
- David Neto
- Gregg Tavares
- Kai Ninomiya
- Ken Russell
- Loko Kung
- Shannon Woods
- Shrek Shao
- Ryan Harrisson
Intel
- Brandon Jones
- Bryan Bernhart
- Hao Li
- Jiajia Qin
- Jiawei Shao
- Jie Chen
- Shaobo Yan
- Yang Gu
- Yunchao He
- Zhaoming Jiang
Kings Distributed Systems
- Daniel Desjardins
- Hamada Gasmallah
- Wes Garland
Microsoft
- Jesse Natalie
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jim Blandy
- Kelsey Gilbert
Unity
- Brendan Duncan
- Dave Shreiner
Andrew Varga
Connor Fitzgerald
Eduardo H.P. Souza
Francois Daoust
Francois Guthmann
Jeremy Sachs
Joshua Groves
Matijs Toonen
Mehmet Oguz Derin
Michael Shannon
Nathan Johnson
Pelle Johnsen
Rick Battagline
Rob Conde
Timo de Kort

Administrivia

CTS Update

KN: Good amount of work going on. RH and AE especially doing great work, as well as contributions from Intel.
MM: Ran the CTS for the first time, working on integration.
KN: There’s a lot to talk about WRT integration. Might want to have a live chat about it.
KG: I’m also interested.
MM: Should we have this in public?
KN: Yeah, but don’t want to bother people who aren’t interested.
MM: I can start a mailing list thread, and people can ignore if they want.
KN: There’s enough to talk about that we probably want to have a meeting at some point. fwiw.

“Tacit Resolution” Queue

Implicit pipeline layouts #2550 #2470
- Editors (now just Kai/Brandon) don’t feel it’s worth the effort to remove GPUPipelineLayout object. Would still like to introduce partially-automatic pipeline layouts, but don’t feel it’s necessary for V1. Want to make one breaking change: make automatic layout explicit (“auto”) instead of implicit (optional dictionary member).
- MM: Does GPUPipelineLayout correspond to anything in D3D?
- AE: It sort of does to the root signature, but only imperfectly.
- MM: But there’s other stuff in the RS, yeah? (yes, thus imperfectly)
- MM: We probably shouldn’t have this bake step if it’s generally unused by native APIs.
- AE: We do use it, but I don’t know off the top of my head why/for what.
- BJ: It’s not that we think we should keep it, but rather we don’t think it’s worth changing this late in the process, and I think it’s a fairly minimal wart.

Expose optional texture capabilities in WebGPU #2630

Intel
JS: Request comes from Chinese WebGPU developers who tell us one of the biggest differences is in optional texture format capabilities, for example rgba32float multisampling, useful for HDR rendering pipelines. In their WebGL engine, it works because WebGL has a query API to check for support - and it does on all desktop GPUs. But currently all texture format capabilities are hardcoded, and we can’t support [in core] because of old iOS [and Vulkan?] devices.
JS: We can expose an extension (feature) for this, but wonder if there’s a better way, like request in the device descriptor, just like for higher resource limits. And/or provide a query API to determine the capabilities on a particular device.
KG: Because we support rgba16f support for this, I think I’d want to hear more compelling reasons for desire for rgba32f, if someone wants it in v1 instead of post-v1.
KN: I think there’s two aspects here, both probably post-v1. One is e.g. rgba32f specifically, and the other is how to generally expose per-format capabilities. For now, I think we’re better served by having very narrow Features per format. Even if this becomes a soup, as long as we move past this eventually, I think that’s pragmatic for now.
MM: Not just old iOS, but old devices with new iOS. (Old iOS is not a problem because those devices can’t get WebGPU at all.) Fact that this is a device-specific capability naturally indicates that this should be an extension. Group hasn’t decided on a way to handle a bunch of small extensions, but agree with Kai it’s OK to have them for now. Would like to scoop them all up into a feature level later on. Could do even better and not ship the small extensions, but do the feature-packs from the beginning, but don’t want to force people who want one feature to wait years for everything else in the pack to get settled.
JS: OK with us, will communicate with those developers and find out how urgently they want an extension for this. Just revisited the [Metal?] table and saw that rgba32float is supported on [many devices].
MM: rgba16float is also very useful for HDR rendering and is available in core. Almost every usecase should be able to use it (and it should be significantly faster because the data is half as big).

Discuss the WebGPU Adapter Identifier explainer before reaching out to the PING.

MM: Had some extra internal discussions and sort of came to the conclusion that we don’t think this discussion is going to be very fruitful. Given the positions/discussions we’ve had in the past we don’t think discussion is going to change anyone’s mind. Looking increasingly likely that there will be no WG resolution about what information will be exposed in different browsers. Under that assumption, this is left up to the UA, and the specific form of this becomes a lot less contentious. Given that, we feel that the direction this group is moving is probably acceptable as long as we can not expose information on our UA since we don’t think we’ll all make the same decisions on privacy.
MM: On this proposal: Would like each piece of info to be requestable individually. Also not sure how we would actually implement this - get a string from the driver, how would we get the device info exposed in this proposal?
KG: Similar to some of the technical feedback I had also. We do some work to determine what device more specifically, but in WebGL often we do things by string matching and try to do OK. Reading the language here, it wants me to try harder than I’d like to to extract things like the architecture. Would like to make the best effort but not have to do it perfectly.
MM: Can be a way for the browser to say “I’m not sure”.
KG: Going to have to either lie and pretend we’re more confident than we are, or say “I’m not sure” a lot. In practice likely to do the first.
MM: Don’t think the solution is to go back to a string, because of the feedback Rafael has given in the past. But think that the UA should be able to intentionally decline to provide any piece of info in the struct. Optional/null or some other mechanism.
BJ: Excellent feedback, thank you for having these discussions internally, and getting to a point where we can move forward with a general direction. As to the fields in this proposal, it is aspirational. On the Chrome team’s part, we’d like to take a stab at it and iterate and bring back some feedback based on that. Definitely loud and clear from the beginning of this design that everything should be up to the UA, and have no intention of breaking that. Exact formulation in flux as far as I’m concerned.
BJ: Re: requesting items individually: Don’t mind that pattern, happy to take a stab at it. Thing that tripped me up in the past was I wanted to ensure there is a definitive breakdown for developers for whether a consent prompt will be shown or not. Felt awkward to do in something that queries line by line, but happy to take another look at it. Would be especially nice to move forward with idea that we can append to the structure over time as we find it possible to provide more information to developers. Again thank you for having those discussions and bringing that feedback, very appreciated.
MM: Technical way to get at what you want is probably for the developers to list the exact pieces of information that they want. Similar to location, etc.
BJ: Related to navigator.getHighEntropyValues, but I need to refresh myself on that. Reasonable way to move forward.
MS: Would it be possible to require the vendor? Everything else we can kinda work around. But knowing the vendor is something we’d like to always be able to rely on.
MM: Absolutely not.
BJ: Actually something I wanted to ask. Myles: My assumption was on Apple GPU devices the vendor would probably come back as just “Apple”. Is simply reporting that seen as something Apple doesn’t want to do?
MM: Gets at something I was saying, for example if on an Intel device don’t want to have to lie and say the vendor is “Apple”.
BJ: Get desire to not lie. On mobile side expect people will fall back on traditional fingerprinting libraries to guess, iPhone 6 or whatever based on screen resolution.
KG: Under certain definitions of the word “lying”, Firefox goes out of its way to lie, e.g. reporting a similar but different GPU from the same vendor/architecture. It’s lying to the answer of “what GPU am I” but it’s honest to the answer of “what GPU should you treat me as”. Still, get bug reports, would like a better approach. In the past, people have exposed things like “2060 or similar” instead of “Titan”.
BJ: Think that’s a valid pattern, think the hope here is we can pick categories of information we can return that work - the intent behind the “architecture”. Obviously there are challenges to doing so. But I would hope we can get to a point where anything the UA sees fit to fill in it can do so as accurately as possible. But also cases, especially in privacy-enhanced mode, there may be cases where it is seen as more privacy-preserving to simply state a generic bucket.
MM: If the question is “how should the website treat the device” (Brandon used the word bucket), and these are behavioral differences like threadgroup size, buckets could be based on behavior rather than vendor. For us, would be way way better.
MS: Would half solve our problems. Would not completely solve it because we have to run different shader code on Metal, at least currently.
KG: Speaks to one of the goals I’m trying to satisfy. Myles, I think that’s a great idea for answering known questions, but it doesn’t help with zero-day solving unknown questions. If there are bugs we don’t know about, one of the serious values of giving them more info than we know they need, is that they can respond to it directly. Handling things we don’t realize we are going to need to handle.
BJ: For me, to reiterate, the exact keys we want to expose here are very much in flux. Think there’s value in exposing info like what Myles was just talking about. At this point, iterate and see what’s most reasonable to expose on the architectures we’re working on. And continually bring it up with the group as we work on it.
KN: Still taking something to the PING?
BJ: Still room to make some iteration, but then should be at a point where we are comfortable doing that.

Limit for the maximum buffer size #1371

KG: Interesting new info here
KN: Seems to be an undocumented
MM: The team basically said to me “well you asked for a limit and we gave you a limit, soo?”
KN: Looking at the API, it looks like this is something that can change across MTLDevices, so I don’t know if we can choose one value here?
MM: MTLDevices are static globals. So if you ask for a device twice, you get the same pointer.
(KN/MM banter about that)
KG: Easy question to ask: Do we have any consensus to add a limit here? Probably at the stage where we can open a PR.
RC: Is the limit assumed to go down as you allocate more memory?
MM: No. Number doesn’t change throughout the lifetime of a process. May be different when you wake up in the morning.
KN: Seems difficult to work with, but I guess native apps manage.
MM: Won’t differ hugely.
KG: Worry people are going to assume it’s aligned on a MiB boundary or something.
MM: Did this research to understand fingerprinting. If there’s a wide spread, difficult to bucket, we should make it an error. If there’s a small spread, indicates we should have a limit.
KG: To me buckets still work if there’s a larger spread.
MM: Question is how much you’re wasting. If you have too few buckets then you’re making developers lives significantly harder by exposing numbers that are too small.
KG: Is that making life harder, or better to … ask pipeline …
KG: Two classes in my mind: out of memory, and Just Too Big. Signal different things to app and weird to put them in the same channel. Given something people probably want to know a priori, rather than trying to allocate to find out how big of a resource they can use… depends on structure of app.
MM: Something to consider is how likely people will run into this.
MS: Think it’s very likely.
KN: I do think we need to make a distinction between out of memory and hit-a-limit. I want OOM to signal that an app should try to free up memory and could conceivably succeed if they try again after freeing.
KN: I don’t think it’s better to bury the limit at allocation time, instead of being up-front about it, even if bucketed/wasteful. I don’t think we’re going to waste too much allocation size.
KN: I also don’t see this as particularly different than e.g. max uniform size.
KG: Point I’m at is I think it would be better if we gave this to users and think it’s generally worse to not have a limit up front. But not going to fight about it. [Prefer limit, think error is tolerable.]
MM: Think I agree with Kai, response for OOM vs just-too-big is very different (free up memory vs split up buffers).
KG: Will put in a specific proposal to add a maxBufferSize limit, for people to thumbs-up or thumbs-down.

[Process] Extensions directory in gpuweb/gpuweb #2301

[Process] Requirements for additional functionality #2424

Streaming implementations and indirect draws/dispatches #2189

Cannot upload/download to UMA storage buffer without an unnecessary copy and unnecessary memory use #2388

Agenda for next meeting

Next meeting: Wednesday March 9 noon-1pm (America/Los_Angeles)
[Process] Extensions directory in gpuweb/gpuweb #2301
[Process] Requirements for additional functionality #2424
Streaming implementations and indirect draws/dispatches #2189
Cannot upload/download to UMA storage buffer without an unnecessary copy and unnecessary memory use #2388

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GPU Web 2022 03 02

GPU Web 2022-03-02 (APAC)

Tentative agenda

Attendance

Administrivia

CTS Update

“Tacit Resolution” Queue

Expose optional texture capabilities in WebGPU #2630

Discuss the WebGPU Adapter Identifier explainer before reaching out to the PING.

Limit for the maximum buffer size #1371

[Process] Extensions directory in gpuweb/gpuweb #2301

[Process] Requirements for additional functionality #2424

Streaming implementations and indirect draws/dispatches #2189

Cannot upload/download to UMA storage buffer without an unnecessary copy and unnecessary memory use #2388

Agenda for next meeting

Clone this wiki locally