GPU Web 2022 07 06

GPU Web 2022-07-06

Chair: KG

Scribe: KR / KG / KN / others

Location: Google Meet

Tentative agenda

Administrivia
CTS Update
“Tacit Resolution” Queue
APAC-member-prioritized topics not already on the agenda? #2766
Criteria for pushing the WebGPU and WGSL specification to Candidate Recommendation #3107
Specify that texture_external sampling functions clamp coords to [0, 1] #2766
Should objects have pointers back to their parents? #1497
Agenda for next meeting

Attendance

WIP, it is the list of all the people invited to the meeting. In bold the people that have been seen in the meeting:

Apple
- Dan Glastonbury
- Myles C. Maxfield
Google
- Austin Eng
- Ben Clayton
- Brandon Jones
- Corentin Wallez
- Dan Sinclair
- David Neto
- Gregg Tavares
- Kai Ninomiya
- Ken Russell
- Loko Kung
- Shannon Woods
- Shrek Shao
- Stephen White
- Ryan Harrison
Intel
- Brandon Jones
- Bryan Bernhart
- Hao Li
- Jiajia Qin
- Jiawei Shao
- Jie Chen
- Shaobo Yan
- Yang Gu
- Yunchao He
- Zhaoming Jiang
Microsoft
- Jesse Natalie
- Rafael Cintron
Mozilla
- Jim Blandy
- Kelsey Gilbert
Snap
- Corvyn Kusuma
- Dhritiman Sagar
- Sumant Hanumante
Unity
- Brendan Duncan
- Dave Shreiner
Connor Fitzgerald
Francois Daoust
Mehmet Oguz Derin
Michael Shannon
Rob Conde
Russell Campbell

Administrivia

CTS Update

KN: Doing end-of-quarter accounting. We have some metrics on what we’ve done. We have one thing that tells us how many todos we have, and one that tells how many tests we have.
- Making steady progress. Not very fast.
- Proportion between #TODOs and #Tests has gone down from 32% -> 21% in the last 6 months.
- It's something. :) Ratio is a heuristic, but it's a number we want to be 0. Reasonably good progress.
- Q1: 32->26%, Q2 26% -> 21%.
- 470 -> 590 tests. Decent. Lot of coverage in the new tests.
- Today - got Myles' email about generating separate HTML files for separate tests. Finally did this. Added build step that generates 1 HTML file for each test. Multiple HTML files per .ts file. Up for review!
- MM: does that plug into the partitioning of one test -> multiple tests? Just curious.
- KN: not yet. Not much effort to add it back in.
- MM: Excited! # test passes will grow by a large factor (1000x?) because createTexture test has 5000 tests.
- KN: this is for the handwritten tests. 5-10 tests per file, so not quite, but still a significant increase.

“Tacit Resolution” Queue

Allow depth textures with cube/array/multisampled types #3070

KG: said, "go ahead" last week.
KN: no editors meeting this week.
KG: mark editorial?
KN: I'll land it and remove the label. Actually, someone should review it.
KG: pretty sure I reviewed it. r+.

APAC-member-prioritized topics not already on the agenda?

SY: #2766 texture external clamping rule.
- KG: Will discuss 2 issues from now, below.
DG: I'm good - no additions.

Criteria for pushing the WebGPU and WGSL specification to Candidate Recommendation #3107

KG: last week we said "think about it". Thoughts? Don't have to make decisions now.
KG: in my mind - we're kind of at CR. Still have things to sort out and issues to fix - but not how strict we want to be about CR. If we are, then not much diff between Candidate Recommendation and Proposed Recommendation. We think, this is more or less a spec. Mostly need to implement, see if it's implementable. Seeing more issues come up that are impl-focused, which is good.
JB: seems WGSL has pretty major issues that don't have Draft in the spec. For example decision about builtin overriding. Would be nice to at least have some language in the spec. That's one thing I'd like to see in the spec before CR.
KG: that's fair - focused on milestone as V1 release. But V1 is sort of the proposal / W3C Recommendation. We have things to work on there - but need to figure out blockers for CR.
JB: do we have a milestone for CR blocker?
KG: will talk with Corentin. Reasonable? Don't want to over-optimize what goes in what milestone. Want to go through CR issues before PR ones.
KN: subjective threshold - not too easy - but also not that important. Good idea to think about this now.
MM: what's Github label meaning of CR / PR / V1 / post-V1? Worried we're adding process.
KG: issue here - W3C does have a process. Right now we're at proposal process. 2 stages between now and recommendation stage. Don't want to accidentally stay at Working Draft phase until we're ready for recommendation. A way to force people to think about whether we should fix this before we go to CR.
KR: The CR doesn’t have to be perfect. Agree with JB on builtin overrides being a large undecided issue, and generally how we move forward as a language. I think CW’s criteria captures this well, and I think that would be a non-editorial issue. Before things are fully polished, we can still say we have implementations, even though we’re not fully portable, but we think this thing is implementable, and let it bake there while we wait for implementations to catch up.
KG: want to think more, see what Corentin thinks. Part of this is project management. We'll figure something out with Github labels - as a group, let's think what we'd like to see before CR. We’ll come back to this next week.
KN: if we take the first criteria - spec's semantic complete - then I think what remains aside from spec being implementable are reasonably tested, etc. - think we have some editorial work before CR, but not a lot. Think we could take these suggested criteria and only worry about CR milestones for editorial stuff. Everything we've resolved upon needs to be written down in the spec. Editorial stuff, ideal to have before CR.

Specify that texture_external sampling functions clamp coords to [0, 1] #2766

SY: read last week's meeting minutes. Input from impl side: noticed we might have 2 passes. 0-copy and 1-copy. Current impl, 1-copy, browser will draw video to RGB texture. One-copy path, we'll have clamping during YUV->RGB conversion. Would like to ensure we define this behavior but don't have difference between 0-copy and 1-copy paths. Fallbacks for example, don't want different texture border behavior.
MM: what you said is generally correct. Might be observable with sRGB textures having gamma be applied after sampling instead of before. If someone really wants to detect 0-copy vs. 1-copy path they probably will be able to; but end users almost surely won't notice differences.
SY: concern is, using same API they'll get 2 different behaviors. User configurable address mode will have the problem. Relying on browser to supply it. More impl work in the browser.
MM: if texture address modes are user-controllable, it's definitely more impl work.
SY: can't rely on browser to have default behavior enforcing …
MM: misunderstanding? 1-copy path, video is copied into RGB texture. That operation can do colorspace conversion, rotations, cropping. Once done, any sampler can be used with it. We agree about that. 0-copy path, for that to work, sample operation has to do colorspace conversion, cropping, etc. Can't just use the naive sampler from the user - have to run a large block of code in the shader. Not that if you use 0-copy vs. 1-copy you'll get a black border around your video - the impl will handle it.
SY: makes sense. My point - 1-copy might crop more information. Clamp pre-sample (?)
KN: not sure I fully understand the concern. All of that code Myles mentioned in the shader will be there unconditionally. At pipeline / shader creation time, don't know if we'll need it. There's a branch between 0-copy and 1-copy paths. Going into builtin code to do the sample operations. Since we have full control over that, think there'll be no problem implementing the same behavior on both branches.
MM: think that's right. Last call, spent time talking about REPEAT addressing mode. Claim was: if you want to filter across repeating texture, has to be possible to filter, where texels are at both right and left edges - they abut each other. No texels in the source image where they abut when you have the black border around the image. What is it there for then, what now?
KG: one way to implement this - always sample twice.
KN: or 4 times, for corners?
KG: maybe.
KR: this is why I'm suggesting that we limit the functionality for the start. Let's start small.
MM: now that i understand more fully - videos rarely tessellate the way geometry does. Rare that each frame of the video will have right->left and top->bottom flowing into each other. Insetting by ½ texel - think we should do the same thing for wrapping. Miss ½ texel, will be fine.
KG: would rather not implement something knowing it will be flawed.
MM: flawed is interesting. If user wants to implement it they can't do better than our polyfill.
KN: the fact the user can do it is a strong argument that we don’t need to implement it.
KG: why are we giving them something with known caveats.
KR: user can do better than this if they know what’s going on. Problem is in order to implement these repeat cases you have to implement this bilinear interpolation yourself, which is known to be costly on GPUs.
- KG: True for repeat but don’t think it’s true for repeat-mirror
KG: State of agreement is(?):
- We should implement clamp
- Implementing repeat correctly with no seams requires extra sampling we don’t want to do.
- Repeat-mirror is implementable without extra sampling
- Question is whether we want to add repeat with seams to the API.
MM: mirrored-repeat definitely possible. 2 reasons why we've spent so long on this topic. One - we think tiled videos are a natural thing, should be part of the web platform. Other for API symmetry - if we offer samplers in one case, offering full flexibility of samplers in the other case eliminates special cases in the API. And there will be seams anyway.
KG: distinction to me - there is a correct way to do it that we know about, and we would knowingly not be doing it. Surfacing a real constraint of the API. If they want to use this thing, they have to make their own choices.
KN: we (Google) want to only clamp, and not give you control over mirror-repeat, etc. Just clamp. Anything else, do it yourself. Kelsey, do you want that? Or that + mirror-repeat?
KG: i'd be willing to compromise and offer mirror-repeat. No strong preference for it, just out of compromise.
KN: and Myles you want to support all of them, with seams.
KR: External textures are an optimization. Think it's reasonable to have constraints on their usage. They can do repeated videos with the API today, with one copy.
KG: Based on arguments so far, I’m happy to do clamp only. Clamp and repeat-mirror fine. Don’t think I’m willing to support repeat-non-mirrored with the knowledge the seam is going to be there. Seems like too much of a footgun to me.
MM: can we compromise on clamp and repeat-mirrored?
KN: I think we would, but we really don't think it's adding value to the API, when it's so easy to implement that yourself in your own shader. Especially given that many use cases will have data where they want more control over it anyway. I can't think of an example when you'd need mirrored-repeat videos. Texturing small resolution textures on a bigger model? Not a use case for external texture. Just put the small texture on the GPU. Hard to imagine use cases needing mirror-repeat. Only use case I can think of for repeat - 360 video and don't want seam at edge. But if we have seams, doesn't solve anything. Hard time thinking of why high-perf 0-copy path needs any of these modes. Seems unnecessary impl work.
KG: acceptable
KR: I don't see why we have to compromise on this. Easy to explain to users - this optimized code path has this limitation.
SY: From implementation side, agree keep it simple
KG: I see implementation complexity as not that much of a concern, though I haven’t implemented it.
MM: once you have clamping, mirrored-repeat is easy.
KG: think it's easy for users to do, and also easy for us to support.
KG: I want us to consider how strongly we feel about this to make forward progress.
KN: Myles why do you feel this is so important?
MM: reasons I stated earlier. Tiled videos should be a first-class citizen. If we compromise it'd be in favor of API symmetry (with non-external textures). If an author says, give me mirrored-repeat and we can impl it with no perf cost, we should give it to them. External textures / non-external ones.
KN: we're not going to do that. That's not one of the options. That requires us to introspect on what's in the sampler . This has to be an option on the external texture itself.
MM: in the issue there's a lot of discussion about how to implement this. Side-channel conveying information to shader; and changing APi surface of ExternalTexture.
KN: we're very strongly against side-channel for this metadata into the shader. It's really hard because of binding model limits.
MM: what about the other metadata? Normalized/denormalized texture coords?
KG: colorspace?
MM: right now when you sample external texture a lot of info comes along.
KN: yes but we need extra memory allocated, attached to external texture, that we fill with data about the uniforms which sample it. don't know about it until the draw call. lot of impl complexity, we are definitely not willing to do that.
MM: sample function that samples external texture should not take a sampler?
KN: not strongly about whether it takes a sampler - but should not have to know what the sampler is.
MM: from expressivity point of view. There's a set of samplers that can exist. What you're saying, only 1 point in that space is acceptable to sample from an external texture.
KN: no, the filtering modes can be used. Address modes can't be. Also, anisotropic filtering can be used. That's valuable.
MM: for filtering modes, depends on whether you want filtering before/after colorspace conversion.
KN: yes, and the info we've gotten from media folks is that difference is small enough that nobody notices.
KG: also not guaranteed by hardware. Sampling sRGB texture doesn't guarantee before/after gamma conversion. Potentially wrong, but in practice, people are OK with it. Use texelFetch for guaranteed results.
MM: for every member of the sampler, would like to see Kai's description of why it should/should not apply, then I'd like to come back to this next week.
KN: it's the filtering stuff, not the address/LOD stuff (no LODs). I can write it down. Post me a question you want specifically.
KG: it seems obvious to you but if you could write down the portions that would apply that would help.
KN: let me make sure it hasn't been stated.

Should objects have pointers back to their parents? #1497

KG: device->adapter, queue->device, view->texture.
KG: closed this because I felt this wasn't implementable - to keep the back-reference and not introduce real weakref behavior we'd have to keep strong refs from children->parents. Feels intolerable to me. Kai reopened it, said it might be valuable. Accurate?
KN: more or less. I didn't think the conclusion was foregone, wanted to discuss.
KN: I agree with the problem you posed. Might be technically possible to avoid it, but not easy. Don't think we should pursue it. So I agree we should close the issue.
JB: why's it difficult? Many of these objects we can't even use if we don't have a pointer to the device.
KN: difficult to guarantee that if you drop the ref to the GPUDevice, but holding on to random thing like BindGroupLayout in your app, the device is reachable. Therefore have to keep context alive. Yes, you should destroy the device, but GC should work better than that.
KG: concretely - uploading texture based on image. If you say HTMLImageElement.webGPUVersion = webGPUTexture; and then forget about it, the img will keep alive GPUTexture, GPUDevice, and we can't GC it.
JB: so we have unexpected forgotten references to API objects. No device, can't use them anyway.
KN: we would have the device.
JB: if don't have the links - we can clean up the device and clean up subsidiary objects as well, and that's the behavior you're trying to preserve. If people want to use expando properties, they can. I see.
KG: FWIW in Firefox WebGL children do not keep their parent alive. Don't know what we do in WebGPU.
BJ: was going to ask that. Currently, in Chrome, whether we have the parent backpointer, we keep device pointer in each resource we create. Valid to say, we should fix this problem and use weak-refs everywhere. We could. Does Firefox have this problem?
KG: I plead the fifth. :)
JB: right now we have a strong-ref to the parent.
BJ: very similar pattern to what we do in Chrome.
JB: if collecting parents is desirable - spec it, try to implement? We'll discover problems for V1.
MM: there are a few cases where we must keep parent pointers alive. Operations on default queue, and from object, can't get to the default queue.
BJ: example of those? createview?
KG: mapBuffer.
MM: mapBuffer's a good one. Discrete GPU, have to copy in/out.
JB: mapAsync?
BJ: yes.
MM: there are 2-3 such operations.
BJ: if behavior of assets not keeping parent alive is desirable - should have spec text saying this. But we can't guarantee that, based on tests?
KN: what prevents us from doing this?
…
KN: you can just keep the shadow copy alive - don't need to keep the device alive. Possible.
KG: if you don't have the means to keep it alive … nice to collect the device if you don't have a reference to it nor a way to use it.
KN: thought about this already because - have another open issue, when device's lost, doesn't interrupt app flow. Want to make mapping always work, even with lost / destroyed device. Give you an ArrayBuffer. Does it have right data in it or not?
MM: tricky. Timing of when Promise gets resolved is based on other work in the context.
KN: yes, tricky. Would like to not constrain the API, and not have strong refs. Weakrefs, maybe. Should punt on them if we consider them at all.
BJ: we should definitely not add API-visible parent refs. Should not mandate that you can't have strong refs. Impls might need them. We should try to operate off more a weakref model.
Come back next week

Agenda for next meeting

Shader stage input-output subsetting rules #2038
- Allow inter-stage variable subsetting #3122
Describe "mapped range ArrayBuffers" in terms of copies #3124
Specify that texture_external sampling functions clamp coords to [0, 1] #2766
Should objects have pointers back to their parents? #1497

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GPU Web 2022 07 06

GPU Web 2022-07-06

Tentative agenda

Attendance

Administrivia

CTS Update

“Tacit Resolution” Queue

Allow depth textures with cube/array/multisampled types #3070

APAC-member-prioritized topics not already on the agenda?

Criteria for pushing the WebGPU and WGSL specification to Candidate Recommendation #3107

Specify that texture_external sampling functions clamp coords to [0, 1] #2766

Should objects have pointers back to their parents? #1497

Agenda for next meeting

Clone this wiki locally