Skip to content

Minutes 2022 03 23

Corentin Wallez edited this page Mar 29, 2022 · 1 revision

GPU Web 2022-03-23

Chair: CW!

Scribe: KR/AE

Location: Google Meet

Tentative agenda

  • Administrivia
    • PING and TAG reviews
  • CTS Update
  • “Tacit Resolution” Queue
  • Concerns around error label and message #2633
  • Are duplicate timestamp locations allowed on the same pass? #2662
  • Streaming implementations and indirect draws/dispatches #2189
  • GPURenderBundleEncoderDescriptor needs maxCommandCount if it is to be implementable on MTLIndirectCommandBuffer #2612
  • (stretch) Make rg11b10ufloat RENDER_ATTACHMENT:yes and resolve:yes. #2659
  • Agenda for next meeting

Attendance

  • Google
    • Austin Eng
    • Brandon Jones
    • Corentin Wallez
    • Gregg Tavares
    • Ken Russell
  • Mozilla
    • Kelsey Gilbert
  • Unity
    • Brendan Duncan
  • Mehmet Oguz Derin
  • Michael Shannon

Administrivia

  • PING and TAG reviews
  • CW: Still need to go through horizontal review. CW push them on this, and they said “we’ll [re-review] it soon”
  • CW: Can we start the PING review now that we have an explainer for adapter identifier?
  • BJ: haven't gotten much farther. PR is up against the spec now. Waiting for Myles' feedback.
  • CW: let's try to get to resolution so we can start PING review. If they give feedback might take time to address it.
  • BJ: agree, we should ping the PING, and they can ask questions about existing PRs rather than waiting for them to get into the spec. Also, anyone else's input is welcome.
  • CW: next week should we say "good to go"?
  • BJ: think we could file for PING review anyway, now. Main points unsure about: minor algorithmic things. Should this Promise reject? Allow through, and return null data? Won't change the PING's output much. Don't think we need to wait for approval.
  • CW: are you volunteering to reach out to the PING?
  • BJ: I guess so. :) Will look into it.
  • CW: other horiz reviews:
    • Security. Requires splitting privacy and security sections
    • BJ: already done.
    • CW: OK. I can take care of that.
  • CW: Internationalization.
    • Now that Unicode's in the WGSL spec, think ready to ask for review.
    • I can take this one, or - Oguz - would you be interested in pushing this review out? #2546
    • OD: OK, I'll look into it.
    • CW: let us know if you want to take it.

CTS Update

  • Google-internal CTS hackathon completed!
  • Ton of contributions.
    • 26 PRs Merged (and a few still to land)
    • +5139,-2275 LOC
    • 11 committers and counting
    • ...and a more complete CTS for all!
  • Thanks Kai and Shannon Woods for organizing this!
  • CW: please try running this on your CQ. Please also contribute to the CTS! Still a lot of stuff to do.

“Tacit Resolution” Queue

  • Clarify meaning of dispatch() args. #2669
    • BJ: lot of folks have run into this - most recently Gregg. Most people think the argument is the number of invocations, but it's the number of workgroups. Did Twitter poll, Anything with the name "workgroup" in it helped clarify understanding.
    • Think simple change here would help. Break from Vulkan/D3D, but similar to Metal's name. Not unprecedented among native APIs.
    • CW: one more data point: I wrote the impl of Dawn's dispatch() and at least twice I forgot it takes workgroups.
    • BJ: I messed this up myself too. It's easy to mess up and not realize you did it. Can get the right result. You'll hit the end of the array, and keep writing into that last index. That's what Gregg was seeing. Did exactly this in my Metaballs demo - and was doing 8x the work I should have been.
    • CW: if you disagree with this change, go onto the issue & disagree there.

Concerns around error label and message #2633

  • CW: Marco Caceres (?) found fingerprinting vectors in spec
    • App gives object labels, so no fingerprinting concerns there
    • Discussion around reconsidering passing various messages to the objects; output in console instead
      • Could fingerprint version of browser, locale, language of user
    • KG: Not killer portability concern; not something developers run into in practice; it's exactly what WebGL does. Returns different error messages, we haven't run into this problem there. Could choose to do something better for fingerprinting, but UAs are free to sand the rough edges off the browser.
    • CW: agree, can detect browser version in a bunch of different ways. Can explain why we dismiss this suggestion.
    • CW: USVString undefined vs. null?
    • KG: most people think undefined's what it should be, but it's easier to write "?" in the spec.
    • CW: how about - say we don't want to fix points (1) and (2), and open an issue for (3).
    • KG: does undefined coerce to null?
    • CW: if no disagreement, we'll do this and move on.

Make rg11b10ufloat RENDER_ATTACHMENT:yes and resolve:yes. #2659

  • MS: this would make my life easier.
  • CW: optional feature?
  • KG: demoting Adreno 500 to a second tier? Know we'd have to polyfill it.
  • CW: we haven't defined a WebGPU/Compat here. If we had it, that'd be one where Compat wouldn't have that capability.
  • KG: having a V1 release of the spec not trivially implementable on Adreno 500. Or, not have it be a real subset - knowing V1 release would need nontrivial polyfills on there…
  • KG: idea is it's implementable eventually on 500.
  • CW: is anyone going to implement that?
  • KG: this is the thing we were talking about doing for RG11B10F.
  • CW: understood. The complexity of the workaround seems high.
  • KG: is it too high? Not just
  • KR: Stephen white has been working on a GLES backend. There’s a significant number of features that don’t match up and cannot be implemented in their entirety in GLES. There will definitely be aspects of core V1 that cannot be implemented. That’s why it’s important to define this compatibility mode.
  • KG: I’m not saying to polyfill to whole spec. THere’s now three levels:
      1. V1 understanding Adreno 500 cannot handle all of them
      1. Contentious, reduced core portability mode
    • 1.5: Not trivially implementable, but satisfactory path toward V1 which would take longer than we want.
  • CW: how do you see this being implemented?
  • KG: discussed that in previous meetings.
  • CW: use temp render target? Use compute shaders to pack? This format doesn't support storage. Only way to write to it - copyDestination. Take texture, compress it to, say, float16. Compute shader takes that, packs data in RG11B10ufloat buffer, copy buffer to final texture.
  • KG: yes. We do some things like this in WebGL.
  • KR: on the CPU side.
  • KG: FF uses some intermediate textures. Blitting between linear/nonlinear sRGB targets for example. Tolerable performance. Formats are useful. Let people use the formats. Places where it's only tolerable go away over time.
  • CW: say we do a WebGPU V1 with this. Now let's say we define a Compat mode later. Won't it be natural that this capability be removed, put in Compat, and don't do it for Adreno 500?
  • KG: yes, but you lose out on other things Compat doesn't have.
  • CW: this hardware goes away over time.
  • KG: understood. Also - more likely that we take this in V1 and polyfill it, instead of group taking a core proposal. Think we can polyfill this.
  • CW: I think we can polyfill this, but don't think we will, unless WebGPU Compat doesn't fly. I don't think we'll bother handling this specific format.
  • KG: think the perf bar's lower than you think it is.
  • CW: maybe, but we have a lot of things to do. Really don't think Chromium or Dawn will handle it, and just punt non-supporting hardware to Compat instead.
  • KG: maybe that's what you want to do. I'm pessimistic about WebGPU Core being a cross-browser thing. Can either leave this, say "not now", make authors sad - or we take it.
  • CW: or we add an optional feature. 3 choices.
  • CW: clear to us that we won't do the workaround if there's a compat mode.
  • KG: Not going to fight hard about this. I think it’s the right choice to have it in core, but if you want it as a feature; that’s ok too. I just fear it gets into the format soup a little bit.
  • MS: not sure about other folks that want it - we'd be fine getting it down the road when we can count on Adreno 500 not being a major platform any more.
  • KG: willing to wait 2 years?
  • MS: Desktop is where it saves the most (on memory), but that’s also where it doesn’t matter as much.
  • CW: reason I'm not pushing harder for optional features - browsers might say, I don't like this web page, not going to give it lots of optional features for some fingerprinting reason.
  • KG: I’ll write an optional feature - and we can make it core if we want to.
  • CW: sounds good thanks. Will check the numbers for these devices in the field, and how many are being sold today.
  • KG: I remember it being 20% of the Android market.
  • CW: I it’s 20% and it’s going away, maybe it’s fine. If it’s 20% and still sold today, that’s less good.

Consider putting most of the writeTexture validation on the device timeline. #2680

  • CW: We have validation that linear data to texture copies checks that the data fits in the texture. For writeTexture, the validation is specified as running in the Content timeline, and runs a JS exception. If you pass an ArrayBuffer too small, you get an exception. different from copyBufferToTexture which produces a validation error in the error scope.
  • CW: idea for this - browser wants to optimize amount of data transferred from content -> GPU process. Compute size, etc. to know what to send. Doing everything in content process -> duplicate bunch of logic, data. Also don't have all info there.
  • CW: idea: put validation back in GPU proess.
  • KG: I do feel like it’s weird that writeBuffer and writeTexture have different error kinds. If it’s true they’re different, we should probably make it the same. For the rest: mixed feelings.
  • KG: we do weird things in WebGL. There we can do it later. Original version of WebGL-OOP in Gecko copied the whole ArrayBufferView section whenever you uploaded a texture. Bad idea, added optimization later - guess upper bound of how many bytes we needed to transfer. Eventually, made that the same code.
  • KG: benefit of punting to GPU - you know how much data to copy from ArrayBuffer - browser's free to do better or worse. Then mandatory validation happens in GPU process.
  • KG: So, weak yes.
  • CW: What you described is exactly the process we’re going through in Chromium. We started optimizing by computing an upper bound, etc. Then found the spec required a synchronous error.
  • CW: computing tight / exact upper bound in content process is fairly simple. Just a switch statement for block size of various formats. If we can do this, that would be great.
  • CW: writeBuffer checks a bunch of small stuff on content timeline - easy to do inline.
  • KG: just keep it, I think.
  • CW: maybe resolve this one offline.

Streaming implementations and indirect draws/dispatches #2189

GPURenderBundleEncoderDescriptor needs maxCommandCount if it is to be implementable on MTLIndirectCommandBuffer #2612

Are duplicate timestamp locations allowed on the same pass? #2662

Agenda for next meeting

  • mapSync on Workers - and possibly on the main thread #2217?
    • On the critical path - let's try to talk about this first, timeboxed.
  • Carry over agenda from this meeting because folks were away for this meeting.
  • Please add topics here.
Clone this wiki locally