Skip to content

Minutes 2021 07 12

Corentin Wallez edited this page Jul 16, 2021 · 1 revision

Web 2021-07-12

Chair: Jeff Gilbert (guest-chairing for Corentin)

Scribe: Ken / others

Location: Google Meet

Tentative agenda

  • CTS update
  • Remove the read-only texture storage bindings #1794
    • Change read-only texture semantics #1864
    • Allow read-write storage on selected texture formats #1772
  • Should the API enforce that writable storage buffer bindings don't alias each other? #1842
  • Remove the requirement for Vulkan's drawIndirectFirstInstance feature. #1878
  • Expose all interfaces to DedicatedWorker #1883
  • (stretch) Add opaque region or isOpaque hint #1871
  • (stretch) Discuss WHATWG feedback on ‘gpupresent’ context type #131
  • Agenda for next meeting

Attendance

  • Apple
    • Myles C. Maxfield
  • Google
    • Austin Eng
    • Ben Clayton
    • Brandon Jones
    • Kai Ninomiya
    • Ken Russell
    • Shrek Shao
  • Microsoft
    • Damyan Pepper
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
  • Francois Guthmann
  • Michael Shannon

CTS update

  • KN: Have improved amount of CTS work we're doing. Lot of stuff getting done, but many many more things to do and test.
  • KN: Contributions still greatly appreciated.
  • BJ: every time I try to do CTS stuff I have to figure out the format again, but if you do it regularly it becomes much easier. Encourage people to poke at it regularly. Much easier to be productive.
  • KR: any high-level areas people should focus on?
  • KN: end-to-end rendering and compute tests would be both fun to write and extremely useful. So if you want to write something that's a pseudo-demo for the sake of testing, that's something that would be appreciated. Can do it in standalone and then I can help you bring it into the CTS.
  • JG: recommend not being too cool - ensure we're rendering properly, and debug when we're not. More stuff you would put into a demo would make it hard to debug.
  • KN: also some value in having complex tests. Might find issues we don't find when we try to write simpler tests. Could port some demos into tests, if we think it'd be useful.
  • KR: suggesting not really big complex tests, but things like mipmap level selection, single trigonometric function test in a shader, etc.
  • KN: good point. Shading language is still in flux, but can still be done.
  • MM: what if you want to call a function with all possible arguments? Run the first iteration, wait for results, then check them? Or queue up 10,000 invocations and check all of them?
  • KN: if it's just many different data inputs to a function and you want to check the outputs, there's no real value in being able to check the difference between them until you're debugging them. Suggest writing a shader which does all the tests and writes red if something went wrong. We have a pattern we use in one place - do many checks in the shader, return a different value depending on which check failed (accuracy too far off with call with particular arguments, etc.). That way tests run quickly and don't generate too much spew. Different overloads - might be useful to have them be separate. That way you can see which overload went wrong. Totally up to the test author though.
  • JG: don't be too proactive in trying to optimize the test. Primary reason is to figure out what went wrong when something does go wrong. Some hyper-optimized tests only write red/green but do a ton of vertex setup. Hard to figure out where the logic went wrong.
  • KN: definitely true. Main reason to care about speed of test - if you want to do 10ks of iterations, you can put hundreds of tangent calls in one shader, do many variants on it. Starts to matter whether it's fast. Recently had one where I combined several things into a single shader so we didn't have to create lots of pipelines. Couple hundred things in a single shader. Enabled broader testing.
  • MM: how long does CTS take to run on a normal machine?
  • KN: on our automated testing infra, we only run on a single machine, no sharding. ~20 min. We normally don't run all of them. And not running in the standalone harness. Usually if you run all of them right now, one will crash. We run a subset locally, esp. for development. But if making changes to WGSL parser and want to run all the WGSL tests, we don't have a great way to do that right now except running through WPT infrastructure, which is a bit of a pain.
  • BC: haven't started yet, but on my list of things to do. Would still like to improve the web-based harness too. Not much visualization of where it's at, status, etc.
  • KN: have many ideas about that as well - but lot of work to write a new UI.
  • KR: might be able to borrow some code from the WebGL CTS UI - which does have better progress reporting, hierarchical runs - and Kai contributed substantially to that. :)

Remove the read-only texture storage bindings #1794

  • DM: We have a choice: change RO storage to different semantics - only apply to single-channel 32-bit texture formats (3 of them) - or remove it completely. Talked with Corentin, he's fine with either. Thought more about which way is best - found new concern - if we keep RO storage textures with new semantic, users may be confused. They have RO storage buffers, can be combined with other usages - but RO storage textures can't be. Leaning toward just removing them, but open to opinions.
  • MM: what are the new semantics?
  • DM: UnorderedAccessView on D3D. User's POV - same thing, just available on only 3 texture formats. Has to be exclusive usage. Can't be combined with sampled or anything else.
  • MM: ok, seems like a reasonable semantic. Not suggesting a preference of keeping vs. removing.
  • JG: we can make additive changes in the future. We can add them back later. I was opposed to this thrashing earlier on, but now have learned severe restrictions on RO textures, so in light of those I'm less enthusiastic about them. Would hope in prototyping and pre-release, people will complain if it's missing and they needed it.
  • MM: agree with that direction.
  • DM: unless someone wants to defend the other position.
  • (silence)
  • DM: ok. Will remove it.
  • JG: thanks for the investigation that went along with this.

Allow read-write storage on selected texture formats #1772

  • DM: somewhat blocked on understanding which Metal devices would support this. MM said they can't give this to us a priori, so we need to spin up some in-browser telemetry.
  • MM: that's right.
  • JG: given that, still waiting on telemetry, unless we think we can make progress on it without it.
  • MM: if that's the path forward, until we have the telemetry, should say this isn't allowed.
  • DM: today, we can spec the feature as optional, and if telemetry says it can be in core, we can just move it into core.
  • MM: sounds reasonable.
  • JG: maybe good path - do some investigation, and if someone wants to champion it, they can produce spec for the feature. Will mark Needs-Specification.

Should the API enforce that writable storage buffer bindings don't alias each other? #1842

  • JG: requires effectively bind-group-change validation. Maybe not big, but adds per-draw-call overhead. Eliminating that overhead is a primary reason for WebGPU's existence. You have undefined behavior if you allow this, and we would prefer to not have that.
  • MM: what I said last time still applies - still need to know the cost.
  • DM: that's one position. Another one - we already have UB in non-aliased writable buffers, etc. This won't make us any more undefined than we already are. It's impl work to both validate and test it. Understand that we have to measure it. We could define this as UB and move forward. Corentin has very strong feelings about this and was inline with my feelings.
  • MM: this is UB that happens in different set of circumstances than we have today. We'd be increasing the number of situations we have UB. These circumstances are distinct. One: lots of threads interacting with each other. The other's possible with a single thread, no races. From programmer's point of view, first seems more fraught, would expect to have to be more careful. Think it's reasonable to have two different policies.
  • DM: if you have single thread, is it not UB at this point? Thought it's still UB unless you have some errors between writes and reads.
  • MM: it's not UB in Metal. The compiler pessimizes and generates slower code.
  • JG: can we not mark this in Metal?
  • MM: C has the restrict keyword. C++ doesn't have it but some compilers still accept it. Not sure if Metal accepts it.
  • DM: so Metal's unaffected by this decision, because it already makes a decision internally.
  • MM: would be an area for future optimization.
  • JG: there's still a gradient between the attractiveness of the UBs.
  • MM: one's a situation where as a programmer I'd have to expect to be careful; the other, I wouldn't.
  • MM: want to be reasonable - don't want to say this has to be required - if the validation cost is low then we should add it.
  • JG: if the cost is low and the utility exists.
  • MM: correct.
  • DM: do we have anyone from Google who's planning to prototype this and measure the cost?
  • KN: we don't have a plan specifically. We'd love to. Many things we'd like to.
  • JG: think we should park this for the moment? If we want the data, we should pause on this. Right now this is just allowed. We could leave it allowed for now, do prototyping on what the validation would cost. Try that out later, come back to this. Could also see people prototyping with WebGPU.
  • MM: we should be more restrictive and then open things up.
  • KR: this would be the first per-draw-call time validation in WebGPU, can significantly impact scalability of apps. Concerned that we're putting this in without measuring it, spec'ing it, testing it, and then measuring it later.
  • MM: this isn't the first draw call time validation. DM had a few others.
  • JG …
  • MM: what if nobody does the investigation by V1? Then we can never remove it.
  • JG: we can spec, implement, and test. Want to disable validation for trusted code, internal benchmarking. There were cases in WebGL where we had to break content. We did ship breaking changes, which we should try to avoid, but it's not the end of the world. We should be at least 90% sure we don't have to do this. Also, in some way, I think if we don't specify this - one concern is if nobody does the investigation? That's an indication people didn't think it's important. Sort of self-corrects.
  • MS: what if it's undefined whether the impl does the check?
  • JG: we try to avoid that. Might be acceptable if it had undefined ordering. Some amounts of UB that are acceptable. Should try to narrow down exactly what the UB is in this case.
  • DM: along same lines - we don't want users to use this, but don't think we can prioritize this. Should we put something in the spec that they shouldn't do this? File bugs about adding validation? Once first impl has results, they can say, can we fix the spec instead?
  • JG: yes, that's like spec'ing it with a more detailed list of instructions. Spec it, don't do this, even though we don't validate it.
  • MM: I think that's an acceptable compromise.
  • JG: isn't that just specifying it as invalid?
  • MM: yes, but also have every impl not check it for a while.
  • JG: it's spec'ing it assuming we have incomplete impls.
  • KN: not sold on that. Doesn't feel like it matters that much. If no impl checks this, no app will discover this. Also, if it's truly same kind of UB as racing in the shader, I think there's very little value in having validation here.
  • MM: symptoms are different. Say 2 threads, writing "5" and "6" into the same spot. Value you receive might be 42. (Neither one.) But other case, 2 different bindings - one thread, loads twice, you won't get a number that doesn't appear anywhere in the shader. Symptoms aren't as dangerous. Situation's worse for thread races.
  • KN: don't you still have the problem where you store and load, and compiler assumes you get the same value back?
  • MM: that's true in both cases, but it's a little worse in the case of races rather than multiple bindings.
  • KN: you're saying aliasing a resource doesn't add any more UB than we'd have anyway?
  • MM: saying symptoms are different between these two cases. Mindset of a programmer is different in these two cases. Reasonable to have different policies.
  • KN: I understand the mindset issue. Don't agree it warrants validation.
  • MM: I can write a small example that would show this after the call.
  • KN: would probably be useful.
  • DM: haven't we resolved this? Or does Kai not want us to settle this by spec'ing it now?
  • KN: what was Corentin's position?
  • DM: he doesn't want draw call validation on top of what we already have.
  • KN: ok, I agree with that.
  • JG: another path - demonstrate that there's unacceptable overhead by implementing the validation.
  • KN: the validation can get complicated. Bind two different ranges of the same resource. Have to track ranges of resources, gets complicated.
  • MM: not particular about how the validation works. If validation tracking individual byte ranges is too slow, could re-spec the validation.
  • KN: if we write it with false positives and devs have trouble with it, validation will become more complex, or we'll have to take it out.
  • MM: both options are OK. My thesis: we don't need the validation but should start with it in place, and if devs complain then we dial it back.
  • KN: writing the validation knowing we'll have to make it significantly more complex means the measurements we take are less valuable than they would be.
  • MM: i'd say write the validation whatever way you feel like writing it, and look at the results. If acceptable, transcribe your code into the spec.
  • MS: given that this is pretty common practice to subrange larger buffers, not doing it based on ranges will make me angry, anyways.
  • JG: it's just an example of the complexities, but we do sub-range tracking in WebGL, too.
  • DM: generally speaking, subranging buffers in WebGPU isn't the best thing to do because our usage semantics are whole-buffer, not sub-range.
  • KN: need to figure out situation where this is problematic. e.g. two segments of the same buffer, both read-write.
  • DM: would probably want one of them read-only, but can discuss that later.

Remove the requirement for Vulkan's drawIndirectFirstInstance feature. #1878

  • JG: current WebGPU spec requires it, but Android doesn't have widespread support for it.
  • KN: spec doesn't mention this restriction - need to add it. We could require the feature. Unless there's an objection to taking this restriction into WebGPU, we just need to spec it.
  • MM: 40% of Android devices seem important to not lock out.
  • JG: should we have a feature to add this back in?
  • MS: yes, I think it's important. Gets around some multi-draw semantics. We could add multi-draw. Not sure support for multi-draw vs. this feature.
  • KN: agree. As we add that, we should look at the whole matrix of DrawIndirect things when we're trying to spec features for this. DrawIndirect, DrawIndirectCount, etc.
  • MM: one aesthetic: should we remove the argument or force it to be 0? If we remove the argument altogether it'll make it easier.
  • KN: it's not an argument, just one of the words in the indirect buffer. The format of the indirect buffer is dictated by the hardware.
  • MM: so in Vulkan you have this buffer format, and one of the cells in it has to be 0?
  • BJ: it does happen to be the last argument of that buffer, but buffers still have to be large enough to contain it. We could get by by saying that the buffer has to be at least N elements large, and we ignore the last one? Actually, not a good idea. Would have to go in behind the scenes. Best to require it's 0.
  • MS: I'd keep it. For MultiDrawIndirect, the buffer will need the value.
  • BJ: we could do tricks to hide its existence. Would require a lot of fixup. Copying to a new buffer. Don't think we want to deal with that.
  • JG: this will be Needs-Specification then.

Expose all interfaces to DedicatedWorker #1883

  • JG: also, importantly, removing the Serializable attribute from all the interfaces. This would be allowing you to use WebGPU from a worker, but everything has to be contained in that worker.
  • KN: yes. This doesn't add any multithreading stuff - just like what we have today in WebGL where you can use it from a worker. No complexity spec'ing it - some complexity implementing it.
  • MM: as long as we don't feel we're done committing this change - more work needed - then this is a good step.
  • KN: that's definitely the understanding. We have some MT spec work done, but we have to mark things as not working.
  • DM: what about transferring resources?
  • KN: this takes off all of the serializable attributes.
  • KR: there's a [Transferable] WebIDL extended attribute.
  • DM: command encoders can be transferred instead of serializable.
  • KN: we don't need them to be transferable though.
  • JG: there's a TODO here, define MT API, add [Serializable] back. Resolved to take this change?
  • (silence)
  • JG: taken this for now, will think more about Serializable in the future.

(stretch) Add opaque region or isOpaque hint #1871

(stretch) Discuss WHATWG feedback on ‘gpupresent’ context type #131

Agenda for next meeting

Clone this wiki locally