GPU Web 2020 04 20

François Daoust edited this page Dec 1, 2020 · 1 revision

Chair: Corentin / Dean

Scribe: Ken / Austin

Location: Google Meet

Tentative agenda

  • Binding to zero-sized range of a buffer #686
  • Pipeline statistics queries #691
  • Copying of depth24plus formats #652
  • Add minimumBufferSize to BGL entry #678
  • Agenda for next meeting

Attendance

  • Apple
    • Dean Jackson
    • Justin Fan
  • Google
    • Austin Eng
    • Brandon Jones
    • Corentin Wallez
    • Kai Ninomiya
    • Ken Russell
    • Ryan Harrison
  • Microsoft
    • Damyan Pepper
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
  • Matijs Toonen
  • Mehmet Oguz Derin
  • Timo de Kort

Administrative items

  • CW: Please re-review the WG charter. This is not the final review, but to get everyone on the same page. The final version will be reviewed by literally everybody at the W3C.
  • DJ: Jeff Gilbert, you should at least get David Baron involved.
  • JG: will ping him again.

Binding to zero-sized range of a buffer #686

  • DM: technically an independent issue. Binding a zero-sized range of a buffer is different. A buffer is just an array of things on the GPU. Code may want to work with this array even if it's empty.
  • CW: something that would be convenient is that buffers are as big as the constant sized part of structs that are in the interface, plus one element of unsized arrays. Then you can just clamp indices. If we have zero-sized bindings we can't do that.
  • DM: do we have a strong reason why clamping rather than zeroing is preferred?
  • CW: imagine a big array of structures. Lights, fairly big structures. Or particles. You get a pointer to one of these particle structures, then you'll dereference multiple times. Ideally, clamp only the time you create the pointer. No extra ops on every pointer access, just on pointer creation.
  • DM: we support pointers in WGSL?
  • CW: I thought you could take pointers to inside of structures. Variables are essentially pointers in WGSL.
  • JG: they are but you can't do arithmetic on them.
  • CW: you say "var myParticle = particles[i];" and particle won't be a full copy of the struct. A pointer to one of the particles in the array.
  • DM: why doesn't the user do "const var = ...;"?
  • JG: not sure those details are necessary here. The idea: we don't have arithmetic pointers, but we do have logical addressing mode style pointers, so we don't do the copies implied.
  • CW: a bit like GLSL. When you reference a structure that's in the interface, as long as you don't write to it you assume you move around the pointer.
  • DM: if we have the pointers I understand why you do it. But you still have to handle out-of-bounds access on the array. Doesn't have to do with zero-sized bindings.
  • CW: it does depending on what we want to guarantee. To make robustness cheaper, want to guarantee at least 1 element of unsized arrays. Then you can clamp once. You get the pointer and know it's a valid pointer. Every further memory op on that pointer doesn't have to produce any guards.
  • JG: think you can do that with zero-setting. Just need more constant space.
  • CW: I think in SPIR-V to do that you need pointer variables or pointer conditional expression, which we don't have in logical addressing mode.
  • DM: you're saying, if we have min buffer size, and ..., then this is a non-issue.
  • CW: then we cannot allow zero-sized bindings.
  • DM: I agree. So are we abandoning that idea and jumping on min buffer size?
  • CW: it's all interlinked. Assume we don't have min buffer size. How do we enforce robustness in general? Draw-call validation assuring buffers are big enough?
  • DM: we can mark that issue as blocked on min-buffer-size, and go back and close it. Resolve out-of-bounds first, then come back and see whether we need to support zero-sized bindings.
  • CW: sounds good.
  • JG: future-proof would be to forbid zero-sized bindings. Would be a breaking change if we took them away.
  • CW: yes. There will be validation for robust buffer access somewhere. This validation will prevent you by construction from having 0-sized bindings. Not disallowed per se, but when you start using them, some other part of the validation will complain, at least.
  • CW: How do people feel about blocking this on more discussions of OOB and discuss other issues?
  • RC: SGTM
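
CW's clamp-once argument above can be sketched roughly as follows. This is a CPU-side TypeScript illustration only, not WGSL and not what any implementation emits; `Particle`, `clampedIndex`, and `updateParticle` are hypothetical names. It assumes the binding holds at least one element, which is the guarantee a zero-sized binding would break.

```typescript
// Hypothetical record standing in for a struct in a runtime-sized array.
interface Particle {
  position: number;
  velocity: number;
}

// Clamp an index into [0, length - 1]. Only meaningful when length >= 1,
// which is why a zero-sized binding defeats this scheme.
function clampedIndex(index: number, length: number): number {
  if (length < 1) {
    throw new RangeError("clamping needs at least one element");
  }
  return Math.min(Math.max(index, 0), length - 1);
}

// One guard when the "pointer" is formed; later field accesses need none.
function updateParticle(particles: Particle[], index: number, dt: number): Particle {
  const p = particles[clampedIndex(index, particles.length)];
  p.position += p.velocity * dt; // no per-access bounds check here
  return p;
}

const particles: Particle[] = [
  { position: 0, velocity: 1 },
  { position: 10, velocity: 2 },
];
// Index 1700 is out of range and clamps to the last element.
const touched = updateParticle(particles, 1700, 0.5);
```

With an empty array there is no valid element to clamp to, so the sketch throws; a shader cannot do that, which is the reason the discussion ties zero-sized bindings to the choice of out-of-bounds strategy.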

Add minimumBufferSize to BGL entry #678

  • DJ: Myles told me he has some hesitation on this one.
  • CW: I think Myles was saying that it's something people don't provide on any API right now, so it adds friction.
  • DJ: he's worried it'll be unnecessarily harder on developers.
  • DM: alternative is being hard on our runtime CPU performance. We have precedent for adding things to BGL that aren't in native APIs.
  • JG: I think the concern is that this is something most content simply doesn’t know. You may have an engine that doesn’t know what its minimum buffer size is. Would require more refactor to figure this out at submit time. And that’s a tradeoff. In WebGL we have a ton of “draw-call-time” checks, but we can cache a bunch of them. I wonder how much our prototyping of the API without minimumBufferSize would advise how tractable it is to get good performance without requiring it.
  • CW: we could probably prototype it over the next month or so, in Dawn, without minBufferSize.
  • JG: Does Dawn have minBufferSize now?
  • CW: no. We could try implementing it in a smart way and see how much overhead it adds. Idea: set bind group, have bunch of numbers, and we can do SSE compares of arrays of numbers. Maybe it's cheap.
  • RC: part of prototype would also be, how much perf is saved by not doing bounds checking.
  • JG: if you could skip bounds checking, assuming well formed content, how much would you save?
  • CW: you can't skip bounds checking. Uniform buffer, has matrix, sent as array of 16 floats, and you try to access 17th float - or 1700th. Still have to clamp in the shader. Don't think we can skip bounds checking even if we do this validation.
  • JG: part of idea was this is optional. Providing it allows us to omit some of these checks. Not providing it - part of the PR. Would it be acceptable for content to be able to skip some of this, in the effort to get content running?
  • CW: so it's a hint, and if you don't set it it still works?
  • JG: still state-of-the-art. We're trying to do better than WebGL in this area.
  • CW: But then you get draw call validation
  • JG: I guess it depends on where you're doing the clamping, but yes.
  • DM: trying to think of a case, can't imagine one, where engine developer would not know this. Shaders have structures, used in buffers. They know what buffers contain what structures. If it's not their shaders then they have to ask the user for this data. Maybe we can ask the developers, but from my ISV experience I don't know when they wouldn't know this.
  • DP: Some drivers on PC do these optimizations automatically. At runtime will profile the type of resources they get bound with and then switch and use a different flavor of shader. In this case if they can determine statistically that every buffer bound is >= 1000 bytes, then they can produce a version that does no bounds checking.
  • KR: so they're dynamically recompiling pipelines?
  • DP: shaders.
  • JG: makes sense - another JITting pass.
  • DM: sounds contrary to low-level APIs.
  • DP: Does, but the wins are so good, so it’s hard not to do. PC games industry is driven by a relatively small number of big titles. So it’s good to do it.
  • DM: if D3D12 had that minimum buffer size you'd have less code and more performant apps.
  • DP: different for D3D12. We don't add in bounds checking like we're talking about for WebGPU. More about constants being folded into the shader. Can determine some code paths won't be executed. I think this will be activated after a while if you toss this at the drivers. (If the app pushes the right number of frames through the driver)
  • JG: would be nice if we knew up front and they could volunteer this sort of information.
  • DP: The other way is a sort of strict mode. The shader can report stuff back and say don’t give me a buffer smaller than X.
  • JG: "require buffers larger than". ~3 proposals here.
  • DP: this proposal is to do more validation at bind time, to reject buffers that are too small?
  • DM: bind group / pipeline creation time. The point is to avoid late runtime checks.
  • KN: folds into BGL-pipeline layout compatibility checks we already have to do.
  • DP: think I might have used the wrong term.
  • CW: JG you mentioned 3 proposals: there’s minBufferSize, there’s draw-time validation. What’s the 3rd?
  • JG: implicit to every proposal is the default-null proposal. IMO 3 paths forward. Keep being implicit like now; add this as a requirement at BG creation time, saves some runtime submission checks; or, what if this were just a hint. Could provide it, but also say "min buffer size is 0, 1, 4...". Could be a reasonable default. Apps that don't have this info would have the unhinted version, or maybe smart drivers don't need this information, but we don't know up front.
  • CW: My question is: with the no-op proposal, how do we ensure there’s no OOB in the buffer? What I’m trying to get at is a proposal of what happens if the buffer is too small. Either the draw call is no-oped or there’s an error.
  • JG: In my mental model we've chosen to inherit some lax version of robust buffer accesses. We have a solution for what happens when these are out-of-bounds. Forbidding these accesses is one way. I thought we already had spec'd behavior for OOB accesses.
  • CW: we do, just not sure how to implement it on HW that doesn't have robust buffer access.
  • DP: was the proposal not to add bounds checking to each access?
  • JG: that's what some platforms do. Like, texel fetch inside shader. Some APIs give access to image size you're sampling from, so you can bounds check image fetch which doesn't have robust buffer access behavior in some native APIs.
  • CW: seems we're a bit stalled on the discussion. Need to think about it more offline it seems. Also, part of min-buffer-size discussion has to be discussion of how we do robust buffer access. How we'd implement it on HW that doesn't have it.
  • DM: think that's in the pull request. Maybe in condensed form. What do we do on this/that access?
  • CW: in particular, need to figure out if we can do robust buffer access when the buffer's too small.
  • DM: so stuff outside the proposal.
  • CW: yes.
  • DM: worth adding bigger picture here. Relying on robust buffer access is tricky because it works within bounds of buffer. Can be anything in the buffer. Don't know what is returned when you go out-of-bounds. Portability problem. This min buffer size says that you have the same behavior everywhere; no bounds check needed.
  • RC: suppose you pass something in matching the min buffer size and then in the shader you access outside the buffer. Don't we still need robust buffer access?
  • DM: we just need index clamping.
  • RC: this PR then is for things within a structure. Still robust buffer access with/without this.
  • DM: for accessing arrays specifically, yes.
  • DP: dev needs to know that size. Increasing size of struct in shader and not updating code, last field will have bounds checking.
  • DM: pipeline creation will fail. The first thing when you change the shader. Pipeline knows about PL layout, which knows about BGL. PL will match that number with whatever the shader expects.
  • DP: so shader compiler can calculate this size?
  • DM: reflect it, yes.
  • CW: shader compiler computes this size. PL creation computes min size shader needs. Get min size in layout, then see whether it's >= what the shader reflected it needed.
  • DP: that strikes me as something the user doesn't need to specify. Users can opt in to saying, this particular buffer needs to have the size check applied to it. Thinking about number of places needing to be updated to apply this change.
  • DM: you'd change all creation of BindGroupLayouts. Like ..., texture format, which aren't in the native APIs.
  • JG: I worry it's at a place where ... it's a design air-gap. Places in the API which deal with shaders tend to have more distance than API usage, esp. in game content. Think it's potentially a cool idea to infer the min buffer size. If you did (of non-dynamic-indexed parts of binding), made that a requirement, think you get most of the benefit of this without requiring the dev / app to spec anything new. Implicitly saying min buffer size == reflected min static length.
  • DM: so you'd fail setBindGroup call?
  • JG: whatever the behavior is in this API. Parts of this proposal - you get for free if min size is at least the static size (or that +1 dynamic index) from the shader. As long as there's a default, not a lot of concern. Inherits the static size.
  • CW: checked at draw time.
  • DM: I think that’s not consistent with the API. The reason we added textureComponentType is so we can fail early.
  • JG: what I'm describing is doable at pipeline creation time.
  • DP: DM in your proposal is there no check at bind time?
  • DM: at BG creation time only.
  • JG: There has to be a check at binding time. Otherwise you can bind too small a group.
  • CW: at draw time one thing you need to check is that layout of BGs matches what PL requires. We've been adding lot of things to the layout so that we get it for free when we bind. This adds min buffer size to that set of requirements.
  • DP: this is at draw call time?
  • CW: at draw call time have 4 pointers that have to be checked.
  • JG: that's not per draw call.
  • CW: you can do dirty bits, lazy, etc.
  • DP: If I make a BGL that says 64 bytes, and I make a 2-byte buffer, when is the error reported to me?
  • JG: when you create the BG based on the binding and the buffer.
  • DP: I'd say that's bind time.
  • JG: when we're saying bind time a lot of us think about setBindGroups call. Same amt of overhead as draw call time.
  • DP: Okay, so creating the bind group is when you associate the layout with the buffer itself. DM is that where you're saying the check happens?
  • DM: yes. Also wanted to say: you don't have to compare contents of BGLs. We give you a unique ID for BGL, just compare the ID. The check is still the same.
  • JG: because we dedup upon creation.
  • DM: yes, so no add'l checks when adding min buffer size to BGL.
  • JG: What I like about this proposal is that it does it at that time. I wish that it had a sane default, basically.
  • CW: minBufferSize is a thing, but it's opt-in. Don't set it, it's 0xffffff. Draw time, for each binding that doesn't have minBufferSize, there's draw-time validation that buffer is bigger than what pipeline needs.
  • DM: not consistent with other fields in BGL. We check ahead of time so we don't fail later. Complication: when loading the shader, you don't know anything about what BGL is. You'd have to generate 2 shaders, one doing checking for all fields and one which doesn't. Becomes tricky.
  • JG: Yeah, this is one of the reasons why everyone ends up making “compile” a no-op, and link or use-time where you actually synthesize the shader. This would be another case where instead of generating the shader at ShaderModule creation time, you’d defer that till Pipeline creation.
  • CW: we already do that on at least 2 backends.
  • DM: We don’t want to do that. With this proposal, you are requiring
  • JG: My feedback is that we can have that without requiring users to jump through the hoop of knowing how big their structs are. We can say you must always have some minimumBufferSize. I don’t see how to get the thing you want at compileShader time either, without.. part of this proposal is that you will almost never have to do bounds checking of the struct component, only the dynamic indexed part. But that can be completely implicit and done by the implementation. The impl knows the minimumBufferSize and can do that internally. Big fan of never doing bounds checking on the struct content, but asking users to specify the size: I’m not comfortable with.
  • DM: I know what you're saying. Downside is there can be surprises. User will feed any kinds of buffers and they'll have to learn about the restrictions and obey them, just more annoyingly.
  • DP: I think there's value in training people to make their buffers big enough. If you were relying on robust buffer access to provide correct behavior, you're heading for confusion later anyway. I don't think 0 works as a default assuming it should fail if you pass in a buffer smaller than required by static compilation. MAX_INT doesn't work either.
  • JG: 0 is not the worst thing to use as a default sentinel for something that can’t be zero.
  • CW: in JS, whether you set it or not is a thing.
  • JG: we can do Maybes. If you don't set it it has this behavior.
  • KN: we do that in a bunch of places.
  • DP: I think that's what we have to do in this.
  • DM: I think it needs to be required for buffers. That's what this PR is about.
  • DP: "there's a magic number" that the system calculates based on your shader and you have to pass in that magic number when creating BGL. True?
  • DM: Remember the discussion about field offsets in WGSL? The user needs to know where the fields are and what size the struct is in most cases. They typically create the structs on the CPU and pass them to the GPU.
  • DP: I understand. Offsets into structs is something we need to do to make compiler agree with structs they fill in. This min buffer size is a hoop we're making them jump through unnecessarily.
  • CW: It’s a hoop yes, and it saves us CPU overhead.
  • DP: how does it save CPU overhead? We can calculate this automatically and not require developers to type it in.
  • CW: you pass in pipeline on one side requiring 64 bytes. On other side, create bind group, don't know it'll be used with this pipeline. Only point where you know they need to match is at setBindGroup time.
  • JG: I think we agree with that. I’m trying to draw attention to a part A and part B. I think we all agree up to some point, but we’re talking about whether or not the full proposal is what we want.
  • CW: you're saying there's a spectrum between not having min buffer size, and doing validation every time, or: you can opt-in to minBufferSize.
  • JG: No, I’m not proposing that. Me and Damyan are proposing: because at shader compile time, we know the minimumBufferSize, when we do the check that the BGL matches the shader at pipeline creation time… since we already know what the number has to be, we just infer that it’s equal to whatever we already have at shader module creation time.
  • CW: how do you make sure bind groups created with the layout have buffers big enough?
  • JG: OK, I see.
  • CW: I agree it's a nice middle ground ... still need to make sure we don't have out-of-bounds. We have magic thing, getBindGroupLayout, could say that the min size is whatever's needed by your shader.
  • JG: the ergonomics between that and a default value are really far apart.
  • CW: getBindGroupLayout is synchronous.
  • JG: not in our impl. If you make it sync then you have to have that info client-side.
  • KN: getBGL gives you an opaque pointer. Just gives you BGL which contains the minBufferSize.
  • JG: OK.
  • CW: to make progress on this..
  • JG: need to think about it more.
  • CW: ...need homework so we make progress over the week. We should write down the spectrum of options, and what assumptions we have in terms of how shader does out-of-bounds checks. Once we have the spectrum we can talk about it more.
  • JG: we need to both know what the proposal is, and what happens if we don't do the proposal. Some lack of clarity there.
  • DP: our spec currently has text about out-of-bounds accesses in shaders.
  • JG: thought so.
  • RC: spec doesn't have getBGL. Is that a proposal?
  • CW: we discussed it, agreed on it, but it's in a poor quality PR right now.
  • JG: I think we didn't agree to it but agreed to look at it in a PR.
  • CW: it's #543 and I'll find the minutes.
  • RC: it's nice to say, here's a thing, find me a BGL so I don't have to match it myself as a web developer.
  • CW: sorry for not pushing this CL through.
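
The validation points debated above can be laid out side by side in a small sketch. Everything here is hypothetical TypeScript written for illustration, not the WebGPU API and not the text of #678: `LayoutEntry`, `BufferBinding`, `Reflected`, and the `check*` and `inferLayout` functions are assumed names, and the per-binding sizes reflected from the shader are assumed to be available. An unset `minBufferSize` models the optional variant, where the size check moves to draw time.

```typescript
// Hypothetical types for illustration; these are not the WebGPU API.
interface LayoutEntry {
  binding: number;
  minBufferSize?: number; // undefined models the "not set" case
}
interface BufferBinding {
  binding: number;
  size: number; // size in bytes of the bound range
}
// Minimum size per binding, as reflected from the shader (assumed available).
type Reflected = Record<number, number>;

// Pipeline creation: a declared minBufferSize must cover what the shader needs.
function checkPipeline(layout: LayoutEntry[], reflected: Reflected): string[] {
  const errors: string[] = [];
  for (const entry of layout) {
    const needed = reflected[entry.binding] ?? 0;
    if (entry.minBufferSize !== undefined && entry.minBufferSize < needed) {
      errors.push(`binding ${entry.binding}: layout declares ${entry.minBufferSize}, shader needs ${needed}`);
    }
  }
  return errors;
}

// Bind group creation: every bound range must reach the declared minimum.
function checkBindGroup(layout: LayoutEntry[], buffers: BufferBinding[]): string[] {
  const errors: string[] = [];
  for (const entry of layout) {
    const bound = buffers.find((b) => b.binding === entry.binding);
    if (bound === undefined) {
      errors.push(`binding ${entry.binding}: no buffer bound`);
    } else if (entry.minBufferSize !== undefined && bound.size < entry.minBufferSize) {
      errors.push(`binding ${entry.binding}: buffer has ${bound.size} bytes, layout requires ${entry.minBufferSize}`);
    }
  }
  return errors;
}

// Draw time: only entries that left minBufferSize unset still need a size check.
function checkDraw(layout: LayoutEntry[], buffers: BufferBinding[], reflected: Reflected): string[] {
  const errors: string[] = [];
  for (const entry of layout) {
    if (entry.minBufferSize !== undefined) continue; // already validated earlier
    const needed = reflected[entry.binding] ?? 0;
    const bound = buffers.find((b) => b.binding === entry.binding);
    if (bound === undefined || bound.size < needed) {
      errors.push(`binding ${entry.binding}: shader needs ${needed} bytes`);
    }
  }
  return errors;
}

// Inferred variant: derive the layout, including sizes, from the shader itself.
function inferLayout(reflected: Reflected): LayoutEntry[] {
  return Object.keys(reflected).map((key) => ({
    binding: Number(key),
    minBufferSize: reflected[Number(key)],
  }));
}

// DP's example: the layout says 64 bytes and the buffer has 2.
const reflected: Reflected = { 0: 64 };
const declared: LayoutEntry[] = [{ binding: 0, minBufferSize: 64 }];
const tiny: BufferBinding[] = [{ binding: 0, size: 2 }];
const bindGroupErrors = checkBindGroup(declared, tiny); // one error, reported at bind group creation
```

In this sketch, a layout with the size declared never reaches the draw-time check, while a layout that omits it pays for the comparison on every draw (or needs dirty-bit caching). `inferLayout` corresponds loosely to the idea of taking the size from the shader, for instance through something like getBindGroupLayout, so the developer does not type the number in.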

Agenda for next meeting

  • CW: Pipeline statistics query. Probably reviewable offline.
    • JG: think so.
  • More of the minBufferSize / OOB stuff
  • Copying of depth24plus formats
  • If more time: mapAsync PR, writeBuffer?
    • CW: put up mapAsync PR, please take a look.
  • https://github.com/orgs/gpuweb/projects/1#column-6543625
    • KN: we set up the project tracker with "needs-discussion" column. Should be source of PR burndown. Can find issues needing discussion during the meeting here.
    • CW: great, so PRs not needing discussion can go through without being discussed at meeting.
    • KN: and issues that aren't PRs we won't miss.
    • JG: can we have a process for what non-controversial-consensus looks like?
    • CW: yes, Greg Roth wrote up a doc.
    • JG: when people don't comment on it and don't bring it up at a meeting, things stall. If we had a deadline for immediate decision...
    • CW: here there are PRs that will go through, spec editors will say this one needs to be looked at in a meeting. Smaller ones can just go through.