Minutes 2019 06 03

GPU Web 2019-06-03

Chair: Corentin

Scribe: Dean, Austin, (others?)

Location: Google Meet

TL;DR

PR burndown:

Add some more defaults in AttachmentDescriptors #268 or Consolidate loadOp and clear into a unioned field #283
- Need more reviews / opinions
Replace ArrayBuffers with garbage-friendly alternatives #261
- Merged.
Also add debug markers and group to the top level encoder #291
- Needs to update to use mixins, otherwise ok to merge
Remove Bit/Bits from bitmasks #292
- JG will make a PR to discuss the alternative of adding Bit / Flag to all these types
Rename GPUSampler.compareFunction to compare #293
- Merged.
Change GPUColor/Origin3D/Extent3D to sequences #299
- No consensus there but no super strong opinion either.
- JG will make a PR that will make these union types of dict and sequence
createBufferMapped*: don't require MAP_WRITE #300
- Merged.
Add GPURenderBundle #301
- D3D12 team discourages using bundles, concerns that GPURenderBundle won’t be able to benefit from all driver optimizations.
- Needs more discussion.

Tentative agenda

PR Burndown
Agenda for next meeting

Attendance

Apple
- Dean Jackson
- Justin Fan
Google
- Austin Eng
- Corentin Wallez
- Idan Raiter
- Kai Ninomiya
- Shrek Shao
- Ryan Harrisson
Intel
- Yunchao He
Microsoft
- Rafael Cintron
Mozilla
- Dzmitry Malyshau
- Jeff Gilbert
Mehmet Oguz Derin
Timo de Kort

Administrative items

CW: Should be possible to co-host with the Khronos F2F.

CW: You are required to be a Khronos member, so please let us know if you want to attend in person but are not a Khronos member.

CW: Please make PRs for anything you want before the MVP. We seem like in a good place coming out of the F2F, but please add anything new you want to change.

PR Burndown

Add some more defaults in AttachmentDescriptors #268 or Consolidate loadOp and clear into a unioned field #283

KN: Nothing new here.

JG: If no one has anything new, let’s keep pushing it in the issues.

KN: I think we just need more reviews on #283.

Replace ArrayBuffers with garbage-friendly alternatives #261

CW: All of these issues aren’t really contentious, but we need people to comment on them.

CW: Dean, Rafael, please say you have no objections if so.

RC: Will take a look.

CW: Do you think we should try to close it now?

JG: Yeah, I think it is a good forcing function. It’s a simple change.

KN: The most important thing is to avoid cloning the data as you move it in. It needs to be a view into the heap. It could be ArrayBufferSource or Uint32Array.

RC: Why can’t we use Uint8Array?

KN: SPIR-V is always in 32-bit words.

JG: I think we should say ArrayBuffer or a View or DomString

KN: Uint32Array is defined to be 32-bit aligned, so it makes sense for the browser.

JG: Let’s merge it.

RC: Fine with me.

DJ: ok with me too (typed)

Resolution - accept #261.

Proposal for copyImageBitmapToTexture #266

CW: I think we’ve already discussed this, but Kai was going to update something?

KN: Oops. Haven’t looked in a while. Only Jeff has reviewed.

KN: I’ll make changes and then we should be able to merge it.

RC: Did we decide when the copy will happen? Command list time or submit time?

CW: It’s at submit, and a validation error if you detach by the time you need the copy.

CW: OK, Kai will look at the minutes where we discussed and update.

Also add debug markers and group to the top level encoder #291

JG: How will this be used?

CW: You’re rendering to a bunch of cube maps and you want to debug which cube map

CW: Tools like Spector.js

RC: Spector won’t be able to access unless we add getters

DM: Don’t we want these on passes too?

CW: We already have this at the ProgrammablePassEncoder object. Not sure if we can do multiple inheritance in idl. If it’s fine, then I can update the PR with that.

JG: Seems fine to either merge it as is or move it to an interface.

DM: I thought we already had multiple interface inheritance.

KN: Pretty sure idl doesn’t have that because Javascript doesn’t have it. It does have mixins which are not exposed to Javascript.

CW: I’ll update to mixins in merge it. Sounds good?

RC: Sounds good.

KN: JS can’t know if a mixin is used, but it can be used in IDL to fake multiple inheritance.

DM: So Javascript wouldn’t be able to mess with overriding mixins?

KN: It can’t do every instance of the mixin at once, but it can do each user separately.

Remove Bit/Bits from bitmasks #292

JG: Why are we doing this?

CW: Consistency.

JG: I like it when bit flags are clearly bits/flags. Do people think it is redundant?

RC: So you’re suggesting adding bits or flags in other places?

RC: I have a feeling this will all go down in flames when we hit tag review. I don’t think they’ll like bitflags. But if we are, then they should be explicitly named.

JG: OK. I will do the reverse of this pull request.

CW: OK, but do not use “s”.

DM: This might be a module someday.

KN: We can’t put constants in namespaces. But maybe that will change.

KN: Until then we should capitalize it as if it is an interface.

DM: The problem with the singular name is that “all” and “none” are not singular. Native APIs use plural.

CW: OK. I just want consistency.

JG: I’ll file a PR and we can discuss it.

Rename GPUSampler.compareFunction to compare #293

CW: Simple rename for consistency with stencil and depth descriptors.

RC: No strong opinion. The type is already called CompareFunction.

CW: Alternative is to rename the others to compareFunction.

JG: Is “func” or “Fn” out?

CW: Yes

Resolution - accept this PR, remove “Function”.

Change GPUColor/Origin3D/Extent3D to sequences #299

KN: People have already expressed they don’t like it, so maybe we should just drop this one for now. So, I did this because I found it very annoying to write the dictionary literals every time I wanted to create a GPUColor/Origin3D/Extend3D. But I suppose we can have a helper function for these as well. Thought it would be slightly more ergonomic without explicit labels.

CW: One argument for sequences is it helps interop with other Javascript libraries that already store positions as vec3 using an array.

JG: Can we enforce lengths?

KN: Yes, we can spec that, but we can’t webidl it.

DM: I find these semantics to be weaker and the ergonomics is solved by simple helper functions.

RC: I wonder if the TAG will say to just use DOMPoint

(several people don’t want to do this because it is double, not int)

CW: So it seems people don’t want to merge this?

JG: I am tentatively in favor.

RC: We can merge it if you wish.

DJ: I prefer to keep it as it is. It isn’t a huge deal either way though.

JG: Is it more/less palatable to add yet another union type?

DJ: It’s a shame it can’t just accept an array

JG: I think that’s effectively what a union type would let us do.

KN: The other possibility is if we make it an interface with a constructor. Then, there’s the possibility of allowing that, or the Init dictionary for that Interface.

DJ: I was hoping you could do something in-between that allows an init dictionary without exposing a new interface.

KN: Sounds pretty much the same as what we have now.

DM: Does everyone agree that constructing new objects here is not ideal?

KN: Dictionary and array literals are also new objects.

JG: But you can keep the dictionary around.

DM: From WASM, if it’s a dictionary you just make a struct and pass it. Is it different for sequences?

JG: I don’t think it matters. It depends what the compiler wants to do. Even if you have a 3-part array, it may just pass it as a struct. I’d like to see the PR for the union type. Kai would you like to do that?

KN: Would you mind doing it?

JG: Sure.

createBufferMapped*: don't require MAP_WRITE #300

CW: The assumption was that we didn’t require MAP_WRITE, for allowing the minimal number of copies.

JG: This is my fault because I wrote the example code that uses MAP_WRITE. My only concern is that when the user uses createBufferMapped and doesn’t specify MAP_WRITE, it might look to them it’s the same as createBuffer without MAP_WRITE.

CW: I think the intent is that the buffer is as-if it’s created without the MAP_WRITE flag, or there’s staging data, or it happens to be a UMA systems. It’s very non-normative though.

DM: It’s not entirely clear to me how this optional flag (MAP_WRITE) helps you. Kai said you can have mapped buffers that are only mapped initially, and mapped buffers that can be mapped later.

CW: In the case of UMA where you want a GPU-side vertex buffer, CreateBufferMapped can give you a pointer to the buffer directly. On discrete, you can get a pointer to staging data. So this is always the optimal number of copies.

DM: ...

CW: But then you introduce that buffers can have multiple backings at the same time.

DM: On discrete, you CreateBufferMapped and you get something that has multiple backings

CW: No, you get staging only for initialization. This way users can get the optimal # copies without exposing IsUMA().

JG: …

KN: Another idea I had. It’s weird that you can CreateBufferMapped without MAP usage, but then you Unmap something without Map usage? We could have something like CreateBufferWithData. I’d be fine with merging and doing that later.

CW: Anyone against merging?

JG/DM: sounds good.

Add GPURenderBundle #301

CW: This is the RenderBundle thing we discussed in the F2F, with the update that it can only encode certain commands. This is so that it can be limited as bundles on every platform, specifically Metal.
Rafael, you had feedback from D3D about bundles?

RC: Yes, I’ll respond in the PR. But D3D discourages us from using real D3D12 bundles. There are also problems with reusing command lists. It’s meant to be static so some drivers will inline stuff from the command list. Then if you change descriptors and re-use command lists, then there may be some performance cliffs.
I agree if you take everything and make a new command list every time, then the optimizations will be fine.
tldr; D3D team discourages real bundles. And if command lists are too flexible, then we prevent drivers from doing some optimizations.
DM: So if we use bundles, drivers will never promote to root constants?
RC: When you make the root signature, you have to tell D3D how static things will be. You can say the descriptor and the pointee are static, or you can change either, or you can change only what it points to. Changes how the driver optimizes and inlines.
For bundles, they haven’t been implemented very efficiently in drivers and sometimes can be worse.
JG: Really good info. Sad, but really good. Punt on this for now?

CW: I think it’s great feedback, but I think it doesn’t change that we still need GPURenderBundle for multi-threaded recording and reordering.
JG: So not bundles, but more like sub-command lists?
RC: So you want to merge for as few Javascript calls as possible?
CW: That’s the intent for our implementation, yes. But others can be different.

DM: It’s good to still expose these because they’re available in native APIs.

CW: You can record parts of a render pass on multiple threads and then put them into a command buffer in different orders.

RC: I had thought that command lists would be our multi-threaded boundary lines, not bundles.

CW: Both. Difference between D3D12 and WebGPU is that there’s a difference between inside/outside of render pass, so we need two places to allow multithreading.

RC: But no multithreading within a bundle, right?

CW: Yes.

RC: Advantages of using bundles and command lists instead of just command lists?
CW: Say you have a render pass with 10k draw calls, you want to record that render pass in parallel. Is that okay?

RC: That makes sense. I just worry that we’re hiding; this forces us to make everything virtual on the other side, and thereby preventing developers from having access to the native command list constructs by allowing implementations to do too many things behind your back.

CW: What you’re describing is just an implementation choice of Dawn. Bundles should be representable by D3D12 bundles, secondary command buffers, etc.

RC: But if you do use a real D3D12 bundle, then you have to make everything volatile?

CW: Yes, but that’s an implementation detail.

DM: Our bind groups are currently immutable, so it’s not a completely volatile situation.

RC: But if a developer changes a constant buffer then it can’t be optimized. Unless we make buffers immutable as well.

JG: But at least this allows applications to have access to native concepts and the implementation can have some heuristics to decide what to do.

DM: I remember having a flag with bundles that specifies if the bundle is immutable or not.

CW: D3D12 optimization relies on implementation not changing uniform buffers, and that’s hardly something we can enforce or expect.

RC: The only way I can be talked into doing this is if we cannot overcome the Javascript overhead. I do appreciate skipping the Javascript overhead.

CW: Do you have an alternative to do reordering and multi-threaded recording?

JG: The alternative to this is for BabylonJS to do the multithreaded recording in userland and turn the results into WebGPU calls.

CW: But this doesn’t allow multithreaded recording on the GPU process side. It’s nice to be able to have chunks of work that can be done independently.

JG: But is that because we’re not ready to ship yet and don’t have all the performance bugs worked out?

CW: Yes, but apps are going to push for it more and more. I agree it’s very early, but it’s one data point.

JG: I would like to defer bundles for now.

CW: Okay, please comment to discuss this more in an issue or PR.

KN: If a draw call goes out of bounds, we allow it to be clamped, but should we also allow no-op? Useful for compute-shader-based draw call validation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly