Skip to content

Minutes 2019 01 23

Corentin Wallez edited this page Jan 4, 2022 · 1 revision

GPU Web 2019-01-23 San Diego F2F Day 2

Chair: Corentin

Scribe: Ken with some help

Location: Google San Diego, Google Meet

TL;DR

  • Compressed textures #144
    • Each implementation will be required to support at least on compressed texture format likely between BC and ETC
    • Do like WebGL and define how compressed texture formats work in the main spec.
  • Creation of swapchain:
    • Discussion on whether the application should get notified on a change of monitor and how it works on native platforms. The browser could handle the monitor change by losing the application’s device.
    • The size of the swapchain / drawing buffer is controlled by the canvas .width and .height arguments.
    • Discussion on how color spaces and bit depths should be handled for WebGPU.
    • Collaborative work then consensus on the IDL for swapchain creation.
  • Mutable texture formats:
    • Concern that allowing reinterpretation of the texture formats could reduce perf.
    • D3D12 only allwos reinterpreting to formats with the same per-channel bit depth
    • Should reach to IHV and benchmark to know what the perf drop will be.
  • Corentin presented slides about the buffer mapping proposals
    • mapAsync stays close to buffer mapping in native while “uploadBuffer” tries to abstract the notion of staging memory from applications.
    • Consensus to go forward with mapAsync with the assumption that it will be followed by the introduction of the buffer sub range concept.
  • Semantics of resource binding: #167
    • In Metal vertex buffers and argument buffers for the vertex stage fight for the same “register space”.
    • Not an issue because WebGPU can’t require support for more than 4 bindgroups and 16 vertex buffers at the same time.
    • Discussion around matching shaders and pipeline layouts.
  • Error handling:
    • Need to distinguish between adapter and device loss.
    • Need to have synchronous error reporting when the devtools are open.
    • More discussions about faillible object creation and OOM.
  • Discussion on how vertex buffer and attributes should be worded and interface indices worded in the IDL. No consensus before the end of the meeting.

Tentative agenda

  • Shading language
  • Memory model
  • Plans for the CTS
  • Plans for the spec work
  • Multi queue
  • Compressed textures
  • Buffer mapping
    • #147 - mapAsync
    • #154 - commandBuffer.uploadBuffer
    • #156 - buffer sub ranges
  • Creation of the swapchain
  • IDL-athon: see the snapshot tracker
  • Produce an example of a WebGPU program
  • Vertex buffer namespacing #167
  • Review draft WG charter #15
  • Agenda for next meeting

Actual Agenda

  • Compressed Texture DONE
  • Creation of swapchain DONE
  • Mutable texture format #168 DEFERRED
  • Buffer mapping DONE
  • Semantics of resource bindgroups #167 DONE
  • Error handling changes DONE
  • An example of a WebGPU program DONE
  • Different types of indices (input slots, etc.) and how you pass sequences of these. Identified four different types of indices so far. DONE
  • IDL-athon: see the snapshot tracker

Attendance

  • Apple
    • Dean Jackson
    • Justin Fan
    • Myles C. Maxfield
    • Robin Morisset
  • Google
    • Austin Eng
    • Corentin Wallez
    • James Darpinian
    • Kai Ninomiya
    • Ken Russell
  • Intel
    • Bryan Bernhart
    • Jiajia Qin
    • Jiawei Shao
    • Yunchao He
  • Microsoft
    • Chas Boyd
    • Jungkee Song
    • Rafael Cintron
  • Mozilla
    • Dzmitry Malyshau
    • Jeff Gilbert
  • Joshua Groves

Compressed Textures

  • MM: moving the extension we proposed to post-MVP is OK with us.
  • DJ: this leaves us with extensions per compressed texture type.
    • Did we suggest that each implementation should support at least one compressed texture format?
  • CW: #144
  • JG: how do you feel about devices that have PVRTC and not ETC?
    • MM: are we talking about the set of which extensions are required?
    • CW / JG: yes
  • JG: we need at least BC and ETC. The only thing that leaves unsupported are some iOS devices.
    • CW: ETC is supported on basically all iOS devices.
  • MM: this list tells developers which compressed texture formats they have to support. It doesn’t restrict whether we can add optional extensions for e.g. ASTC.
  • JG: I think we can get away with BC and ETC as the required set.
  • CW: technically Vulkan doesn’t mandate support for these, but in practice all hardware supporting ASTC also supports ETC.
  • KN: may not be true forever.
  • CW: we can revisit it if it’s ever true.
  • CB: pulling a texture format out of your chip is hard for compatibility. ASTC has such a large hardware footprint that ETC is essentially free.
  • CW: suggest we support compressed texture formats in the main spec, but don’t define any. So that the extensions for compressed texture formats don’t have to add that support to the spec.
  • KR: sounds good. That’s the way it was done in WebGL.
  • JG: problem with GLES is that it gives constraints on compressed textures, and the extensions all override those constraints.
  • KN: don’t know whether we have to say they’re block-based.
  • CW: block-based is important for transfer between buffers and textures, because you can only transfer whole blocks.
  • JG: possibility of 1x1 blocks.

Discussion about closing investigations rather than leaving them open.

Creation of swapchain

  • CW: What is the “getContext” string argument for WebGPU?
    • What is the object that you get and what does it do?
    • What are the context creation attributes?
    • What happens when the canvas resizes?
    • CW: you can watch for resize events…
  • MM: if you drag the browser window from one monitor to another one on another GPU, what happens?
    • CW: discussed this at the last F2F?
    • MM: if there’s a call where you can tell the SwapChain to resize, should it also take a WebGPUDevice so that you can select the GPU?
    • DM: suggest we lose context.
    • RC: don’t think we should auto-move. Lose context instead.
    • KN: system compositor may be able to handle it.
    • MM: system compositor does handle this on macOS, but it consumes more power.
    • DM: so it’ll use the old GPU?
    • MM: yes.
    • KN: would be nice to have a signal saying “you changed monitors”, and a way to figure out the ideal device for this monitor. I prefer this to losing the context.
    • RC: it’s still iffy. The rest of the web page content has to move with you. 360 video rendered on the “other” GPU will induce expensive copies. Developer has to know what’s happening.
    • JG: even a native app would have to know that it’s using multiple GPUs.
    • RC: when we move monitors we should keep everything on the main one. Should only fire the “lost” event when it’s moved all of its contents to the other device.
    • MM: in Safari, when you drag a window to another monitor, we migrate the resources.
    • CW: it would be unfortunate if all developers had to handle the case of rapidly dragging the window from one screen to another.
    • MM: another option: you don’t move GPUs, and if the browser detects through a heuristic that it would be beneficial for the app to move, it sends context loss.
    • RC: SGTM. Suggest you do that when the contents of the rest of the page moves as well. A handful of images isn’t bad, but many WebGL apps have loading screens for lots of resources. Should happen on the rare occasion, not the common case.
    • CW: it’s the rare occasion on Windows, but more common on Mac.
    • RC: think web developers should handle this, even if it only happens more on Windows than Mac.
    • MM: ok. Ultimate question is, should there be a way to take an existing SwapChain and plug in a new device? Sounds like no.
    • CW: agree, you create a SwapChain for a specific device.
    • KN: per HTML spec can’t get a context for more than one canvas. Same exact object each time.
    • RC: WebGL loses all objects during context loss. Could do the same thing here.
    • CW: are we saying we’re OK losing the device when moving window to a different screen?
    • RC: depends on what the browser wants to do.
    • MM: Part of the contract in Metal that you have to handle these devices
    • RC: What happens if you don’t handle that?
    • JF: A blit happens and lower perf.
    • MM: you’re not supposed to leave your resources on the other device, kills battery.
    • MM: another q: in D3D and Vulkan can you plug an arbitrary SwapChain and Device to a RenderPass for another device?
    • RC: no. Blitting is something the compositor does.
    • CW: Vulkan and D3D12 assume you can create a SwapChain for any window and any device, and the system compositor will make it work. No idea of affinity of windows for specific screens and devices.
    • CB: you can figure out which one you’re on, and you get a notification event on some API versions. You can ignore it and live with the copy, though.
    • MM: gaming rig, two GPUs. Have a window. How do I know what D3D device to create on?
    • CB: you can enumerate everything, but there are defaults. Has to do with order of enumeration. First is generally best.
    • MM: if I drag window from monitor to another one, does list of adapters rearrange itself?
    • CB: no. Runs you on the old GPU, does a copy. Happens if spanning 2 monitors, too.
    • MM: when you get notification how do you respond?
    • CB: you get monitor change. Can traverse, figure out if you want to go to new GPU.
    • MM: so you iterate down device list and match?
    • CB: yes. They try to keep the topology graph up to date.
    • RC: on hybrid machine, adapter 0 is the integrated one, but there are overrides, like the NVIDIA discrete control panel.
    • KR: Wasn’t there a discussion about listing the adapters etc?
    • CW: Different discussion.
    • KR: Both things should happen, applications should support device loss but also browsers should avoid sending device loss if they can.
    • KN: How about background tabs? Device loss for that?
    • MM: WebAnimation spec solves this by periodically destroying the worklet so people have to handle that.
    • DJ: Same for service worker with devtools open to force developers to handle the case.
    • DM: we have hints for low-power/discrete. Doesn’t affect which GPU presentation is done on. Should we expand this with the presentation information somehow? If user requests low-power and they’re using the external GPU, there’s a blit.
    • KN: you’re running the discrete GPU anyway in that situation. May just decide to group things.
    • RC: low-power is a request anyway.
    • JD: low-power only has an effect when multiple GPUs can render to the same monitor.
    • KR: Resize events are very asynchronous - ResizeObserver is the plan.
    • JG: suggest we don’t forcibly resize the SwapChain. Let the developer choose to do it when they want.
    • KN: when the canvas DrawingBuffer size changes?
    • JG: the DrawingBuffer is the SwapChain. I suggest when the canvas resizes, we don’t do anything and ask the developer to handle it.
    • CW: OffscreenCanvas has width/height and you can control the size of your DrawingBuffer with it.
    • JG: the problem is, from the main thread, your CSS rules might cause changes to your canvas width/heihgt.
    • DJ: the canvas width/height is the drawing buffer size. CSS only says how it’s displayed.
    • CW: canvas width/height controls the SwapChain’s width and height.
    • DJ: we don’t even have to specify this. That’s how it works. The only thing that’s weird is this is the first time we call getContext and it’s separate from the thing you draw with.
    • JG: thought we had a plan to decouple these.
    • KR: thought so too.
    • CW: The swapchain and the canvas are essentially the same thing, but the device is separate.
    • JG: could be, request WebGPUSwapChain from canvas, and do ordered approach to swapchain based on arguments. For example, you don’t want to use the width and height.
    • KN: the texture you get back from the SwapChain will have the canvas’s width and height.
    • CW: app definitely needs to control width/height.
    • JG: what else is the app controlling? Number of images it has?
    • CW: Things the app controls:
      • Width/height of rendering: yes.
      • Number of images: no. (In Metal, you ask for the “next one”.)
      • Format: yes. Has to be compatible with what the other web specs are doing, like RGBA8, RGBAF16. (Colorspace is different.)
        • Have to know the format to create the RenderPipelineDescriptor.
        • KN: format we request for SwapChain doesn’t have to be a WebGPU texture format. If you request HDR then we’ll give you this format, if not, this one. Have to have a fixed list.
        • RC: think the format should be something you request.
        • DJ: platform tells you if it supports more than sRGB. Should check to make sure browser compositor can handle all of these types of resources.
      • usage: yes.

Discussion about color spaces and pixel formats

* KR: current (?) [canvas color space proposal](https://github.com/WICG/canvas-color-space/blob/master/CanvasColorSpaceProposal.md)
    * Our thing should be compatible with what canvas specs.
    * More links:
    * [https://github.com/whatwg/html/issues/299](https://github.com/whatwg/html/issues/299)
    * Older (?) [Canvas color space proposal](https://github.com/zakerinasab/canvas-color-space/blob/master/CanvasColorSpaceProposal.md)
    * DM: Isn’t CAMetalLayer always BGRA?
    * MM/CW: Yes.
    * KN: Windows too?
    * RC: Can choose.
* JG: Should expose RGBX distinctly from RGBA. Guaranteed opaque.
* CW: Is it fine if applications request RGBA8 and we have to convert to BGRA8
    * JG: No.
    * CW: Only affects fullscreen?
    * JN: Overlays/underlays too.
    * BB: If you give the image to the compositor, it must be the right format.
* Should we always tell the application the system’s preferred format?
    * CW: Concerned that some weird future system wants a new format like RGB9E5. Then applications wouldn’t be able to handle it.
* JG: this isn’t too different from requesting 32-bit float render targets and figuring out how to handle it.
* KN: if impl wants RGB9E5 then it’ll tell the app the system wanted RGBA8 and we’ll do the conversion. Can start with a set of formats. Maybe guarantee one or two work.
* Discussion about “emulating” swap chains, and performing a final blit on the way to the browser’s compositor.
* KR: Dug up the canvas colorspace proposal and it is stalled. People moved on from canvas2D to something different. Consensus was that you needed only RGBA8 and RGBA16F. We can do that in this group and push these spec forward. Keep it simple and let the browser fallback. Also suggest we let the GPU swapchain expose what its preferred format is.
* CB: This works for Windows, but Apple has 10 bit formats.
* KR: In WebGL, the RGBA data gets turned into BGRA by driver magic. (Shader writes a vec4, driver stores it as appropriate for the render target.)
* KN: Could have opaque framebuffers ish concepts.
* MM: suggestion to have a preferred format which the SwapChain can adapt to.
* MM: presumably the preferred format will change if you drag the monitor from one window to another? Do we do tone mapping, or do we lose the context?
* MM: what about Kai’s idea of a sentinel value as the pixel format passed to getContext?
    * AE: it’s the same thing just phrased another way. \


        ```

interface GPUCanvasContextOptions { CanvasColorSpace colorSpace; CanvasPixelFormat pixelFormat; // To be discussed with WICG };

interface GPUCanvasContext { };

partial interface GPUDevice { Promise getSwapChainPreferredFormat(GPUCanvasContext context); // Calling createSwapChain a second time for the same GPUCanvasContext

// invalidates the previous one, and all of the textures it's produced.
GPUSwapChain createSwapChain(GPUSwapChainDescriptor desc);

};

dictionary GPUSwapChainDescriptor { required GPUCanvasContext context; required GPUTextureFormat format; GPUTextureUsage usage = GPUTextureUsage.OUTPUT_ATTACHMENT; };

const context = canvas.getContext('webgpu', {colorSpace: "rec2020", pixelFormat: "8-8-8-8"}); const preferredFormat = await device.getSwapChainPreferredFormat(context); const swapChain = device.createSwapChain({ context: context, format: preferredFormat, usage: // something });





## Mutable texture format



* Mutable texture format [#168](https://github.com/gpuweb/gpuweb/issues/168)
* JS: Essentially, typed vs. typeless.
* CW: 
    * All textures have mutable format, or they don’t.
    * We figure this out before the snapshot, or after.
* DM:
    * Suggest color format has to match. Figure out what to allow later on.
    * Depth-stencil: allow to select one or the other.
    * CW: isn’t this the aspect flag?
* CW: Q of what we can allow safely. If you allow mutable formats, it prevents compression on some hardware. Probably don’t want to do that for every texture.
* CB: will be cases where something you do will make things run less than optimally on some chip. Probably can’t fix all of these. The opaque lossless compression format: have seen cases where it’s not a win. In some cases it’s a regression. Not something I’d use as a major focus for redesign of an API. Some of it’s because it’s new technology. HW engineers work on a technique and want you to use it.
* MM: have had similar experience when talking with HW vendors. Our strategy is usually to talk with a HW engineer, but then measure yourself.
* DM: suggest we not modify texture usage. Instead, add flag set of texture capabilities. In the future, would say we can produce a format of the view that’s different, but for now, don’t support such a flag.
* JG: then we don’t have the ability to recast these textures. Suggest we always have the ability to recast and have the optimization that we forbid recasting of textures.
* CW: requires complex investigation of what can be cast into what.
* CB: could probably simplify the D3D12 table if we could call IHVs and find out what they really care about supporting right now.
* JG: should we be able to view a RG16 format as R32?
* CB: yes.
* CW: same bits, just reinterpret_cast.
* JG: what are Vulkan’s rules?
* RC: does the API let you do this, Chas, or is this future?
* CB: not sure. 12 API may have holdovers from 11 API.
* RC: let me ask others.
* CW: so we say all texture formats are mutable / castable?
* JG: create compatible view on the same buffer.
* BB: DX silently fails, but only the Debug layer tells you. Vulkan sends you an event if you fully type a resource and can’t cast it. (In the validation layer)
* CW: need to define the compatible texture classes properly.
* DM: need to define the reinterpret_cast behavior. Vk spec says impl is free to round one way or another.
* JG: this is reinterpret_cast, not static_cast.
* MM: Metal docs actually use the word “reinterpret”.
* CB: if you do any operation on that value - bind as compute buffer, load 32 bits in - I’ll see the original bits. If I create a sampleable surface format and sample it as a float, we give you the floating-point NaN behavior that’s spec’ed.
* DM: my concern isn’t valid because you can already upload NaNs.
* CW: so, no concern, and texture views can reinterpret_cast between formats?
* MM: I’d like to see what the API looks like. Will all bindings take texture views? etc.
* CW: we’ll still have a GPU texture format in there, because we need the bit depth. Then you create a TextureView. You pass a format, defaults to what you created with, but you can change it.
* MM: what can you do with a texture that’s not created with texture view?
* CW: you can copy to it.
* CB: you can use it in a compute shader.
* MM: way you attach resources to compute pipeline should match render pipeline.
* CW: MM is your concern that there are resources, and views? Or Metal way, where it’s both a memory allocation and view? And creating new views involve “parent texture”?
* MM: yes. So, when allocating view, you choose its format.
* JS: should decide if you want to allow this when creating a texture.
* MM: we’re basically saying we want to allow this for all textures.
* CW / DM: disagreement about exactly the equality classes for casting.
* CW: don’t have to decide the classes now. Have to decide: shall we have such classes, and should we specify whether we might want to reinterpret textures at their creation time?
* MM: we should measure the perf penalty of creating textures supporting multiple views.
* DM: in D3D12 the classes depend on the views you create. We don’t even expose UAV, SRV, etc.
* CW: could be minimum set of different views.
* DM: maybe we should add usage to TextureViewDescriptor.
* CW: yes. We need the tables Rafael will provide.
* KR: see [https://docs.microsoft.com/en-us/windows/desktop/direct3ddxgi/hardware-support-for-direct3d-12-0-formats](https://docs.microsoft.com/en-us/windows/desktop/direct3ddxgi/hardware-support-for-direct3d-12-0-formats) linked from [https://github.com/gpuweb/gpuweb/issues/168](https://github.com/gpuweb/gpuweb/issues/168)
* CW: punt on this for now then?
* JG: the test will be difficult to write. Would talk with an IHV first before writing the test. Testing randomly will likely give inconclusive results.
* BB: would it help if Intel provided feedback?
* MM: yes. Also would be helpful if you’d help write the benchmark.


## Buffer mapping



* [Slides](https://docs.google.com/presentation/d/17SC5k5NSfGUGWAHQjL8dD3P-dgTSuaLd9y4sADs3-gk) by Corentin
* DM: basic idea of #154 (upload/downloadBuffer) is to eliminate the notion of staging buffers from the user.
* DM: in D3D you can’t render into staging memory.
* CB: sometimes UMA isn’t a win because the CPU and GPU may have differing cache policies, necessitating a copy even there.
    * In D3D12, if you’ve allocated a heap, then yes, you shouldn’t be subject to that kind of magic.
* CW: there are situations where you need to upload a bunch of uniforms during a frame. To avoid races, have to upload to staging buffer and then schedule a copy into the uniform buffer.
* CW: with mapWriteAsync (#147), we expect earlier frames are handing you back buffers to re-fill. If you need a buffer *right now*, you can call createBufferMapped, which will give you both a buffer and ArrayBuffer to fill.
    * setSubData makes things easier, just not as performant.
* RC: how many frames will it be before the Promise from mapWriteAsync resolves?
    * CW: not specified. Depends on the pipeline. Maybe 1-3 frames.
* KR: what happened to MAP_WRITE_SYNC?
* CW: replaced with createBufferMapped.
* KR: see this being the biggest divergence with the native APIs.
* CW: explanation of intended use and how ring buffer would be implemented on top of these
* (break)
* CW: we’re really trying to decide between mapBufferAsync + BufferSubRange, and uploadBuffer
* DM: my proposal is more on the simple side, and wouldn’t really evolve toward the BufferSubRange proposal
* JG: I prefer to avoid allocators with heuristics. Think uploadBuffer has scalability problems for the future.
* RC: I prefer a way to get an ArrayBuffer where you can put your stuff, now.
* CW: that’s one of the classes in BabylonJS.
* RC: I want the API to let you do that. If you didn’t have createBufferMapped, you wouldn’t have that ability.
* CW: that’s why it’s part of the proposal.
* RC: I’m OK if mapAsync evolves toward BufferSubRange.
* CW: BufferSubRange is just an extension of mapAsync.
* MM: can mapAsync and BufferSubRange coexist?
    * CW: yes, by design.
    * MM: if there’s a part of a buffer that’s not mapped, there could be an API that says, populate it with this data.
* JG: sub-ranges vs. simple uploads.
* MM: can those coexist?
* JG: yes, but you shouldn’t.
* KR: you can build one on top of the other.
* DM: ideally my proposal relies on the tight integration of IPC. Careful management of shmem, buffers. If you know how to manage it through sub-ranges, great. Will it have the same # copies?
* CW: yes we can. Only challenge: need a general allocator.
* JG: what are your objections to subranges?
* CW: not exactly a pit of success. Developer might “await mapAsync” in rAF and that wouldn’t do what they expect. If you come to WebGPU you’ll likely do things wrong. setSubData is for these people.
* CB: the higher-level APIs should be in the higher-level frameworks.
* RC: does BufferSubRange give you an ArrayBuffer you can write immediately?
* ...more discussion…
* JG: would like to propose that BufferSubRange be our goal.
* CW: suggest we take #147, and right after the snapshot, we add subranges.
* JG: I like that.
* CB: agree.
* KR: agree.
* DJ: yea.


## Typeless Textures



* RC: [https://docs.microsoft.com/en-us/windows/desktop/direct3ddxgi/hardware-support-for-direct3d-12-0-formats](https://docs.microsoft.com/en-us/windows/desktop/direct3ddxgi/hardware-support-for-direct3d-12-0-formats)
* RC: the list’s ordered. From a given typeless one, you can cast to the typed ones below it, up to the next typeless one.
* RC: D3D engineer says he’s not aware of advantages of using typeless, but clever people could take advantage of them.
* JG: all of our shaders have support for packing/unpacking things nowadays. Like RGBA8 from uint32.
* CB: problem with writing it in the shader is people are used to writing code outside the shader to do it. Annoying that 10-bit and 8-bit things are in different families.
* JG: there’s some of that in Vulkan, too. Not as friendly as Metal.
* RC: could we add the typed ones for now, and if we need typeless ones we could add them later?
* CB: we could loosen things up, but then have to complain to Vulkan and D3D engineers to make things as nice and simple as Metal.
* JG: D3D’s the most restrictive. Vulkan’s most of the way to Metal. Metal’s very friendly. (Supports fewer formats, maybe?) Maybe Vulkan just has to be equal to Metal for all formats Metal supports.
* MM: thought we reached consensus.
* JG: consensus earlier was that we’d wait for Rafael to get back to us about compatibility families. We do have those.
* MM: are we going to revisit the typed/typeless format distinction? And view has a non-typeless format?
* JG: Metal has implied compatibility.
* MM: we don’t have to decide this now.


## Semantics of Resource Binding



* [#167](https://github.com/gpuweb/gpuweb/issues/167)
* MM: in Metal, the same mechanism is used to describe vertex argument buffers and resource bind groups.
    * Vertex buffers: attach VB to vertex shader, using SetVertexBuffer.
    * Argument buffers: Metal rep of argument buffer is a buffer, so also use SetBuffer call.
    * Indices representing vertex buffers are same as bind group indices.
    * If we did nothing, it’d be bad. SetVertexBuffer(0) and CreateBindGroup(0) could not coexist.
    * 3 possibilities:
        * Surface requirement in WebGPU spec, tell programmers not to do bad thing.
        * Pack the vertex buffers and bind groups tightly together.
        * Don’t pack them tightly, pack them sparsely, so difference between option #2 and #3 - easy to figure out Metal index of bind group. Just add a constant, which is written down in the spec somewhere.
    * CB: or grow from opposite ends.
    * MM: yes, if you have a maximum.
    * CW: we don’t surface this constraint at the API level. Impl does whatever it wants.
    * MM: both options 2 and 3 are compatible with that. (2) has API constraints of order of calls. (3) requires limits on sizes.
    * MM: if size limits are small, (3) is probably better. If 100,000s of these, (2) is better.
* CW: WebGPU won’t guarantee more than 4 bind groups because of Vulkan. Would only need 4 Metal argument buffers, maybe 5 if you want fancy stuff. Not sure what we did in Dawn - think vertex buffers first.
* MM: OK, so: put bind groups at indices 1-4, vertex buffers at indices 5+.
* CW: in Dawn we don’t use Metal arg buffers yet. They’re descriptor sets where the buffer contains the descriptor set, use encoder to encode descriptors in it.
* JF: if we reserve the first 4-8 for arg buffers, leaves us with ~23 buffer binding things. Is that enough to handle other vertex buffer use cases?
* CW: think so. Indirection goes on here.
* MM: can we pick a number?
* CW: in Dawn right how we have 16. Not sure about Vulkan.
* DM: MAX_VERTEX_INPUT_BINDINGS? 16 or more in the Vulkan spec.
* JF: any other restrictions from D3D? (yes, 16)
* MM: good. Similarly, what if you try to draw something but not all bind groups are set. Error? Map nothing?
* CW: error.
* DM: to clarify: things in the pipeline layout of the current pipeline?
* CW: yes.
* MM: do we need to set pipeline state before sending these bindgroups?
* CW: in Dawn, currently, no. Have bindgroup 1..3. Look at pipeline layout. Dirty things that have changed, or things after them.
* MM: are there matching rules?
* CW: descriptors are functionally equivalent. We de-duplicate ourselves. We need a proper spec.
* DM: don’t think this matches Vulkan inheritance rules. [#171](https://github.com/gpuweb/gpuweb/issues/171) is about spec’ing those rules.
* MM: shader source code needs to match pipeline layout?
* CW: probably has to be a subset.
* MM: OK. If it isn’t, has to be an error.
* DM: we’d error on pipeline creation.
* ...discussion about matching shaders and pipeline layouts…


## Error handling changes



* KN: presentation

    ```
enum GPUEventType {
    /// Signals that the GPUDevice cannot be used anymore. Any further
    /// operations on the device, or objects associated with the device, are
    /// errors.
    /// This may happen if the device is reclaimed by the browser, or something
    /// goes fatally wrong (e.g. crash or native device loss).
    /// (TODO: does it happen if the device is destroyed by the application?)
    "gpudevicelost",

    /// Signals when it is okay for the application to make a new GPUDevice.
    /// The original device (the event target) is still invalid (it does not
    /// become valid).
    /// This may only happen if:
    ///   - the device has received a "gpudevicelost" event,
    ///   - the "gpudevicelost" event had the "recoverable" flag set, and
    ///   - the "gpudevicelost" event's default action was not taken (preventDefault was called).
    "gpudevicerecovered",

    /// Fires when a non-fatal error occurs in the API - a validation error
    /// (including if an "invalid" object is used) or unhandled out-of-memory
    /// condition).
    "gpulogentry"
};

interface GPUDeviceLostEvent : Event {
    readonly attribute bool recoverable;
    readonly attribute DOMString? reason;

    /* void preventDefault(); */
};

interface GPUDeviceRecoveredEvent : Event {};

enum GPULogEntryType {
    "validation",
    "out-of-memory"
};

interface GPULogEntryEvent : Event {
    readonly attribute GPULogEntryType type;
    readonly attribute any object;
    readonly attribute DOMString? reason;
};

partial interface GPUDevice : EventTarget {};

// device.addEventListener("gpudevicelost", (event: GPUDeviceLostEvent) => {
//   event.preventDefault();
// });
// device.addEventListener("gpudevicerecovered", (event: GPUDeviceRecoveredEvent) => {
//   // The old device is NOT valid - create a new one.
// });
// device.addEventListener("gpulogentry", (event: GPULogEntryEvent) => {});
```

enum GPUEventType { /// Signals that the GPUDevice cannot be used anymore. Any further /// operations on the device, or objects associated with the device, are /// errors. /// This may happen if the device is reclaimed by the browser, or something /// goes fatally wrong (e.g. crash or native device loss). /// (TODO: does it happen if the device is destroyed by the application?) "gpudevicelost",

/// Signals when it is okay for the application to make a new GPUDevice.
/// The original device (the event target) is still invalid (it does not
/// become valid).
/// This may only happen if:
///   - the device has received a "gpudevicelost" event,
///   - the "gpudevicelost" event had the "recoverable" flag set, and
///   - the "gpudevicelost" event's default action was not taken (preventDefault was called).
"gpudevicerecovered",

/// Fires when a non-fatal error occurs in the API - a validation error
/// (including if an "invalid" object is used) or unhandled out-of-memory
/// condition).
"gpulogentry"

};

interface GPUDeviceLostEvent : Event { readonly attribute bool recoverable; readonly attribute DOMString? reason;

/* void preventDefault(); */

};

interface GPUDeviceRecoveredEvent : Event {};

enum GPULogEntryType { "validation", "out-of-memory" };

interface GPULogEntryEvent : Event { readonly attribute GPULogEntryType type; readonly attribute any object; readonly attribute DOMString? reason; };

partial interface GPUDevice : EventTarget {};

// device.addEventListener("gpudevicelost", (event: GPUDeviceLostEvent) => { // event.preventDefault(); // }); // device.addEventListener("gpudevicerecovered", (event: GPUDeviceRecoveredEvent) => { // // The old device is NOT valid - create a new one. // }); // device.addEventListener("gpulogentry", (event: GPULogEntryEvent) => {});



* JG: would like to see better distinction between “device” and “adapter” lost. Think getting the nouns right will make things easier to understand. The multi-application case is confusing.
* KN: agree, will think about it some more. Will think about event names and on where the event handler is registered.
* Discussion about synchronous error reporting mode while DevTools is open.
    * KN: API entry points would not be changed to throw. Would look more like a JS shim like webgl-debug.js.
* Discussion about out-of-memory handling.
    * JG: maybe Promises are a better solution.
    * KN: Probably. Think there was a reason I didn’t do this originally, will figure it out.
* Short discussion about pipeline creation.
    * CW: maybe you want to create a bunch of pipelines but you want to throttle the rate of pipeline creation. Watch for status change?
    * MM: probably want to enqueue more than 1 at a time.
* JG: large buffer allocations should be handled as expected-fallible?
    * KN: probably, should.
    * JG: and small buffers’ allocations should be considered infallible?
    * KN: if no handler, it’s that case.
    * JG: one thought, have expected-fallible entry point which is a Promise.
    * KN: I think that’s a reasonable solution.
    * JG: Monad version could expect success.
    * KN: could do that. If CreateBuffer runs out of memory, lose the device.
    * DJ: that’s what I think. If you create 100 MB texture and run out of memory, lose the device.
    * CW: e.g. if you’re Maps: you want to cap your allocations. Apps that want caches of things want to be able to create a heap (like Metal resource heap), and create resources from it. I know it doesn’t consume more than a certain amount. Makes people more comfortable.
    * RC: if we Promise-ify things, do we need the event?
    * KN: I don’t think so. Not even sure we need the status.
    * CW: would be nice to know when pipelines are done compiling.
    * DJ: if you want to know that per event, it’s either a Promise, or if an event, you’ve probably already used the Pipeline.
    * KN: do we want createPipeline and createPipelineAsync?
    * CW: think it’s OK.
    * DJ: sounds like we’re making the sync developer mode API as we’re going along.
    * KR: hope we won’t drop the Maybe monad versions of the APIs. Need it for WebAssembly compatibility, for example.
    * (Agreement: no intention to drop these.)


## Remaining Uses of Indices in the API



* CW: they’re all for the pipeline interface.
    * Vertex buffers
    * Color buffers
    * Pipeline layout
    * Binding inside bindgroup
* CW: Kai suggested to use VertexBufferIndex so we can at least find them in the IDL.
* CW: how do we pass these? Sparse array? Member of descriptor? etc.
* CW: certain things are cold, but setting some of them are hot paths.
* CW: want to be able to leave holes in VertexInputDescriptor, for example, so that you can have pipelines that are compatible with more programs.
* CW: same thing for color attachments. This is a really, really important use case for BindGroups.
* CW: these things need to be sparse somehow.
* CB: for vertices, best usage is structure-of-arrays. Positions in one array, colors in anoter buffer, essentially, so you never have to push a gap through the cache. Unfortunately if you have 3 empty DWORDs you essentially pull them through memory.
    * Obvious way to make this work: load only buffers you intend to use. Typical thing: position used by renderer, velocity + position used by physics, color used only by renderer, etc.
* DM: isn’t this an orthogonal issue?
* CW: when you set these things, they have an input slot in InputStateDescriptor.
* CW: have a sequence of things that don’t have to be in order.
* CB: position in array is not specified by slot in list.
* CW: this is what we have currently. Maybe the index in the array should be the index.
* ...Discussion, not minuted…
* KN: can’t remap per-thing because you want compatibility across multiple pipelines / programs, etc.
* ...Discussion about organization of vertex attribute / vertex buffer descriptors…
* Calling the discussion here. Will likely put sequence of VertexAttributeDescriptor into VertexBufferDescriptor.



* DM: Move attributes array into VertexBufferDescriptor.

    ```
dictionary GPUVertexAttributeDescriptor {
    u32 shaderLocation;
    u32 offset;
    GPUVertexFormat format;
};

dictionary GPUVertexBufferDescriptor {
    VertexBufferIndex inputSlot;
    sequence<GPUVertexAttributeDescriptor> attributes;
    u32 stride;
    GPUInputStepMode stepMode;
};

dictionary GPUInputStateDescriptor {
    GPUIndexFormat indexFormat;

    sequence<GPUVertexBufferDescriptor> inputs;
};
  • DM: Also remove VertexBufferDescriptor.inputSlot.

Action items

  • RC: talk with D3D12 folks about what’s possible in reinterpretation of textures’ contents between different view types.
  • BB: follow up with exact scenarios in weak vs. fully typed resources have perf impacts.
    • CW: some forward looking scenarios would help, too.

Resolutions

  • No meeting next week Monday 28 Jan. Next meeting 4 Feb.
  • Please add IDL change requests/ideas to the Spec tracker
  • We will assign tasks to people at the next meeting
Clone this wiki locally