Nahim shader defines #643

devshgraphicsprogramming · 2024-01-10T09:06:25Z

Description

Testing

TODO list:


devshgraphicsprogramming · 2024-01-13T22:41:44Z

include/nbl/builtin/hlsl/device_capabilities_traits.hlsl

+#define NBL_GENERATE_GET_OR_DEFAULT(field, ty, default) \
+template<typename S, bool = has_member_##field<S>::value> struct get_or_default_##field : integral_constant<ty,S::field> {}; \
+template<typename S> struct get_or_default_##field<S,false> : integral_constant<ty,default> {};


this macro causes Boost.Wave to throw exceptions, find out why

@nipunG314 does it still throw exceptions?

let me know when you finally get your project to compile


devshgraphicsprogramming · 2024-01-13T22:45:52Z

include/nbl/builtin/hlsl/device_capabilities_traits.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(nonCoherentAtomSize, uint16_t, 0);
+
+NBL_GENERATE_GET_OR_DEFAULT(subgroupSize, uint16_t, 0);
+NBL_GENERATE_GET_OR_DEFAULT(subgroupOpsShaderStages, core::bitflag<asset::IShader::E_SHADER_STAGE, 0);


obviously there's no bitflag on hlsl, so use a plain uint8_t

just a small annoying note, not every bitflag is a uint8_t

maybe see if you can make an nbl::hlsl::bitflag that's similar to core::bitflag except without constexpr?

Otherwise, if that's not possible/usable, you could move the enums to .hlsl headers, e.g.

namespace nbl { namespace hlsl { enum ShaderStage : uint8_t { ... }; } }

then add a using E_SHADER_STAGE = ShaderStage; alias in place of the original enum

Also I lied to you, there's no uint8_t in HLSL, only uint16_t

But there are more than 8 shader stages anyway....


devshgraphicsprogramming · 2024-01-13T22:47:07Z

include/nbl/builtin/hlsl/device_capabilities_traits.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(allowCommandBufferQueryCopies, bool, false);
+NBL_GENERATE_GET_OR_DEFAULT(maxOptimallyResidentWorkgroupInvocations, uint32_t, 0);
+NBL_GENERATE_GET_OR_DEFAULT(maxResidentInvocations, uint32_t, 0);
+NBL_GENERATE_GET_OR_DEFAULT(spirvVersion, asset::CGLSLCompiler::E_SPIRV_VERSION, 0);


we might need to make that enum global and then alias or something?

I'm not sure what you mean here. Can't we just replace the enum with an integral type like the rest?

this is what I mean
#643 (comment)

devshgraphicsprogramming · 2024-01-13T22:50:31Z

Need a matching modification to nbl/video/CJITIncludeLoader.cpp

Use uint32_t instead of VkExtent2D

Use vector types for small arrays

Replace core::bitflag<> with uint8_t

…into nahim_shader_defines


AnastaZIuk · 2024-06-19T14:39:18Z

tools/deviceGen/CMakeLists.txt

+# custom command
+set(NBL_COMMAND
+ "${_Python3_EXECUTABLE}"
+ "${NBL_GEN_PY}"


I see 9 writeHeader invocations in tools/deviceGen/gen.py but only 5 arguments passed to the custom command. Never hardcode locations like that in your Python script: set them as CMake variables (as you did for those 5) and let the script receive them as arguments!

AnastaZIuk · 2024-06-19T14:59:05Z

tools/deviceGen/gen.py

+
+ limits = loadJSON(args.limits_json_path)
+ writeHeader(
+ args.limits_output_path + "members.h",


your arguments should simply be full paths to the outputs, with no "mangling" via concatenation. Here it should be just args.limits_output_path; please do not hardcode things like + "members.h", otherwise you leave no room for CMake to set the locations of your outputs, and you have to keep both CMake and your script in sync with the hardcoded data.

define your argument like this

parser.add_argument("--limits-members-filepath", type=str, help="The output file path for SPhysicalDeviceLimits members header")

and then use

args.limits_members_filepath

for writeHeader invocation
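A minimal sketch of that argument style, assuming a hypothetical `build_parser` helper and showing only two of the nine outputs. The `--limits-subset-filepath` flag is named here by analogy and is not taken from the PR; the file paths in the usage are placeholders that CMake would normally supply.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # One argument per generated file: CMake decides every output location.
    parser = argparse.ArgumentParser(description="deviceGen argument sketch")
    parser.add_argument("--limits-json-path", type=str, required=True,
                        help="Input JSON describing the device limits")
    parser.add_argument("--limits-members-filepath", type=str, required=True,
                        help="The output file path for SPhysicalDeviceLimits members header")
    # Hypothetical flag, named by analogy with the one above.
    parser.add_argument("--limits-subset-filepath", type=str, required=True,
                        help="The output file path for the limits subset header")
    return parser


# Placeholder command line; in the build this comes from the CMake custom command.
args = build_parser().parse_args([
    "--limits-json-path", "limits.json",
    "--limits-members-filepath", "out/limits_members.h",
    "--limits-subset-filepath", "out/limits_subset.h",
])

# The script uses each path verbatim, with no string concatenation.
print(args.limits_members_filepath)
```

With this shape, renaming or relocating an output only touches the CMake side.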

AnastaZIuk · 2024-06-19T15:00:27Z

tools/deviceGen/gen.py

+ json=limits
+ )
+ writeHeader(
+ args.limits_output_path + "subset.h",


adjust following the example above (a dedicated file path argument instead of concatenation)

AnastaZIuk · 2024-06-19T15:00:30Z

tools/deviceGen/gen.py

+ )
+ features = loadJSON(args.features_json_path)
+ writeHeader(
+ args.features_output_path + "members.h",


adjust following the example above (a dedicated file path argument instead of concatenation)

AnastaZIuk · 2024-06-19T15:00:35Z

tools/deviceGen/gen.py

+ json=features
+ )
+ writeHeader(
+ args.features_output_path + "union.h",


adjust following the example above (a dedicated file path argument instead of concatenation)

AnastaZIuk · 2024-06-19T15:00:39Z

tools/deviceGen/gen.py

+ op="|"
+ )
+ writeHeader(
+ args.features_output_path + "intersect.h",


adjust following the example above (a dedicated file path argument instead of concatenation)

AnastaZIuk · 2024-06-19T15:00:44Z

tools/deviceGen/gen.py

+ op="&"
+ )
+ writeHeader(
+ args.traits_output_path + "testers.hlsl",


adjust following the example above (a dedicated file path argument instead of concatenation)

AnastaZIuk · 2024-06-19T15:00:48Z

tools/deviceGen/gen.py

+ format_params=["name"]
+ )
+ writeHeader(
+ args.traits_output_path + "defaults.hlsl",


adjust following the example above (a dedicated file path argument instead of concatenation)

AnastaZIuk · 2024-06-19T15:00:52Z

tools/deviceGen/gen.py

+ format_params=["name", "type", "value"]
+ )
+ writeHeader(
+ args.traits_output_path + "members.hlsl",


adjust following the example above (a dedicated file path argument instead of concatenation)

AnastaZIuk · 2024-06-19T15:00:56Z

tools/deviceGen/gen.py

+ format_params=["type", "name", "name"]
+ )
+ writeHeader(
+ args.traits_output_path + "jit_members.hlsl",


adjust following the example above (a dedicated file path argument instead of concatenation)

devshgraphicsprogramming · 2024-06-19T16:07:56Z

include/nbl/builtin/hlsl/cpp_compat/vector.hlsl

@@ -4,7 +4,7 @@
 // stuff for C++
 #ifndef __HLSL_VERSION 
 #include <stdint.h>
-
+#include <type_traits>


no, you can't do that, it will lead to a circular reference. Move your Vector concept to type_traits.hlsl

devshgraphicsprogramming · 2024-06-19T16:12:19Z

include/nbl/builtin/hlsl/cpp_compat/vector.hlsl

@@ -73,6 +73,20 @@ NBL_TYPEDEF_VECTORS(float16_t);
 NBL_TYPEDEF_VECTORS(float32_t);
 NBL_TYPEDEF_VECTORS(float64_t);

+#ifndef __HLSL_VERSION
+template<typename T>
+concept is_vector_v =


concepts are usually named in plain CamelCase without underscores, e.g. Vector

is_vector_v would be a template variable, i.e.

template<typename T> NBL_CONSTEXPR bool is_vector_v = ...;

Also there's a more sure way to make your trait work with partial specialization, which we ALREADY have!

Nabla/include/nbl/builtin/hlsl/type_traits.hlsl

Line 613 in c55d5a2

struct is_vector : bool_constant<false> {};

So just add

template<typename T> NBL_CONSTEXPR bool is_vector_v = is_vector<T>::value;
#ifndef __HLSL_VERSION
template<typename T> concept Vector = is_vector_v<T>;
#endif

below is_vector

devshgraphicsprogramming · 2024-06-19T16:13:53Z

include/nbl/video/CJITIncludeLoader.h

@@ -8,10 +8,12 @@

 #include <string>

+#include "nbl/builtin/hlsl/type_traits.hlsl"
+
+#include "glm/gtx/string_cast.hpp"


not sure I want this floating around in public headers, can you make the specialization that uses glm::to_string not be inline and have it defined in the .cpp file instead?

devshgraphicsprogramming · 2024-06-19T16:45:19Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(maxFragmentDualSrcAttachments, uint32_t, 1);
+NBL_GENERATE_GET_OR_DEFAULT(maxFragmentCombinedOutputResources, uint32_t, 16);
+NBL_GENERATE_GET_OR_DEFAULT(maxComputeSharedMemorySize, uint32_t, 32768);
+NBL_GENERATE_GET_OR_DEFAULT(maxComputeWorkGroupCount[3], uint32_t, {MinMaxWorkgroupCount,MinMaxWorkgroupCount,MinMaxWorkgroupCount});


you need a different way to handle array/vector constants because NBL_GENERATE_GET_OR_DEFAULT is implemented via inheritance from an integral_constant

it will detect the member, but an integral_constant cannot be made with a non-primitive type such as an array (uint32_t[3]) or a vector (uint32_t3)

You need to break down vectors/arrays into multiple members with X, Y, Z suffixes

but obviously make no changes to the C++ structs
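One way the generator could do this split, as a rough sketch: `split_array_member` is a hypothetical helper, and the `(name, type, default)` tuple shape is an assumption based on the format_params seen in gen.py, not the script's actual data model.

```python
import re

# Matches a field name with an array extent, e.g. "maxComputeWorkGroupCount[3]".
ARRAY_FIELD = re.compile(r"^(?P<name>\w+)\[(?P<count>\d+)\]$")


def split_array_member(name, ty, default):
    """Return one (name, type, default) tuple per scalar component."""
    match = ARRAY_FIELD.match(name)
    if match is None:
        # Plain scalar member: pass through untouched.
        return [(name, ty, default)]
    count = int(match.group("count"))
    # Defaults look like "{a,b,c}": drop the braces and split on commas.
    values = [v.strip() for v in default.strip().strip("{}").split(",")]
    if count > 4 or len(values) != count:
        raise ValueError(f"cannot split {name!r} with default {default!r}")
    return [(match.group("name") + suffix, ty, value)
            for suffix, value in zip("XYZW", values)]


for entry in split_array_member(
        "maxComputeWorkGroupCount[3]", "uint32_t",
        "{MinMaxWorkgroupCount,MinMaxWorkgroupCount,MinMaxWorkgroupCount}"):
    print("NBL_GENERATE_GET_OR_DEFAULT({}, {}, {});".format(*entry))
```

Only the HLSL traits output would go through this helper; the C++ struct members keep their array form.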

devshgraphicsprogramming · 2024-06-19T16:55:49Z

include/nbl/builtin/hlsl/device_capabilities_traits.hlsl

-NBL_GENERATE_MEMBER_TESTER(fragmentShaderPixelInterlock);
-NBL_GENERATE_MEMBER_TESTER(maxOptimallyResidentWorkgroupInvocations);
-NBL_GENERATE_MEMBER_TESTER(maxResidentInvocations);
+#include "nbl/video/device_capabilities_traits_testers.hlsl"

 #define NBL_GENERATE_GET_OR_DEFAULT(field, ty, default) \
 template<typename S, bool = has_member_##field<S>::value> struct get_or_default_##field : integral_constant<ty,S::field> {}; \


change the bool = has_member_##field<S>::value to has_member_##field<S>::value==(e_member_presence::is_present|e_member_presence::is_static)

Also write a TODO comment that the has_member_ needs to be improved because the detectors should really report present, static AND constant

devshgraphicsprogramming · 2024-06-19T17:08:28Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(maxComputeSharedMemorySize, uint32_t, 32768);
+NBL_GENERATE_GET_OR_DEFAULT(maxComputeWorkGroupCount[3], uint32_t, {MinMaxWorkgroupCount,MinMaxWorkgroupCount,MinMaxWorkgroupCount});
+NBL_GENERATE_GET_OR_DEFAULT(maxComputeWorkGroupInvocations, uint16_t, MinMaxWorkgroupInvocations);
+NBL_GENERATE_GET_OR_DEFAULT(maxWorkgroupSize[3], uint16_t, {MinMaxWorkgroupInvocations,MinMaxWorkgroupInvocations,64u});


see https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646508426

devshgraphicsprogramming · 2024-06-19T17:09:32Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(maxSamplerLodBias, float, 4);
+NBL_GENERATE_GET_OR_DEFAULT(maxSamplerAnisotropyLog2, uint8_t, 4);
+NBL_GENERATE_GET_OR_DEFAULT(maxViewports, uint8_t, 16);
+NBL_GENERATE_GET_OR_DEFAULT(maxViewportDims[2], uint16_t, {MinMaxImageDimension2D,MinMaxImageDimension2D});


see https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646508426

devshgraphicsprogramming · 2024-06-19T17:10:01Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(maxSamplerAnisotropyLog2, uint8_t, 4);
+NBL_GENERATE_GET_OR_DEFAULT(maxViewports, uint8_t, 16);
+NBL_GENERATE_GET_OR_DEFAULT(maxViewportDims[2], uint16_t, {MinMaxImageDimension2D,MinMaxImageDimension2D});
+NBL_GENERATE_GET_OR_DEFAULT(viewportBoundsRange[2], float, { -MinMaxImageDimension2D*2u, MinMaxImageDimension2D*2u-1 });


see https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646508426

BUT anything with Range in the name needs Min and Max suffixes instead of X and Y
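The suffix choice could be isolated in a small helper, sketched here under the assumption that a plain substring check on the member name is good enough; `component_suffixes` is a hypothetical name, not part of gen.py.

```python
def component_suffixes(base_name, count):
    """Pick per-component suffixes for a member that was an array or vector."""
    if "Range" in base_name:
        # Ranges are (min, max) pairs, so positional X/Y names would mislead.
        if count != 2:
            raise ValueError(f"range member {base_name!r} must have 2 components")
        return ["Min", "Max"]
    if not 1 <= count <= 4:
        raise ValueError(f"unsupported component count {count} for {base_name!r}")
    return list("XYZW"[:count])


print(component_suffixes("viewportBoundsRange", 2))
print(component_suffixes("maxViewportDims", 2))
```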

devshgraphicsprogramming · 2024-06-19T17:11:16Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(maxFramebufferWidth, uint32_t, MinMaxImageDimension2D);
+NBL_GENERATE_GET_OR_DEFAULT(maxFramebufferHeight, uint32_t, MinMaxImageDimension2D);
+NBL_GENERATE_GET_OR_DEFAULT(maxFramebufferLayers, uint32_t, 1024);
+NBL_GENERATE_GET_OR_DEFAULT(maxColorAttachments, uint8_t, MinMaxColorAttachments);


in HLSL you don't have access to the constexprs used in C++ structs.

so substitute the values here for HLSL

devshgraphicsprogramming · 2024-06-19T17:11:27Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(maxImageDimension1D, uint32_t, MinMaxImageDimension2D);
+NBL_GENERATE_GET_OR_DEFAULT(maxImageDimension2D, uint32_t, MinMaxImageDimension2D);
+NBL_GENERATE_GET_OR_DEFAULT(maxImageDimension3D, uint32_t, 2048);
+NBL_GENERATE_GET_OR_DEFAULT(maxImageDimensionCube, uint32_t, MinMaxImageDimension2D);


https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646527909

devshgraphicsprogramming · 2024-06-19T17:11:32Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(maxImageArrayLayers, uint32_t, 2048);
+NBL_GENERATE_GET_OR_DEFAULT(maxBufferViewTexels, uint32_t, 33554432);
+NBL_GENERATE_GET_OR_DEFAULT(maxUBOSize, uint32_t, 65536);
+NBL_GENERATE_GET_OR_DEFAULT(maxSSBOSize, uint32_t, MinMaxSSBOSize);


https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646527909

devshgraphicsprogramming · 2024-06-19T17:11:48Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(subPixelInterpolationOffsetBits, uint8_t, MinSubPixelInterpolationOffsetBits);
+NBL_GENERATE_GET_OR_DEFAULT(maxFramebufferWidth, uint32_t, MinMaxImageDimension2D);
+NBL_GENERATE_GET_OR_DEFAULT(maxFramebufferHeight, uint32_t, MinMaxImageDimension2D);


https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646527909

devshgraphicsprogramming · 2024-06-19T17:12:19Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(pointSizeRange[2], float, {1.f,64.f});
+NBL_GENERATE_GET_OR_DEFAULT(lineWidthRange[2], float, {1.f,1.f});


same as https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646526985

devshgraphicsprogramming · 2024-06-19T17:23:52Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(nonCoherentAtomSize, uint16_t, 256);
+// VK 1.1
+NBL_GENERATE_GET_OR_DEFAULT(subgroupSize, uint16_t, 4);
+NBL_GENERATE_GET_OR_DEFAULT(subgroupOpsShaderStages, core::bitflag<asset::IShader::E_SHADER_STAGE>, asset::IShader::ESS_COMPUTE | asset::IShader::ESS_ALL_GRAPHICS);


we don't have core::bitflag in HLSL and we won't be porting it to HLSL.

So whenever you encounter an enum use the T from core::bitflag<T> instead.

Btw you might need to bump DXC a version for the enum to integer casts, etc. to compile

You won't need to bump the DXC version, but the enum in HLSL cannot be enum class unfortunately due to this

Enum Class not constexpr: https://godbolt.org/z/hheGKo9vx

unscoped enum is constexpr: https://godbolt.org/z/7rorK5qoW
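For the bitflag point, the generator could unwrap the template argument before emitting the HLSL type. This is only a sketch: `hlsl_type` is a hypothetical helper, and it assumes the JSON stores the C++ type as a string in exactly the `core::bitflag<T>` form.

```python
import re

# Captures T from "core::bitflag<T>", allowing namespace-qualified enum names.
BITFLAG = re.compile(r"^core::bitflag<\s*(?P<inner>[\w:]+)\s*>$")


def hlsl_type(cpp_type: str) -> str:
    """Return the type to emit into HLSL for a C++ member type."""
    cpp_type = cpp_type.strip()
    match = BITFLAG.match(cpp_type)
    # No bitflag wrapper in HLSL: use the underlying enum type directly.
    return match.group("inner") if match else cpp_type


print(hlsl_type("core::bitflag<asset::IShader::E_SHADER_STAGE>"))
print(hlsl_type("uint16_t"))
```

The enum itself still has to live in an HLSL-visible header and be unscoped for the default value to compile.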

devshgraphicsprogramming · 2024-06-19T17:37:48Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(shaderSubgroupArithmetic, bool, false);
+NBL_GENERATE_GET_OR_DEFAULT(shaderSubgroupQuad, bool, false);
+NBL_GENERATE_GET_OR_DEFAULT(shaderSubgroupQuadAllStages, bool, false);
+NBL_GENERATE_GET_OR_DEFAULT(pointClippingBehavior, E_POINT_CLIPPING_BEHAVIOR, EPCB_USER_CLIP_PLANES_ONLY);


https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646543284

devshgraphicsprogramming · 2024-06-19T17:38:04Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(maxMultiviewViewCount, uint8_t, 6);
+NBL_GENERATE_GET_OR_DEFAULT(maxMultiviewInstanceIndex, uint32_t, 134217727);
+NBL_GENERATE_GET_OR_DEFAULT(maxPerSetDescriptors, uint32_t, 572);
+NBL_GENERATE_GET_OR_DEFAULT(maxMemoryAllocationSize, size_t, MinMaxSSBOSize);


https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646527909

devshgraphicsprogramming · 2024-06-19T17:40:54Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(maxDescriptorSetUpdateAfterBindDynamicOffsetSSBOs, uint32_t, 4);
+NBL_GENERATE_GET_OR_DEFAULT(maxDescriptorSetUpdateAfterBindImages, uint32_t, 500000);
+NBL_GENERATE_GET_OR_DEFAULT(maxDescriptorSetUpdateAfterBindStorageImages, uint32_t, 500000);
+NBL_GENERATE_GET_OR_DEFAULT(maxDescriptorSetUpdateAfterBindInputAttachments, uint32_t, MinMaxColorAttachments);


https://github.com/Devsh-Graphics-Programming/Nabla/pull/643/files#r1646527909

devshgraphicsprogramming · 2024-06-19T17:42:26Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(supportedDepthResolveModes, core::bitflag<RESOLVE_MODE_FLAGS>, RESOLVE_MODE_FLAGS::SAMPLE_ZERO_BIT);
+NBL_GENERATE_GET_OR_DEFAULT(supportedStencilResolveModes, core::bitflag<RESOLVE_MODE_FLAGS>, RESOLVE_MODE_FLAGS::SAMPLE_ZERO_BIT);


remember about using the enums directly, migrating them to HLSL header and having them unscoped

devshgraphicsprogramming · 2024-06-19T17:45:51Z

tools/deviceGen/output/traits_defaults.hlsl

+NBL_GENERATE_GET_OR_DEFAULT(storageTexelBufferOffsetAlignmentBytes, size_t, std::numeric_limits<size_t>::max());
+NBL_GENERATE_GET_OR_DEFAULT(uniformTexelBufferOffsetAlignmentBytes, size_t, std::numeric_limits<size_t>::max());
+NBL_GENERATE_GET_OR_DEFAULT(maxBufferSize, size_t, MinMaxSSBOSize);


size_t isn't a thing in HLSL https://godbolt.org/z/8nnfnToqs

use uint64_t instead

add shader defines

7aaf269
