Possible debug info bug with matrix bitpieces #7044

baldurk · 2024-12-19T17:02:54Z

Description
The debug info uses bitpieces to map from matrices to elements but it uses a column-major conventions in conflict with how the matrix template members are defined.

The first convention is on the members - _11, _12 etc which are treated as row-major and have bit offsets with the first-row first-column element to be at bit offset 0, the first-row second-column element at bit offset 32, etc up to the last element at bit offset 480. This is what I'd expect for HLSL conventions.

The bitpieces themselves when mapping individual scalar SSAs seem to use a column-major convention, so so the first-column second-row element is mapped to bit offset 32.

Steps to Reproduce
I initially saw this on the ModelViewer sample in DirectX-Graphics-Samples if you want a real-world test case, but I've simplified it down to a synthetic test compiled with -Zi -Od: https://godbolt.org/z/K8jK8fj6j

The main function is simple and it just launders a vector through a matrix to test indexing:

float4 main(float3 test : TEST) : SV_Target0
{
    float4x4 mymat = float4x4(
        0, 4, 0, test.x,
        5, 0, 0, test.y,
        0, 0, 0, test.z,
        0, 0, 0, 1
        );

    float4 ret = 0.0f.xxxx;

    ret.x = mymat._14; // should be test.x
    ret.y = mymat._24; // should be test.y
    ret.z = mymat._34; // should be test.z

    return ret;
}

The actual executable part is what I'd expect (elided):

  %1 = call float @dx.op.loadInput.f32(i32 4, i32 0, i32 0, i8 0, i32 undef), !dbg !73 ; line:3 col:20  ; LoadInput(inputSigId,rowIndex,colIndex,gsVertexAxis)
  %2 = call float @dx.op.loadInput.f32(i32 4, i32 0, i32 0, i8 1, i32 undef), !dbg !73 ; line:3 col:20  ; LoadInput(inputSigId,rowIndex,colIndex,gsVertexAxis)
  %3 = call float @dx.op.loadInput.f32(i32 4, i32 0, i32 0, i8 2, i32 undef), !dbg !73 ; line:3 col:20  ; LoadInput(inputSigId,rowIndex,colIndex,gsVertexAxis)
  call void @dx.op.storeOutput.f32(i32 5, i32 0, i32 0, i8 0, float %1), !dbg !99 ; line:18 col:5  ; StoreOutput(outputSigId,rowIndex,colIndex,value)
  call void @dx.op.storeOutput.f32(i32 5, i32 0, i32 0, i8 1, float %2), !dbg !99 ; line:18 col:5  ; StoreOutput(outputSigId,rowIndex,colIndex,value)
  call void @dx.op.storeOutput.f32(i32 5, i32 0, i32 0, i8 2, float %3), !dbg !99 ; line:18 col:5  ; StoreOutput(outputSigId,rowIndex,colIndex,value)
  call void @dx.op.storeOutput.f32(i32 5, i32 0, i32 0, i8 3, float 0.000000e+00), !dbg !99 ; line:18 col:5  ; StoreOutput(outputSigId,rowIndex,colIndex,value)
  ret void, !dbg !99 ; line:18 col:5

So the _14, _24, _34 indices correspond to the first, second and third row in the fourth column.

Looking at the definition of the members you can see that this is bit-packing those row-major:

!11 = !DIDerivedType(tag: DW_TAG_member, name: "_14", scope: !5, file: !1, line: 5, baseType: !8, size: 32, align: 32, offset: 96, flags: DIFlagPublic)
!15 = !DIDerivedType(tag: DW_TAG_member, name: "_24", scope: !5, file: !1, line: 5, baseType: !8, size: 32, align: 32, offset: 224, flags: DIFlagPublic)
!19 = !DIDerivedType(tag: DW_TAG_member, name: "_34", scope: !5, file: !1, line: 5, baseType: !8, size: 32, align: 32, offset: 352, flags: DIFlagPublic)

However look at the way the test elements are mentioned as members:

  call void @llvm.dbg.value(metadata float %1, i64 0, metadata !78, metadata !88), !dbg !80 ; var:"mymat" !DIExpression(DW_OP_bit_piece, 384, 32) func:"main"
  call void @llvm.dbg.value(metadata float %2, i64 0, metadata !78, metadata !89), !dbg !80 ; var:"mymat" !DIExpression(DW_OP_bit_piece, 416, 32) func:"main"
  call void @llvm.dbg.value(metadata float %3, i64 0, metadata !78, metadata !90), !dbg !80 ; var:"mymat" !DIExpression(DW_OP_bit_piece, 448, 32) func:"main"

These are contiguous in the next-to-final elements of the matrix - if this were row-major that would make them the first elements of the final row.

I also added some constants so you can see the first-row second-column value of 4 and the second-row first-column value of 5, which are laid out column-wise:

  call void @llvm.dbg.value(metadata float 4.000000e+00, i64 0, metadata !78, metadata !79), !dbg !80 ; var:"mymat" !DIExpression(DW_OP_bit_piece, 128, 32) func:"main"
  call void @llvm.dbg.value(metadata float 5.000000e+00, i64 0, metadata !78, metadata !76), !dbg !80 ; var:"mymat" !DIExpression(DW_OP_bit_piece, 32, 32) func:"main"

Actual Behavior
The bitpieces identifying matrix elements are calculated column-wise while the members are row-wise. Either would work, but it's currently inconsistent and the bitpieces are not what I'd expect matching HLSL conventions.

The result of this is any program parsing the debug info that either hardcodes a row-major convention or even does something smart to reflect the layout out of the matrix members by _11, _12 offsets etc will end up transposing the matrix.

Environment

DXC version: trunk on Compiler Explorer
Host Operating System: Compiler Explorer / windows 10

The text was updated successfully, but these errors were encountered:

tex3d · 2024-12-20T01:56:51Z

This bug results from an unfortunate implementation detail where matrix orientations are respected for all matrix declarations, not just used for determining the storage layout for host-visible or groupshared memory. This combined with the fact that there is only one matrix type for row_major and column_major orientations, and the fact that the logical layout in HLSL is always row_major, means that the type layout does not match the dbg.value layout for a column_major matrix. Since column_major is the default, you have to override that one way or another to get a congruent layout between the type's debug info and the dbg.value bit_piece locations.

The easiest workaround is likely using -Zpr to override the default layout globally, then use column_major for each matrix declaration or field used for cbuffer/StructuredBuffer/etc... when column major is desired.

Note that this layout qualifier is dropped from StructuredBuffer template arguments and the ByteAddressBuffer Load method template arguments, so wrapping a matrix field in a struct type is necessary for these cases.

baldurk · 2024-12-20T10:33:29Z

To test I added -Zpr to the godbolt example above and the matrix disappeared entirely from the debug info, which doesn't seem good either - depending on whether that happens for a less trivial example where you'd actually want to see what the matrix contains.

Also the problem is that a tool processing the debug information obviously can't dictate compilation settings to applications or change the shaders that it's debugging. Is there any way to detect this bug and work around it?

baldurk added bug Bug, regression, crash needs-triage Awaiting triage labels Dec 19, 2024

github-project-automation bot added this to HLSL Triage Dec 19, 2024

llvm-beanz added matrix-bug Bugs relating to matrix types debug info Related to debug info generation and removed needs-triage Awaiting triage labels Dec 19, 2024

llvm-beanz added this to the Dormant milestone Dec 19, 2024

llvm-beanz moved this to Triaged in HLSL Triage Dec 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Possible debug info bug with matrix bitpieces #7044

Possible debug info bug with matrix bitpieces #7044

baldurk commented Dec 19, 2024

tex3d commented Dec 20, 2024

baldurk commented Dec 20, 2024

Possible debug info bug with matrix bitpieces #7044

Possible debug info bug with matrix bitpieces #7044

Comments

baldurk commented Dec 19, 2024

tex3d commented Dec 20, 2024

baldurk commented Dec 20, 2024