
Move tensor_set/scale/cast/copy to gtest #3416

Merged: 13 commits merged into develop from C7/simple_tensorops_gtest on Dec 5, 2024

Conversation

CAHEK7 (Contributor) commented Dec 1, 2024

This PR moves tensor_set/scale/cast/copy to the gtest framework.
It introduces fast and flexible CPU verification routines for unary/binary/ternary operations (@Vsevolod1983 please pay attention to this; it's very useful for the ternary tensor_ops conversion).
Those routines are up to 4x faster for unary operations, 3x faster for binary, and almost 2x faster for ternary operations (probably more; the gains grow with tensor size, and I only checked the sizes used in the test suites). A sketch of the general idea follows below.

It introduces more size_t-friendly configs for #3393 (@Vsevolod1983 please take a look).

It also fixes a few obvious bugs in the Cast operations.
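For context, a minimal sketch of the idea behind such verification routines (the name apply_elementwise and the flat-buffer traversal are hypothetical; the PR's actual routines, such as operate_over_subtensor quoted further down, also honor tensor strides and offsets): one generic driver serves unary, binary, and ternary operations.

#include <cstddef>
#include <vector>

// Hypothetical sketch: apply `op` elementwise over flat buffers.
// One traversal loop is shared by unary/binary/ternary operations;
// the operation itself is a caller-supplied lambda.
template <typename Op, typename Dst, typename... Srcs>
void apply_elementwise(Op op, std::vector<Dst>& dst, const std::vector<Srcs>&... srcs)
{
    for(std::size_t i = 0; i < dst.size(); ++i)
        op(dst[i], srcs[i]...);
}

// Usage:
//   apply_elementwise([](auto& d, auto a) { d = 2 * a; }, out, in0);              // unary
//   apply_elementwise([](auto& d, auto a, auto b) { d = a + b; }, out, in0, in1); // binary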

CAHEK7 force-pushed the C7/simple_tensorops_gtest branch from a4ffd88 to 4af51c7 (December 2, 2024 13:41)
CAHEK7 force-pushed the C7/simple_tensorops_gtest branch from 4af51c7 to a66714d (December 2, 2024 14:03)
CAHEK7 force-pushed the C7/simple_tensorops_gtest branch from 44aa747 to 585ee05 (December 2, 2024 15:33)
CAHEK7 (Contributor, Author) commented Dec 4, 2024

This PR blocks #3424

@@ -2135,8 +2135,10 @@ void CastTensor(const Handle& handle,
MIOPEN_THROW(miopenStatusBadParm, "Tensor dimension sizes unsupported.");
}

auto miopen_alpha = *(static_cast<const float*>(alpha));
Contributor:

Is it possible for alpha to be nullptr here? Or did I miss the check?

BrianHarrisonAMD (Contributor) commented Dec 4, 2024

It looks like it's set everywhere it's used, but it's also surprising that we pass it as a pointer.
It seems like it could be a float in this case.

CAHEK7 (Contributor, Author) commented Dec 4, 2024

I just moved it earlier; previously it was even worse: it did CopyTensor for any alpha, and since I extended the coverage, I found that case and fixed it.
You are right, but I have no time for another review round) I can probably make a subsequent PR for that nullptr check (and the I8 Copy case); see the sketch below.

> It seems like it could be a float in this case.

For all types except double it must be float, and for double it must be double, but MIOpen barely supports double anywhere... Anyway, it can be done later.

Btw, that's why I asked to add an explicit double check to all the isApplicable methods here: https://github.com/ROCm/MIOpen/pull/3346/files#diff-1af35db437d55c5d90c9186b7db9736706d96549c0f20ea3156113db8fdc16e4
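A hedged sketch of what that follow-up nullptr check could look like (not part of this PR; throwing miopenStatusBadParm here is an assumption, chosen to match the throw a few lines above):

// Guard before dereferencing: reject a null alpha explicitly
// instead of crashing on the cast below.
if(alpha == nullptr)
{
    MIOPEN_THROW(miopenStatusBadParm, "alpha must not be null in CastTensor.");
}
auto miopen_alpha = *(static_cast<const float*>(alpha));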

@@ -35,7 +35,7 @@ struct conv2d_bias_driver : public conv_bias_driver<T>
tensor_elem_gen_checkboard_sign{}(is...);
};

-    this->add(this->output, "output", this->get_tensor(get_inputs, gen_value));
+    this->add(this->output, "output", this->get_tensor(get_inputs<int>, gen_value));
Contributor:

Does this change relate to the purpose of this PR?

CAHEK7 (Contributor, Author):

Yep, without the explicit <int> it won't compile: the compiler can't infer the template argument when the template name is passed as a callable, even though there is a default one.
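For illustration, a minimal standalone example of the rule in play (the names add_case and this get_inputs are hypothetical stand-ins): a function template's name cannot be passed to a deduced parameter, and the default template argument does not rescue deduction.

#include <vector>

template <typename T = int>
std::vector<T> get_inputs() { return {}; }

template <typename F>
void add_case(F f) { f(); }

int main()
{
    // add_case(get_inputs);   // error: cannot deduce F from a template name,
    //                         // even though T defaults to int
    add_case(get_inputs<int>); // OK: names one concrete function
}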

Comment on lines +126 to +150
if(clamp)
{
    // Clamped path: scale by alpha, then clamp to the destination
    // type's maximum so the narrowing cast cannot overflow.
    operate_over_subtensor(
        [alpha, clampVal = static_cast<float>(std::numeric_limits<DstType>::max())](
            auto& dst, auto src) {
            dst = std::min(static_cast<float>(src) * alpha, clampVal);
        },
        dstSuperCpu,
        srcSuperCpu,
        dstDesc,
        srcDesc,
        dstOffset,
        srcOffset);
}
else
{
    // Unclamped path: plain scale-and-convert.
    operate_over_subtensor(
        [alpha](auto& dst, auto src) { dst = static_cast<float>(src) * alpha; },
        dstSuperCpu,
        srcSuperCpu,
        dstDesc,
        srcDesc,
        dstOffset,
        srcOffset);
}
Contributor:

Looks like we gained coverage here.
I think the old tests only checked the clamp case.

Nice!

Comment on lines -2149 to +2153
-    std::string network_config = "cast " + std::to_string(dstDesc_flat.GetType());
+    // TODO: make proper network config
+    std::string network_config = "cast " + std::to_string(srcDesc_flat.GetType()) +
+                                 std::to_string(dstDesc_flat.GetType());
BrianHarrisonAMD (Contributor) commented Dec 4, 2024

Was this wrong before?

Looks like it used to have a caching key based on the dst type only, but uses the src type as part of the compilation params.

CAHEK7 (Contributor, Author) commented Dec 4, 2024

Before, it did not use the src type in the network config, so it failed if you called an int -> float conversion and then a float -> int conversion (nobody ever tested mixed calls, and we had the same problem in other algorithms). In both cases it ran the int -> float kernel, because that one had been added first. Now it calls the appropriate kernels.

CAHEK7 (Contributor, Author):

Btw, that's one of the "gtest single binary" problems: we may get stuck with the first added kernel.

CAHEK7 (Contributor, Author):

The proper fix is to include all the compile-time and performance-critical parameters in the network config. Then, in the first case, it will choose a properly compiled kernel, and in the second, the most performant one (if there are any). I haven't analyzed it deeply; I just fixed the part that the wider coverage uncovered. The sketch below illustrates the failure mode.

Comment on lines +291 to +293
X_INSTANTIATE_COPY(FP32, float);
X_INSTANTIATE_COPY(FP16, float16);
X_INSTANTIATE_COPY(BFP16, bfloat16);
BrianHarrisonAMD (Contributor) commented Dec 4, 2024

Why are there no I8 tests for Copy?

CAHEK7 (Contributor, Author):

It was forgotten.
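Presumably the follow-up would add one more instantiation line next to the ones above; assuming int8_t is the host type this macro pairs with I8, something like:

X_INSTANTIATE_COPY(I8, int8_t);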

BrianHarrisonAMD (Contributor) left a review comment

Looks good to me.
I think this expands the coverage in our CI, and my comments are not blocking.

BrianHarrisonAMD merged commit a325bdb into develop on Dec 5, 2024
30 of 144 checks passed
BrianHarrisonAMD deleted the C7/simple_tensorops_gtest branch on December 5, 2024 14:57