Fine-tune graph capture and device code testing #2245
Most tests are repeated three times: for host, device, and graph capture. This effectively triples the workload, both compile time and execution time. In my opinion, the device and graph capture tests often exercise cases that the host tests already cover, and in many cases they don't improve coverage. Device and graph capture tests should be orthogonal to the host tests and evaluate all code paths that make sense for them. One example: does it make sense to test a routine with more than 2^31 - 1 items for device and graph capture when the same code path is already evaluated by the host test?
Replies: 1 comment
I agree, it makes sense to keep extensive testing for host usage and a "light" version for device and graph capture. The device and graph capture tests are usually about testing for successful compilation. So the requirement for "light" testing of device and graph capture would be to cover a reasonably wide set of compile-time workloads. For the mentioned case of items > 2^31 - 1, I'd suggest having one device and graph capture test for the 64-bit offset type, but the runtime value of `num_items` doesn't have to exceed 2^31 - 1, because I'd estimate the chances of execution-space-specific code paths affecting runtime behavior as approaching zero. On the technical side, we can follow some existing tests an…
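To make the 64-bit offset suggestion concrete, here is a minimal host-only C++ sketch of the idea. It is not CUB code and does not use the actual test harness: `sum_items` and `light_check_64bit_offset` are hypothetical names invented for illustration, standing in for any routine that is templated on its offset type. The point it tries to show is that the 64-bit instantiation gets compiled and run even though the runtime `num_items` stays small.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for an algorithm templated on its offset type.
// This is NOT a CUB API; it only mimics a routine where num_items has a
// compile-time type (OffsetT) and a separate runtime value.
template <typename OffsetT>
std::int64_t sum_items(const int* data, OffsetT num_items)
{
  static_assert(std::is_integral<OffsetT>::value, "OffsetT must be an integer type");
  std::int64_t total = 0;
  for (OffsetT i = 0; i < num_items; ++i)
  {
    total += data[i];
  }
  return total;
}

// "Light" check: instantiate the 64-bit offset code path at compile time,
// but keep the runtime problem size far below 2^31 - 1.
inline std::int64_t light_check_64bit_offset()
{
  const std::vector<int> data(1000, 1);
  const std::int64_t num_items = static_cast<std::int64_t>(data.size());
  return sum_items<std::int64_t>(data.data(), num_items);
}
```

Under this assumption, an extensive host test would still be the one to sweep large runtime sizes, while the device and graph capture variants would only need a check shaped like `light_check_64bit_offset()` per offset type.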