Compute index type by bounding index expressions #3601

jacobhinkle · 2024-12-17T14:50:32Z

In #3595 we are seeing that large matmul problems use int64_t indexing even if all the global memory transfers are done using TMA instead of vectorized accesses. Since TMA can use 2D indexing in these cases, it is usually safe to use int32_t indexing instead.

Currently we compute the index type by looking at all the input tensors in a KernelArgumentHolder and finding the largest index that could be used to index an element of any of those tensors. Ideally, we would instead bound each expression in our lowered kernel and, if all of those bounds are within the range of an int32_t, use Int32 as the index type. To do this, we could implement some limited interval arithmetic on Val* and evaluate bounds for all scalars in the kernel, stopping as soon as an upper bound indicates overflow.
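
To make the idea concrete, here is a minimal sketch of the interval arithmetic, written in Python for brevity. The `Interval` class and the `fits_int32` helper are hypothetical names for illustration and are not part of nvFuser; a real implementation would walk the definitions of `Val*` scalars in the lowered kernel and would need more operations than the two shown.

```python
from dataclasses import dataclass

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


@dataclass(frozen=True)
class Interval:
    """Closed integer interval [lo, hi] bounding a scalar expression."""

    lo: int
    hi: int

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Interval") -> "Interval":
        # The extrema of a product of two intervals lie at the corner points.
        corners = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(corners), max(corners))


def fits_int32(bounds) -> bool:
    """Return True if every bounded expression stays within int32 range.

    all() short-circuits, so checking stops at the first overflow.
    """
    return all(INT32_MIN <= b.lo and b.hi <= INT32_MAX for b in bounds)


# Hypothetical example: a row-major M x K operand with M = K = 65536.
M, K = 65536, 65536
row = Interval(0, M - 1)
col = Interval(0, K - 1)
stride = Interval(K, K)

# Flat offset into global memory, as a vectorized access would compute it.
linear_index = row * stride + col
```

In this example `fits_int32([row, col])` is true while `fits_int32([linear_index])` is false: the flat offset can reach 2^32 - 1, but each of the two coordinates that a 2D TMA access would use stays below 65536.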


jacobhinkle · 2024-12-17T15:11:43Z

This type of analysis could also allow us to mix index types within the kernel by setting the dtype to Int32 for tensors whose indices we can bound below the overflow threshold.
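
As a rough illustration of per-tensor selection, the sketch below picks an index dtype for each tensor from an upper bound on its index. The function name `choose_index_dtypes`, the tensor names, and the bounds are all assumptions made up for this example, not nvFuser API.

```python
INT32_MAX = 2**31 - 1


def choose_index_dtypes(max_index_by_tensor: dict[str, int]) -> dict[str, str]:
    """Map each tensor name to "Int32" if its largest index fits, else "Int64"."""
    return {
        name: "Int32" if max_index <= INT32_MAX else "Int64"
        for name, max_index in max_index_by_tensor.items()
    }


# Hypothetical upper bounds on the largest index used for each tensor.
bounds = {
    "A": 65536 * 65536 - 1,  # 2**32 - 1, does not fit in int32
    "bias": 65536 - 1,       # fits in int32
}
dtypes = choose_index_dtypes(bounds)
```

Here only `A` would need Int64 indexing, while `bias` could be indexed with Int32 in the same kernel.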

jacobhinkle added the TMA label Dec 17, 2024

jacobhinkle mentioned this issue Dec 17, 2024

Support TMA with Int64 indexing #3595

Open
