Compute index type by bounding index expressions #3601

Open
jacobhinkle opened this issue Dec 17, 2024 · 1 comment

@jacobhinkle
Collaborator

In #3595 we are seeing that large matmul problems use int64_t indexing even if all the global memory transfers are done using TMA instead of vectorized accesses. Since TMA can use 2D indexing in these cases, it is usually safe to use int32_t indexing instead.

Currently we compute the index type by looking at all the input tensors in a KernelArgumentHolder and finding the largest index that could be used to index an element of any of those tensors. Instead, we would ideally bound each expression in our lowered kernel and, if all of those bounds are within the range of an int32_t, use Int32 as the index type. To do this we could implement some limited interval arithmetic on Val* and evaluate bounds for all scalars in the kernel, stopping as soon as an upper bound indicates overflow.
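
To make the idea concrete, here is a minimal sketch of the kind of interval arithmetic described above. It is not nvFuser code: `Interval`, `add`, `mul`, `fitsInt32`, `linearIndexBound`, and `coordsFitInt32` are hypothetical names, it works on plain `int64_t` values rather than `Val*`, and it assumes a contiguous row-major matrix whose 2D accesses only ever form the per-dimension coordinates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical closed interval [lo, hi] of values a scalar may take.
struct Interval {
  int64_t lo;
  int64_t hi;
};

// Bound of a + b. Assumes the bounds themselves do not overflow int64_t.
inline Interval add(Interval a, Interval b) {
  return {a.lo + b.lo, a.hi + b.hi};
}

// Bound of a * b: the extremes are among the four corner products.
inline Interval mul(Interval a, Interval b) {
  const int64_t c[4] = {a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi};
  Interval r{c[0], c[0]};
  for (int64_t v : c) {
    r.lo = std::min(r.lo, v);
    r.hi = std::max(r.hi, v);
  }
  return r;
}

// True if every value in the interval is representable as int32_t.
inline bool fitsInt32(Interval x) {
  return x.lo >= std::numeric_limits<int32_t>::min() &&
         x.hi <= std::numeric_limits<int32_t>::max();
}

// Bound of the linear index row * cols + col for a rows x cols matrix.
inline Interval linearIndexBound(int64_t rows, int64_t cols) {
  assert(rows > 0 && cols > 0);
  const Interval row{0, rows - 1};
  const Interval col{0, cols - 1};
  return add(mul(row, Interval{cols, cols}), col);
}

// With 2D coordinates, each coordinate is bounded separately.
inline bool coordsFitInt32(int64_t rows, int64_t cols) {
  assert(rows > 0 && cols > 0);
  return fitsInt32(Interval{0, rows - 1}) && fitsInt32(Interval{0, cols - 1});
}
```

For a 65536 x 65536 matrix, `linearIndexBound` has an upper bound of 2^32 - 1, which does not fit in `int32_t`, while each 2D coordinate is at most 65535 and does. A real implementation would walk the lowered kernel's scalars and could return early at the first bound that fails `fitsInt32`.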

@jacobhinkle
Collaborator Author

This type of analysis could also allow us to mix index types within the kernel by setting the dtype to Int32 for tensors whose index expressions we can bound below the overflow threshold.
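
As a rough illustration of mixing index types, the sketch below picks an index type per tensor from a precomputed upper bound. Everything here is hypothetical (`IndexType`, `TensorIndexBound`, `pickIndexType`, `pickIndexTypes` are not nvFuser APIs), and it assumes indices are non-negative so that only the upper bound matters.

```cpp
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Hypothetical stand-in for the kernel's index dtype choice.
enum class IndexType { Int32, Int64 };

// Hypothetical per-tensor summary: the largest value any index expression
// for this tensor can reach, as produced by a bounds analysis.
struct TensorIndexBound {
  std::string name;
  int64_t max_index;
};

// Use Int32 only when the bound is below the int32_t overflow threshold.
inline IndexType pickIndexType(const TensorIndexBound& t) {
  return t.max_index <= std::numeric_limits<int32_t>::max() ? IndexType::Int32
                                                            : IndexType::Int64;
}

// Choose an index type independently for each tensor in the kernel.
inline std::vector<IndexType> pickIndexTypes(
    const std::vector<TensorIndexBound>& tensors) {
  std::vector<IndexType> result;
  result.reserve(tensors.size());
  for (const auto& t : tensors) {
    result.push_back(pickIndexType(t));
  }
  return result;
}
```

Under this scheme a kernel with one very large operand would pay for 64-bit index arithmetic only on that operand, while smaller tensors keep 32-bit indices.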
