[[ these are an initial draft that was edited during a meeting ]]
# Places we'd need to account for dynamic shapes

## TransformAttr, TransformMapAttr
The good news is that affine maps let us have "symbols", which act
like constants for analysis purposes but are bound at runtime. We
should be able to reuse this mechanism for transform maps.
In an individual coordinate transform (the `TransformAttr`), a symbol
can appear in the parameter lists to transformations (like `Merge`,
`Unmerge`, or `Pad`). Symbols can't appear in the dimension list.
On a technical level, this means that the `params` component of a
`TransformAttr` will shift from `ArrayRef<int64_t>` to some flavor of
`ArrayRef<ConstantOrSymbolIndex>`. Those will be used when constructing
the affine map.
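As a hypothetical sketch of what that parameter type could look like (the name `ConstantOrSymbolIndex` is from the text above; everything else here is invented for illustration), a parameter is either a compile-time constant or a reference to the i'th affine symbol, bound at runtime:

```cpp
#include <cstdint>
#include <variant>
#include <vector>

// A transform parameter: either a static constant or a reference to the
// i'th affine symbol, whose value is only known at runtime.
struct SymbolRef {
  unsigned index;
};
using ConstantOrSymbolIndex = std::variant<int64_t, SymbolRef>;

// Resolve a parameter list against concrete runtime symbol values, as one
// would when materializing the affine map at a call site.
std::vector<int64_t> resolve(const std::vector<ConstantOrSymbolIndex> &params,
                             const std::vector<int64_t> &symbolValues) {
  std::vector<int64_t> out;
  for (const ConstantOrSymbolIndex &p : params) {
    if (const int64_t *c = std::get_if<int64_t>(&p))
      out.push_back(*c); // static constant: use it directly
    else
      out.push_back(symbolValues.at(std::get<SymbolRef>(p).index));
  }
  return out;
}
```

For example, a `Merge{Symbol{0}, 3, 3}` would carry params `{SymbolRef{0}, 3, 3}` and resolve them once the symbol's value is bound.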
TransformMapAttr and its builders will also need a few extensions.
Firstly, in the computation of lower or upper bounds, we'll follow
the tensor and memref convention of using -1 (printed as ?) for
bounds of unknown size.
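A minimal sketch of that convention, assuming the -1 sentinel: any bound arithmetic has to propagate the sentinel, so (for example) the upper bound of a merged dimension is only static when all of its factors are.

```cpp
#include <cstdint>

// Dynamic sizes use the sentinel -1, printed as '?'.
constexpr int64_t kDynamic = -1;

// Upper bound of a dimension merged from two others: any unknown factor
// makes the product unknown.
int64_t mergedUpperBound(int64_t a, int64_t b) {
  if (a == kDynamic || b == kDynamic)
    return kDynamic;
  return a * b;
}
```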
Because, for example, we may need to add symbols when dealing with
broadcasts, it may be a good idea to keep `numSymbols` as an
attribute on `TransformMapAttr`s and their builders, so as to ensure
uniformity within a transform stack. However, I'm not entirely sure
that this'll be required.
Note: my `PadTo{}` (below) is actually `Align{}` (ex. `Align{64}`).
## Transform map validity (Pad{} and Embed{})
The current Pad{l1, r1, l2, r2, ...} operator in transform maps
comes from convolutions. Its validity check rests on being able
to pull a lower bound from the transform map to see if the
input is in the padded region. We also have similar checks to
prevent the results of Embed{}s that go negative or out of
bounds from being propagated further.
The first problem we have is that `Pad{}` is specified in terms of
how much padding to add to the left and right of the given dimension.
When padding a dynamic dimension out to a block size, we fundamentally
don't know those quantities. We could compute them and make each a
symbol, but that'll get irritating, lead to a proliferation of symbols,
and generally be a pain ... but it's not off the table.
Therefore, I propose `PadTo{N1, N2, ...}`, where the `N_k` are block sizes
or other such constants. The semantics of `PadTo{}` is that the k'th lower
dimension is padded out to the next multiple of `N_k`. We can then impose
the restriction that `N_k` be a constant, and make a similar requirement
on the inputs to `Pad{}`, just to stay sane.
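A sketch of those proposed `PadTo{N}` semantics (the helper name is invented):

```cpp
#include <cstdint>

constexpr int64_t kDynamic = -1; // the '?' convention from above

// PadTo{N}: pad a dimension of length `len` out to the next multiple of
// the static constant N. `len` may be dynamic, in which case the padded
// length is dynamic too.
int64_t padToLength(int64_t len, int64_t n) {
  if (len == kDynamic)
    return kDynamic;
  return ((len + n - 1) / n) * n; // round up to a multiple of n
}
```

Note that the right padding a `Pad{l, r}` would need here, `padToLength(len, n) - len`, is a runtime value when `len` is dynamic, which is exactly why specifying the target multiple instead of the padding amounts avoids a symbol.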
However, the higher-level problem of validity checking remains.
I claim that there's a reasonably straightforward way to fix this, namely
defining

```cpp
OpFoldResult getLengthOfDim(unsigned dim, ArrayRef<TransformMapAttr> remainingMaps,
                            Value underlyingBuffer);
```

which returns the length `dim` can have as an input to `remainingMaps`
before reaching `underlyingBuffer`. If the dimension isn't static there,
this'll eventually lead to a call to `tensor.dim`/`memref.dim` or some
other similar construct.
The method in question will only really be needed during validity checks.
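A rough model of that query (the signature above is from the proposal; the machinery below is invented, and the walk through `remainingMaps` is elided). The result is `OpFoldResult`-like: either a static length, or a placeholder for the IR a real implementation would emit, e.g. a `memref.dim` on the underlying buffer:

```cpp
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Either a static length or a stand-in for a runtime dim query.
using LengthOrValue = std::variant<int64_t, std::string>;

constexpr int64_t kDynamic = -1;

LengthOrValue getLengthOfDim(unsigned dim,
                             const std::vector<int64_t> &underlyingShape) {
  int64_t len = underlyingShape.at(dim);
  if (len != kDynamic)
    return len; // static: the validity check can proceed at compile time
  // Not static: the validity check must fall back to a runtime query.
  return std::string("memref.dim %buf, " + std::to_string(dim));
}
```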
## `rock.transform` and `rock.transforming_for`
There are places in our code where we could end up computing
dimension lengths from other lengths - e.g.
... Ok, no, this needs to be contemplated overall; how and where the size
expressions go is an open question.
... Darn tempted to move it all off of `transforming_for` and
make that thing just take a transformed `Value` and some initial
coordinates.
## Sizes, v2
So, since Embed is a general linear combination with user-provided size
semantics, we may want to move to storing sizes in general.
My proposal here is

```mlir
rock.transform #transform_map(%val)
    symbols [%...] // for Merge{Symbol{0}, 3, 3}, ex.
    // in_dims [%dynDim1, %dynDim2, ...] out_dims [%dynDim1]
    // not needed per below
    : memref<2x?xf32> to memref<2x?x?xf32>
```
Or (per Simon): `Embed` is changed to take extra parameters

```
Embed{lowerLen, coeff1, size1, coeff2, size2, ...}
```

and those are symbols. Now size computation works and we just need symbols.

Action, me: check Broadcast

Example:

```
Embed{1, 1} ["y", "ho"] -> ["hi"]
len(y) = dim(%filter[y])
len(ho) = dim(%output[h])
len(hi) = dim(%input[h]) // assume no padding on input
```
```
gemmN <- Merge("no", "ho", "wo")
gemmNPadded <- PadTo{BlockSize}(gemmN) // must check validity here
// so must know size of gemmN, so must know size of ho
// but size of ho isn't a function of the size of hi
```
and that `rock.transforming_for` just takes a pointer to the transform stack:

```mlir
rock.transforming_for (%lower1, ...) = %transformedValue[%coord0, %coord1], ...
```

New API:

```cpp
ArrayRef<rock::TransformOp>, Value, ... untransform(Value transformed);
optional<ArrayRef, ...> staticUntransform(Value transformed);
```
## Computing launch sizes
Good news: block size computations proceed as normal, since
those are functions of the tuning parameters.
Bad news: grid size is a function of dynamic inputs!
So we don't actually get a known grid_size.
I suggest that we allow the grid size to be an affine expression
of the input dimensions and attach that to the function.
Per IREE, this can just be an `@func` reference to call at runtime.
Simon: get docs on the IREE interface in case we want to follow it.
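A sketch of the split described above: block size depends only on static tuning parameters, while grid size also depends on dynamic problem sizes, so it must be an expression the host evaluates at launch time. All names here (`gemmM`, `mPerBlock`, ...) are illustrative, not actual rock attributes:

```cpp
#include <cstdint>

int64_t ceilDiv(int64_t a, int64_t b) { return (a + b - 1) / b; }

// This is the kind of affine expression of the input dimensions that
// could be attached to the function (or wrapped in an @func to call at
// runtime): tiles along each gemm dimension, one workgroup per tile.
int64_t computeGridSize(int64_t gemmM, int64_t gemmN, int64_t mPerBlock,
                        int64_t nPerBlock) {
  return ceilDiv(gemmM, mPerBlock) * ceilDiv(gemmN, nPerBlock);
}
```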
## `blockwise_fill`
Will need to be adjusted to assume the blockwise-ness, or otherwise
not to rely on `getNumElements()` working, in its validation.
## The utility kernels in general
These kernels will, in the dynamic case, end up with a non-constant
trip count in their outer loops. This is fine.
## Vectorization
Easy answer: a dynamic dimension's vector length and alignment
requirement are 1. Non-static-ness, just like padding by 1,
makes for unvectorizability.
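That easy answer can be sketched as follows (assumed model, invented helper names): the vector length along a dimension must divide its length so a vector never straddles the dimension boundary, and a dynamic length guarantees no divisibility, so it contributes 1, just as padding by 1 would.

```cpp
#include <cstdint>
#include <numeric>

constexpr int64_t kDynamic = -1;

// Largest usable vector length along a dimension, capped at maxWidth:
// a divisor of the dimension's length, or 1 when the length is dynamic.
int64_t maxVectorLen(int64_t dimLen, int64_t maxWidth) {
  if (dimLen == kDynamic)
    return 1; // no divisibility guarantee: scalar access only
  return std::gcd(dimLen, maxWidth);
}
```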
## Index size / the buffer trick
The various "can this be 64-bit indexing" queries will have to return
"yes" when there's a dynamic shape involved.
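A sketch of why (the 2^31 limit below is an assumed placeholder for whatever the real 32-bit-indexing criterion is): once any dimension is dynamic, the element count can no longer be bounded statically, so the query must assume the worst.

```cpp
#include <cstdint>
#include <vector>

constexpr int64_t kDynamic = -1;

// "Might this need 64-bit indexing?" Any dynamic extent forces "yes".
bool mightNeed64BitIndexing(const std::vector<int64_t> &shape) {
  int64_t numElements = 1;
  for (int64_t d : shape) {
    if (d == kDynamic)
      return true; // unknown extent: assume the worst
    numElements *= d;
  }
  return numElements > (int64_t{1} << 31);
}
```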