AssertionError: Dynamic dims not currently supported in tensor_slice function of IREE #268
Comments
@ammarhd Not really related to your issue but did you run into this error?
Also you might want to try
As the error message states, dynamic dims are currently not supported; support was removed in commit 97e0517. The backing APIs have been deprecated for a year and
If support for dynamic dims has been removed, how is one supposed to provide dynamic input to an LLM? Sorry if this is a dumb question, but I would appreciate any pointers or existing examples.
This comment is not correct. Support for the PyTorch pre-release API for specifying dynamic shapes was removed because PyTorch removed it. It had been deprecated for a long time, with the recommendation to support the official API. See PyTorch's documentation: https://pytorch.org/docs/stable/export.html#expressing-dynamism. This is what we support.
We don't have a replacement mechanism for specifying dynamic shapes across turbine functions, except to use old versions. It is likely that stateless_llama can be reworked to the new API, but I haven't looked at it for a long time, since we have been using a more direct approach to torch.export LLM models.
Can you share any relevant examples?
When turbine was first written, the export path did not support mutable arguments or buffers, and this required the metaprogramming hack that stateless llama used to tie KV cache state to a global variable. It had a number of issues, not the least of which was that it used a progression of bigger-by-1 intermediate buffers to get across the torch barrier (this kills caching allocation schemes and costs a lot of perf). It also made unnecessary copies. The new work that my group does uses mutable function arguments to pass in a fixed-size KV cache for in-place operation. I've also been told of another group, whose code is not accessible, that is using torch module-level buffers to the same effect. Our canonical example is still pre-release and more complicated than it should be, but it's here: https://github.com/nod-ai/shark-ai/blob/main/sharktank/sharktank/examples/export_paged_llm_v1.py Note that this uses a paged KV cache by default, which requires a much more complicated inference sequence intended for serving. The flags in there for "direct" cache approximate what stateless llama was doing with a single, linear cache. For even simpler use (and what I think the other group did), you can just create a torch wrapper module that registers a buffer for the KV cache and then slices it to the expected size for interfacing with something like a transformers model. We don't have a public example of that afaik.
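Since there is no public example of the buffer-based wrapper, the following is only a rough sketch of the idea as described: a fixed-size cache registered as module buffers, written in place, and sliced down to the valid prefix. The class name `LinearKVCache`, the tensor layout, and the `start_pos` argument are assumptions made for illustration. The code runs in eager PyTorch; whether it exports cleanly through torch.export or turbine has not been verified here.

```python
import torch


class LinearKVCache(torch.nn.Module):
    """Hypothetical single, linear KV cache held in fixed-size buffers."""

    def __init__(self, max_seq_len=128, num_heads=4, head_dim=8):
        super().__init__()
        shape = (1, num_heads, max_seq_len, head_dim)
        # Buffers are allocated once, so no bigger-by-1 intermediates.
        self.register_buffer("k_cache", torch.zeros(shape))
        self.register_buffer("v_cache", torch.zeros(shape))

    def forward(self, k_new, v_new, start_pos: int):
        # k_new, v_new: [1, num_heads, step, head_dim]
        end_pos = start_pos + k_new.shape[2]
        # In-place writes into the fixed-size buffers.
        self.k_cache[:, :, start_pos:end_pos] = k_new
        self.v_cache[:, :, start_pos:end_pos] = v_new
        # Slice to the valid prefix, e.g. to hand to an attention layer.
        return self.k_cache[:, :, :end_pos], self.v_cache[:, :, :end_pos]
```

A caller would feed the prompt with `start_pos=0` and then one token at a time with an increasing `start_pos`. Note that a plain Python `int` for `start_pos` would be specialized to a constant under export, so a real export path would need a different way to pass the position.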
I encountered a similar issue. Could you share your final solution?
We actually had this working: stateless_llama.py ran fine with shark_turbine up until last week. Then we suddenly started hitting an error. Unfortunately, some of our libraries were not pinned to specific versions, and I assume a few updates released last week broke it. I tried older versions of iree-turbine and torch but could not get it working again.
We are now getting the following error:
... Traceback (most recent call last): File "/app/app/services/../pipelines/llama_2_pipeline/run_pipeline.py", line 92, in slice_up_to_step\n sliced = IREE.tensor_slice(\n ...
... File "/usr/local/lib/python3.10/dist-packages/iree/turbine/aot/support/procedural/iree_emitter.py", line 285, in tensor_slice\n result.set_dynamic_dim_values(result_dynamic_dims)\n ...
... File "/usr/local/lib/python3.10/dist-packages/iree/turbine/aot/support/procedural/primitives.py", line 168, in set_dynamic_dim_values\n assert len(values) == 0, "Dynamic dims not currently supported"\nAssertionError: Dynamic dims not currently supported\n'} ...
I used the following versions and also tested others:
torch 2.5.1 also tested with 2.4.1
iree-base-compiler 2.9.0
iree-base-runtime 2.9.0
iree-turbine 2.9.0 also tried with 2.5 and 2.3
python 3.10 and 3.11
rocm/dev-ubuntu-22.04:6.0.2 and 6.1
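Because the breakage above is attributed to unpinned dependencies, a small script like the one below can record what is actually installed in a working environment. This is only a sketch: the package list mirrors the names in this report, and the helper names `installed_versions` and `as_pins` are made up for illustration.

```python
from importlib import metadata

# Distribution names as listed in this report.
PACKAGES = ("torch", "iree-base-compiler", "iree-base-runtime", "iree-turbine")


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def as_pins(versions):
    """Format the installed packages as 'name==version' requirement lines."""
    return [f"{name}=={ver}" for name, ver in versions.items() if ver is not None]
```

Running `print("\n".join(as_pins(installed_versions())))` in a known-good environment gives lines that can be pasted into a requirements file.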