
Required GPU memory depends on the video length. #31

Open
ysig opened this issue Sep 26, 2023 · 1 comment

ysig commented Sep 26, 2023

I've managed to run run_tokenflow_pnp.py on a small excerpt of my video (5 s), and it looks really cool, but when I run it on the full one (5 min) it crashes with a CUDA OOM error even when I drop the batch size down to 1.

This memory scaling with video length, probably caused by the extended attention, seems like a major limitation of the method, and it is not highlighted in the discussion section or anywhere else in the paper (as far as I can tell).

Is it possible to offload part of the attention computation to the CPU so that the number of frames is not a bottleneck?

eps696 commented Oct 3, 2023

That's exactly what I did in #32 (in a way).
It handled longer sequences, but not unlimited ones, since the whole attention data still has to be fed back to the GPU before the latent denoising step (and I didn't manage to make that step work in batches).
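
For reference, here is a minimal sketch of what such offloading could look like, assuming PyTorch: keep the extended-attention key/value cache on the CPU and stream it through the GPU in chunks with an online (FlashAttention-style) softmax, so peak GPU memory no longer grows with the number of frames. The names here (`chunked_cross_frame_attention`, `k_cpu`, `v_cpu`, `chunk_size`) are illustrative and are not taken from the TokenFlow codebase or from #32.

```python
import torch

def chunked_cross_frame_attention(q, k_cpu, v_cpu, chunk_size=1024):
    """Attention with the key/value cache resident on the CPU.

    q:            (heads, q_len, dim) on the GPU
    k_cpu, v_cpu: (heads, kv_len, dim) on the CPU

    Streams K/V through GPU memory chunk by chunk, so peak GPU usage is
    independent of kv_len (i.e. of the total number of frames).
    """
    device = q.device
    scale = q.shape[-1] ** -0.5
    heads, q_len, dim = q.shape
    # Running statistics for a numerically stable online softmax.
    m = torch.full((heads, q_len, 1), float("-inf"), device=device)  # running max
    denom = torch.zeros((heads, q_len, 1), device=device)            # softmax denominator
    acc = torch.zeros((heads, q_len, dim), device=device)            # unnormalized output
    for start in range(0, k_cpu.shape[1], chunk_size):
        # Move only one chunk of the cache onto the GPU at a time.
        k = k_cpu[:, start:start + chunk_size].to(device, non_blocking=True)
        v = v_cpu[:, start:start + chunk_size].to(device, non_blocking=True)
        s = torch.einsum("hqd,hkd->hqk", q, k) * scale  # scores for this chunk
        m_new = torch.maximum(m, s.amax(dim=-1, keepdim=True))
        correction = torch.exp(m - m_new)               # rescale previous chunks
        p = torch.exp(s - m_new)
        denom = denom * correction + p.sum(dim=-1, keepdim=True)
        acc = acc * correction + torch.einsum("hqk,hkd->hqd", p, v)
        m = m_new
    return acc / denom
```

In practice `k_cpu`/`v_cpu` would be allocated in pinned memory (`tensor.pin_memory()`) so the `non_blocking=True` transfers can overlap with compute. This follows the spirit of the workaround above, but it still trades GPU memory for PCIe bandwidth, so it gets slower as the video gets longer.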
