what's the function of the learnable positional_embedding in the class DiffusionSceneLayout_DDPM? #38

Closed
gyula-coder opened this issue Jul 19, 2024 · 3 comments

Comments

@gyula-coder

While reading the training code, I found a learnable positional_embedding that is passed to the first block of Unet1D's downs, mid_blocks, and ups.

For example, here is the loop over Unet1D's downs:

    for block0, block1, attncross, block2, attn, downsample in self.downs:
        x = block0(x, context) 
        x = block1(x, t)
        h.append(x)

        x = attncross(x, context_cross) if self.text_condition else attncross(x)
        x = block2(x, t)
        x = attn(x)
        h.append(x)

        x = downsample(x)

Here, context is the instan_condition_f computed by the following code:

    instance_indices = torch.arange(self.sample_num_points).long().to(self.device)[None, :].repeat(batch_size, 1)
    instan_condition_f = self.positional_embedding[instance_indices, :]

What is the purpose of positional_embedding? Thanks for your help.
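To make the indexing in the question concrete, here is a minimal standalone sketch of what those two lines produce. The sizes and the way `positional_embedding` is created are assumptions for illustration only; the repository may initialize the parameter differently.

```python
import torch
import torch.nn as nn

batch_size = 2
sample_num_points = 4   # assumed: number of instance slots per scene
instance_emb_dim = 8    # assumed: embedding width

# One learnable vector per instance slot (initialization here is an assumption).
positional_embedding = nn.Parameter(torch.randn(sample_num_points, instance_emb_dim))

# Same indexing as in the question: slot indices 0..N-1, repeated for each batch item.
instance_indices = torch.arange(sample_num_points).long()[None, :].repeat(batch_size, 1)
instan_condition_f = positional_embedding[instance_indices, :]

print(instan_condition_f.shape)  # (batch_size, sample_num_points, instance_emb_dim)

# Every scene in the batch receives the same table; only the slot index varies.
same_as_expand = torch.equal(
    instan_condition_f,
    positional_embedding[None].expand(batch_size, -1, -1),
)
print(same_as_expand)
```

In other words, the lookup simply broadcasts one learned vector per slot across the batch, so the condition depends on the slot index and not on the scene content.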

@gyula-coder
Author

The original Unet1D implementation at https://github.com/lucidrains/denoising-diffusion-pytorch/blob/main/denoising_diffusion_pytorch/denoising_diffusion_pytorch_1d.py has no block0 or attncross. Are these among the innovations of this paper?

@tangjiapeng
Owner

The instance embedding encodes the position of each instance within the sequence.

It helps the denoiser differentiate between object instances.
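As a rough illustration of that point, the toy sketch below shows how a per-slot embedding lets a shared layer tell apart slots that hold identical features. The `shared_layer` stand-in and the choice to add the embedding to the input are assumptions made for this example; the actual block0 in Unet1D may inject the context differently.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_instances, dim = 3, 8

# Hypothetical stand-in for a layer shared across all instance slots (not the real Unet1D).
shared_layer = nn.Linear(dim, dim)

# Three instance slots that happen to hold identical features.
x = torch.ones(1, num_instances, dim)

# Learnable per-slot embedding, analogous in role to positional_embedding.
instance_embedding = nn.Parameter(torch.randn(num_instances, dim))

out_plain = shared_layer(x)                                # no slot information
out_with_emb = shared_layer(x + instance_embedding[None])  # slot information added (assumed injection)

# Without the embedding, identical inputs give identical outputs for every slot.
plain_identical = torch.allclose(out_plain[0, 0], out_plain[0, 1])
# With the embedding, each slot gets its own output.
emb_identical = torch.allclose(out_with_emb[0, 0], out_with_emb[0, 1])
print(plain_identical, emb_identical)
```

Because the embedding is learned, the network is free to decide how strongly each slot should be separated from the others during training.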

@gyula-coder
Author

Thank you very much for your help!
