Is there a size limit for images used to train few-shot-vid2vid pose? #159

Open
NeaRBrotheR opened this issue Aug 2, 2022 · 0 comments

NeaRBrotheR commented Aug 2, 2022

Hello,
I tried to run the command below:
python train.py --single --config configs/projects/fs_vid2vid/youtube_dancing/test.yaml

This test.yaml is from https://github.com/NVlabs/imaginaire/issues/106#issuecomment-966725785
Both the config and its dataset work; the images in that dataset are 380×380.

However, the same config does not work with my own dataset, whose images are 1920×1080.

Could this be an issue with the image size, or is there some other problem?

Using random seed 2
Training with 1 GPUs.
Make folder logs/2022_0802_1815_50_test
2022-08-02 18:15:50.853643: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1
cudnn benchmark: True
cudnn deterministic: False
Creating metadata
['images', 'poses-openpose']
Data file extensions: {'images': 'jpg', 'poses-openpose': 'json'}
Searching in dir: images
Found 1 sequences
Found 5350 files
Folder at dataset/raw/images opened.
Folder at dataset/raw/poses-openpose opened.
Num datasets: 1
Num sequences: 1
Max sequence length: 5350
Epoch length: 1
Creating metadata
['images', 'poses-openpose']
Data file extensions: {'images': 'jpg', 'poses-openpose': 'json'}
Searching in dir: images
Found 1 sequences
Found 5350 files
Folder at dataset/raw/images opened.
Folder at dataset/raw/poses-openpose opened.
Num datasets: 1
Num sequences: 1
Max sequence length: 5350
Epoch length: 1
Train dataset length: 1
Val dataset length: 1
Using random seed 2
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True
    computed_on_the_fly: False
    is_mask: False
    pre_aug_ops: None
    post_aug_ops: None for input.
	Num. of channels in the input image: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True
    computed_on_the_fly: False
    is_mask: False
    pre_aug_ops: None
    post_aug_ops: None for input.
Concatenate poses-openpose:
    ext: json
    num_channels: 3
    interpolator: None
    normalize: False
    pre_aug_ops: decode_json, convert::imaginaire.utils.visualization.pose::openpose_to_npy
    post_aug_ops: vis::imaginaire.utils.visualization.pose::draw_openpose_npy
    computed_on_the_fly: False
    is_mask: False for input.
	Num. of channels in the input label: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True
    computed_on_the_fly: False
    is_mask: False
    pre_aug_ops: None
    post_aug_ops: None for input.
	Num. of channels in the input image: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True
    computed_on_the_fly: False
    is_mask: False
    pre_aug_ops: None
    post_aug_ops: None for input.
	Num. of channels in the input image: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True
    computed_on_the_fly: False
    is_mask: False
    pre_aug_ops: None
    post_aug_ops: None for input.
	Num. of channels in the input image: 3
Initialized temporal embedding network with the reference one.
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True
    computed_on_the_fly: False
    is_mask: False
    pre_aug_ops: None
    post_aug_ops: None for input.
Concatenate poses-openpose:
    ext: json
    num_channels: 3
    interpolator: None
    normalize: False
    pre_aug_ops: decode_json, convert::imaginaire.utils.visualization.pose::openpose_to_npy
    post_aug_ops: vis::imaginaire.utils.visualization.pose::draw_openpose_npy
    computed_on_the_fly: False
    is_mask: False for input.
	Num. of channels in the input label: 3
Concatenate images:
    ext: jpg
    num_channels: 3
    normalize: True
    computed_on_the_fly: False
    is_mask: False
    pre_aug_ops: None
    post_aug_ops: None for input.
	Num. of channels in the input image: 3
Initialize net_G and net_D weights using type: xavier gain: 0.02
Using random seed 2
net_G parameter count: 91,147,294
net_D parameter count: 5,598,018
Use custom initialization for the generator.
Setup trainer.
Using automatic mixed precision training.
Augmentation policy: 
GAN mode: hinge
Perceptual loss:
	Mode: vgg19
Loss GAN                  Weight 1.0
Loss FeatureMatching      Weight 10.0
Loss Perceptual           Weight 10.0
Loss Flow                 Weight 10.0
Loss Flow_L1              Weight 10.0
Loss Flow_Warp            Weight 10.0
Loss Flow_Mask            Weight 10.0
No checkpoint found.
Epoch 0 ...
Epoch length: 1
------ Now start training 3 frames -------
Traceback (most recent call last):
  File "train.py", line 168, in <module>
    main()
  File "train.py", line 140, in main
    trainer.gen_update(
  File "/home/deepfake/fewshotvid2vid/imaginaire/imaginaire/trainers/vid2vid.py", line 254, in gen_update
    net_G_output = self.net_G(data_t)
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/deepfake/fewshotvid2vid/imaginaire/imaginaire/utils/trainer.py", line 195, in forward
    return self.module(*args, **kwargs)
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/deepfake/fewshotvid2vid/imaginaire/imaginaire/generators/fs_vid2vid.py", line 155, in forward
    self.flow_generation(label, ref_labels, ref_images,
  File "/home/deepfake/fewshotvid2vid/imaginaire/imaginaire/generators/fs_vid2vid.py", line 337, in flow_generation
    ref_image_warp = resample(ref_image, flow_ref)
  File "/home/deepfake/fewshotvid2vid/imaginaire/imaginaire/model_utils/fs_vid2vid.py", line 26, in resample
    final_grid = (grid + flow).permute(0, 2, 3, 1)
RuntimeError: The size of tensor a (910) must match the size of tensor b (912) at non-singleton dimension 3
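
For anyone skimming: the failing line adds a base sampling grid to the predicted flow before permuting, so the two tensors must have identical spatial sizes. The base grid appears to follow the reference image (width 910 here), while the flow map comes out of the flow network (width 912 here); which tensor carries which width is my reading of the traceback, not something I have verified in the code. A toy reproduction of the mismatch (the height 512 is arbitrary):

import torch

# grid is built from the reference image; flow is the network output.
# Adding tensors whose last dimensions differ (910 vs 912) raises the
# exact RuntimeError seen above.
grid = torch.zeros(1, 2, 512, 910)
flow = torch.zeros(1, 2, 512, 912)
final_grid = (grid + flow).permute(0, 2, 3, 1)  # RuntimeError here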

I also tried the command below:
python -m torch.distributed.launch --nproc_per_node=1 train.py --single --config configs/projects/fs_vid2vid/youtube_dancing/test.yaml
and got this:

/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torch.distributed.run.
Note that --use_env is set by default in torch.distributed.run.
If your script expects `--local_rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See 
https://pytorch.org/docs/stable/distributed.html#launch-utility for 
further instructions

  warnings.warn(
[... initialization log identical to the first run, ending in the same traceback and RuntimeError, omitted ...]
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 19530) of binary: /home/deepfake/miniconda3/envs/imaginaire/bin/python
Traceback (most recent call last):
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/site-packages/torch/distributed/run.py", line 689, in run
    elastic_launch(
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 116, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/deepfake/miniconda3/envs/imaginaire/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 244, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
***************************************
            train.py FAILED            
=======================================
Root Cause:
[0]:
  time: 2022-08-02_18:58:18
  rank: 0 (local_rank: 0)
  exitcode: 1 (pid: 19530)
  error_file: <N/A>
  msg: "Process failed with exitcode 1"
=======================================
Other Failures:
  <NO_OTHER_FAILURES>
***************************************

Both runs fail with the same RuntimeError: The size of tensor a (910) must match the size of tensor b (912) at non-singleton dimension 3. The mismatch is on the width axis, and 912 is the nearest multiple of 16 above 910, so I suspect the resized width is not evenly divisible by the network's downsampling stride. I also hit the same problem using the images in 'dataset/unit_test/raw/vid2vid/pose'.
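
In case it is relevant, a workaround I am considering is to pre-pad every frame so height and width are multiples of the network's downsampling stride. Below is a minimal sketch; STRIDE = 32 and the folder names are assumptions for illustration, not values taken from imaginaire, and the cleaner fix may be to set the resize dimensions in test.yaml to stride multiples (I have not confirmed which config keys control this).

import os
from PIL import Image

STRIDE = 32                              # assumed stride, not confirmed
SRC_DIR = "dataset/raw/images"           # hypothetical input folder
DST_DIR = "dataset/raw/images_padded"    # hypothetical output folder

def pad_to_multiple(img, stride):
    """Pad on the right/bottom so both sides are multiples of stride."""
    w, h = img.size
    new_w = -(-w // stride) * stride     # ceiling division
    new_h = -(-h // stride) * stride
    if (new_w, new_h) == (w, h):
        return img
    canvas = Image.new(img.mode, (new_w, new_h))  # black padding
    canvas.paste(img, (0, 0))
    return canvas

os.makedirs(DST_DIR, exist_ok=True)
for name in sorted(os.listdir(SRC_DIR)):
    if name.lower().endswith(".jpg"):
        frame = Image.open(os.path.join(SRC_DIR, name))
        pad_to_multiple(frame, STRIDE).save(os.path.join(DST_DIR, name))

With STRIDE = 32, 1920×1080 frames become 1920×1088; since the padding only extends the bottom/right edges, the OpenPose keypoint coordinates in the JSON files should still line up with the padded images.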
