
ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor' #224

Open
vter00 opened this issue Mar 14, 2024 · 9 comments


@vter00 commented Mar 14, 2024

ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor'

@DFMlaozhu

I encountered the same problem as you. Have you solved it?

@andradeofc

I have the same problem

@arielweinberger

Go to /usr/local/lib/python3.10/dist-packages/basicsr/data/degradations.py and change line 8 to:

from torchvision.transforms.functional import rgb_to_grayscale

Got the solution from AUTOMATIC1111/stable-diffusion-webui#13985
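
If you'd rather have that file work on both old and new torchvision, a version-agnostic variant of the same one-line fix (a sketch, not the verbatim upstream patch) is:

try:
    from torchvision.transforms.functional import rgb_to_grayscale  # current public location
except ImportError:
    from torchvision.transforms.functional_tensor import rgb_to_grayscale  # removed in torchvision 0.17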

@netxor17

Try uninstalling and then reinstalling the latest version of torchvision (0.17.1):
pip uninstall torchvision
pip install torchvision
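
To double-check which versions actually ended up in the environment (plain Python, nothing project-specific):

import torch, torchvision
print(torch.__version__, torchvision.__version__)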

@Matrix-X

Which versions of torch, torchvision, and torchaudio are recommended and run successfully on a Mac M1?

@Matrix-X

> Try uninstalling and then reinstalling the latest version of torchvision (0.17.1): pip install torchvision

Did that, but the error persists:

❯ pip list | grep torch
torch 2.2.1
torchaudio 2.2.1
torchvision 0.17.1
❯ python inference.py --face examples/face/1.mp4 --audio examples/audio/1.wav --outfile results/1_1.mp4
Traceback (most recent call last):
  File "inference.py", line 16, in <module>
    from third_part.GPEN.gpen_face_enhancer import FaceEnhancement
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/third_part/GPEN/gpen_face_enhancer.py", line 11, in <module>
    from utils.inference_utils import Laplacian_Pyramid_Blending_with_mask
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/utils/inference_utils.py", line 5, in <module>
    from models import load_network, load_DNet
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/__init__.py", line 2, in <module>
    from models.DNet import DNet
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/DNet.py", line 10, in <module>
    from models.base_blocks import LayerNorm2d, ADAINHourglass, FineEncoder, FineDecoder
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/base_blocks.py", line 9, in <module>
    from basicsr.archs.arch_util import default_init_weights
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/basicsr/__init__.py", line 4, in <module>
    from .data import *
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/basicsr/data/__init__.py", line 22, in <module>
    _dataset_modules = [importlib.import_module(f'basicsr.data.{file_name}') for file_name in dataset_filenames]
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/basicsr/data/__init__.py", line 22, in <listcomp>
    _dataset_modules = [importlib.import_module(f'basicsr.data.{file_name}') for file_name in dataset_filenames]
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/basicsr/data/realesrgan_dataset.py", line 11, in <module>
    from basicsr.data.degradations import circular_lowpass_kernel, random_mixed_kernels
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/basicsr/data/degradations.py", line 8, in <module>
    from torchvision.transforms.functional_tensor import rgb_to_grayscale
ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor'

@Matrix-X

> Which versions of torch, torchvision, and torchaudio are recommended and run successfully on a Mac M1?

pip install torch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0

❯ python inference.py --face examples/face/1.mp4 --audio examples/audio/1.wav --outfile results/1_1.mp4
[Info] Using cpu for inference.
[Step 0] Number of frames available for inference: 135
[Step 1] Using saved landmarks.
[Step 2] 3DMM Extraction In Video:: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 135/135 [00:05<00:00, 26.32it/s]
using expression center
Load checkpoint from: checkpoints/DNet.pt
Load checkpoint from: checkpoints/LNet.pth
Load checkpoint from: checkpoints/ENet.pth
[Step 3] Using saved stabilized video.
[Step 4] Load audio; Length of mel chunks: 109
[Step 5] Reference Enhancement: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 109/109 [08:27<00:00, 4.65s/it]
landmark Det:: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 109/109 [00:46<00:00, 2.32it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 109/109 [00:00<00:00, 41943.04it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 109/109 [00:00<00:00, 1026.23it/s]
FaceDet:: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 28/28 [01:11<00:00, 2.55s/it]
[Step 6] Lip Synthesis:: 0%| | 0/7 [02:04<?, ?it/s]
Traceback (most recent call last):
  File "inference.py", line 345, in <module>
    main()
  File "inference.py", line 221, in main
    pred, low_res = model(mel_batch, img_batch, reference)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/ENet.py", line 113, in forward
    low_res_img = self.low_res(audio_sequences, LNet_input)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/LNet.py", line 132, in forward
    _outputs = self.decoder(vis_feat, audio_feat)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/LNet.py", line 73, in forward
    out = res_model(out, z)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/base_blocks.py", line 425, in forward
    x = model(x, z)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/base_blocks.py", line 404, in forward
    x_l, x_g = self.conv1((x_l, x_g), z)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/base_blocks.py", line 383, in forward
    x_l, x_g = self.ffc(x)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/ffc.py", line 231, in forward
    out_xg = self.convl2g(x_l) * l2g_gate + self.convg2g(x_g)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/ffc.py", line 157, in forward
    output = self.fu(x)
  File "/opt/anaconda3/envs/video_retalking/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Volumes/AISpace/Workspace/DigitalHuman/video-retalking/models/ffc.py", line 99, in forward
    ffted = torch.fft.rfftn(x, dim=fft_dim, norm=self.fft_norm)
RuntimeError: fft: ATen not compiled with MKL support
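
The MKL failure comes from the torch build itself rather than from video-retalking; the failing call in ffc.py reduces to a one-liner you can test in isolation (a minimal repro sketch, with the tensor shape made up for illustration):

import torch

x = torch.randn(1, 3, 64, 64)
# This mirrors ffc.py line 99; on wheels compiled without MKL FFT support
# (such as torch 1.9.0 on an M1 Mac) it raises
# "RuntimeError: fft: ATen not compiled with MKL support".
torch.fft.rfftn(x, dim=(-2, -1))

If the one-liner fails too, moving to a newer torch build (recent releases do CPU FFT without MKL) is probably the way out, which circles back to the torchvision import problem above.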

@tomy128 commented Mar 31, 2024

This is a bug in basicsr==1.4.2; see this PR for details: https://github.com/XPixelGroup/BasicSR/pull/650/files
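
Until a basicsr release ships that fix, one non-invasive workaround (a sketch on my side, assuming torchvision >= 0.17 and that basicsr only needs rgb_to_grayscale from the removed module) is to register a shim module before basicsr is imported, instead of editing files inside site-packages:

import sys
import types

import torchvision.transforms.functional as F

# Recreate the removed torchvision.transforms.functional_tensor module
# so basicsr's old import path still resolves.
shim = types.ModuleType("torchvision.transforms.functional_tensor")
shim.rgb_to_grayscale = F.rgb_to_grayscale
sys.modules["torchvision.transforms.functional_tensor"] = shim

import basicsr  # the old import inside degradations.py now succeeds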

@vettorazi commented Apr 3, 2024

Wow, that was super sketchy... but changing the 2D thing in both files, changing how the degradations file imports torchvision (from torchvision.transforms.functional import rgb_to_grayscale), and changing requirements.txt worked for me! Basically, what I'm saying is: if the first and second fixes don't work, keep trying! Some hack will fix this thing haha
