dmabuf-wayland wakes up discrete NVIDIA GPU #13668

Closed
Kimiblock opened this issue Mar 8, 2024 · 11 comments · Fixed by #14028

@Kimiblock

Kimiblock commented Mar 8, 2024

Important Information

Provide following Information:

  • mpv version: 0.38.0
  • Linux Distribution and Version: Arch Linux, Rolling
  • Source of the mpv binary: the Arch Linux repo
  • If known which version of mpv introduced the problem: 0.37.0, older version unknown
  • Window Manager and version: KWin 6.0.1, Mutter 46
  • GPU model, driver and version: Intel Alder Lake iGPU + NVIDIA discrete GPU
  • Possible screenshot or video of visual glitches

If you're not using git master or the latest release, update.
Releases are listed here: https://github.com/mpv-player/mpv/releases

Reproduction steps

On a hybrid GPU system, run

mpv --no-config [File] --vo=dmabuf-wayland --hwdec=vaapi --gpu-hwdec-interop=vaapi

and run

watch cat /sys/class/drm/card*/device/power_state

to monitor GPU power state.
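
For scripted checks, the same sysfs attribute can be read programmatically. The following is a minimal sketch, not part of mpv: the helper names `read_power_state` and `is_suspended` are made up for illustration, and which `cardN` entry corresponds to the discrete GPU depends on the system.

```c
#include <stdio.h>
#include <string.h>

/* Read the first line of a power_state file into `out`, stripping the
 * trailing newline. Returns 0 on success, -1 on any failure.
 * Hypothetical helper, named for illustration only. */
int read_power_state(const char *path, char *out, size_t out_len)
{
    if (out_len == 0)
        return -1;

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    char *line = fgets(out, (int)out_len, f);
    fclose(f);
    if (!line)
        return -1;

    out[strcspn(out, "\r\n")] = '\0';
    return 0;
}

/* Returns 1 only if the file could be read and reports D3cold, else 0. */
int is_suspended(const char *path)
{
    char state[32];
    return read_power_state(path, state, sizeof(state)) == 0 &&
           strcmp(state, "D3cold") == 0;
}
```

Calling `is_suspended("/sys/class/drm/card1/device/power_state")` before and after starting mpv would show whether the device left D3cold, assuming `card1` is the discrete GPU on that machine.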

Expected behavior

mpv does not wake up the discrete GPU: the watch command keeps showing D3cold for it.

Actual behavior

mpv wakes up the discrete GPU: watch shows all GPUs at D0, and sudo lsof /dev/nvidia* shows that mpv is using the discrete GPU.

I also tried --drm-device=/dev/dri/renderD129 to point the DRM device at the iGPU, but it doesn't help.

Log file

output.txt

Sample files

Any VP9 / AV1 / HEVC / AVC video

@llyyr
Contributor

llyyr commented Mar 8, 2024

Use --vaapi-device instead; --drm-device only affects the DRM backends. Note that this only affects video decoding: rendering will still be done on the same GPU your compositor started on.

@Kimiblock
Author

Thanks for your help~

I've tried --vaapi-device=/dev/dri/renderD129 but it doesn't seem to work.

(And as I recall NVIDIA doesn't support VAAPI by default?)

@BellaCoola

I can also reproduce the issue on 2 different laptops with Intel + NVIDIA GPU.

@heddxh

heddxh commented Apr 17, 2024

I can also reproduce the issue with an Intel + NVIDIA GPU setup on Arch Linux, KDE Wayland.

@stt

stt commented Apr 20, 2024

I had a similar issue: mpv always took 3+ seconds to launch while it woke up suspended GPUs, even when just listing --help, and it took another 3 seconds to quit if the other GPU had been suspended again in the meantime. I wasn't actually using dmabuf-wayland, so my specific issue was different, but perhaps there's a connection.

strace -c mpv shows:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- -----------------------
 98.54    3.188032        2295      1389       104 openat
  0.41    0.013260           3      3330           read

With strace -o mpv.log -r mpv, the syscalls around the slow one appeared to happen in NVIDIA code, not in mpv (references to /dev/nvidiactl etc.). Google brought me to NVIDIA/egl-wayland#89, which itself was fixed in NVIDIA 550.40.07, but they mention a similar issue affecting Vulkan: just enumerating the devices wakes them up.

Then I tested switching to gpu-api=opengl, which still woke up the other GPUs, but now only when playing a video and not when listing --help etc. (with that change the slowest syscall was a futex).

Then I noticed that setting vo=vdpau in mpv.conf fixed my immediate issue of mpv waking up suspended GPUs, so I stopped debugging any further, but hopefully this helps someone else dealing with similar things.

@Kimiblock
Author

Kimiblock commented Apr 20, 2024 via email

@heddxh

heddxh commented Apr 23, 2024

Btw, it doesn't happen when using nouveau. The log is a little different: https://fars.ee/VtIr

@jrelvas-ipc
Contributor

I can replicate this issue. I believe I've narrowed down exactly when mpv wakes up the nvidia dgpu.

The wake-up happens when mpv is run with the --vo=dmabuf-wayland option.

dmabuf-wayland seems to try loading several hwdec drivers (regardless of whether one was manually specified with the --hwdec option).

One of the decoders it tries to load is cuda, as seen in logs:

[vo/dmabuf-wayland] Loading hwdec driver 'cuda'
[vo/dmabuf-wayland/cuda] CUDA hwdec only works with OpenGL or Vulkan backends.
[vo/dmabuf-wayland] Loading failed.

The cuda driver appears to be what wakes up the nvidia dgpu. The log explaining that the cuda driver only works with opengl/vulkan backends is only logged a few seconds after, precisely when the nvidia gpu stops being suspended.

There are two possible fixes here:

  • Do not let the dmabuf-wayland vo load hwdec drivers which it doesn't support (including cuda).
  • Do not query cuda/gpus in the cuda hwdec driver before checking the backend.

This is also why @heddxh could not replicate the issue with Nouveau. CUDA isn't supported by Nouveau, so whatever things the cuda hwdec driver is calling never wake up the dgpu.

@jrelvas-ipc
Contributor

The gpu is awoken when cuda_hwdec calls cuInit.

@jrelvas-ipc
Contributor

I've made a patch which fixes this. Submitting soon.

jrelvas-ipc added a commit to jrelvas-ipc/mpv that referenced this issue Apr 30, 2024
`cuInit` wakes up the nvidia dgpu on nvidia laptops. This is bad news because the wake up process
is blocking and takes a few seconds. It also needlessly increases power consumption.

Sometimes, a VO loads several hwdecs (like `dmabuf_wayland`). When `cuda` is loaded, it calls
`cuInit` before running all interop inits. However, the first checks in the interops do not
require cuda initialization, so we only need to call `cuInit` after those checks.

`cuInit` is handled by the new `cuda_priv_init` function. It ensures `cuInit` is only called once.

With these changes, there's no cuda initialization if no OpenGL/Vulkan backend is available. This prevents
`dmabuf_wayland` and other VOs which automatically load cuda from waking up the nvidia dgpu unnecessarily,
making them start faster and decreasing power consumption on laptops.

Fixes: mpv-player#13668

Signed-off-by: Jrelvas <[email protected]>
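
The "only called once" guard described in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: `fake_cu_init_once`, `struct cuda_priv_sketch` and `cuda_priv_init_sketch` are hypothetical names, and the stand-in function only counts calls instead of calling the real `cuInit`.

```c
#include <stdbool.h>

/* Counts how often the expensive call was made. Stand-in for the real
 * driver initialization, which is what wakes up the dGPU. */
int fake_driver_init_calls = 0;

int fake_cu_init_once(void)
{
    fake_driver_init_calls++;
    return 0; /* 0 means success in this sketch */
}

/* Hypothetical per-hwdec state; the real mpv struct is not shown here. */
struct cuda_priv_sketch {
    bool initialized;
};

/* Initialize lazily: repeated calls after a success are no-ops. */
int cuda_priv_init_sketch(struct cuda_priv_sketch *p)
{
    if (p->initialized)
        return 0;

    int ret = fake_cu_init_once();
    if (ret != 0)
        return ret; /* leave `initialized` false so a retry is possible */

    p->initialized = true;
    return 0;
}
```

With a guard like this, each interop can request initialization after its own cheap checks have passed, and the expensive call still happens at most once per successful setup.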
@jrelvas-ipc
Contributor

jrelvas-ipc commented Apr 30, 2024

I've done a little testing and this appears to fix the issue. #14028

jrelvas-ipc added a commit to jrelvas-ipc/mpv that referenced this issue Apr 30, 2024
`cuInit` wakes up the nvidia dgpu on nvidia laptops. This is bad news because the wake up process
is blocking and takes a few seconds. It also needlessly increases power consumption.

Sometimes, a VO loads several hwdecs (like `dmabuf_wayland`). When `cuda` is loaded, it calls
`cuInit` before running all interop inits. However, the first checks in the interops do not
require cuda initialization, so we only need to call `cuInit` after those checks.

This commit splits the interop `init` function into `check` and `init`. `check` can be called without
initializing the Cuda backend, so cuInit is only called *after* the first interop check.

With these changes, there's no cuda initialization if no OpenGL/Vulkan backend is available. This prevents
`dmabuf_wayland` and other VOs which automatically load cuda from waking up the nvidia dgpu unnecessarily,
making them start faster and decreasing power consumption on laptops.

Fixes: mpv-player#13668
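
The `check`/`init` split described in this commit message can be sketched as below. This is a simplified model under assumptions, not mpv's code: the `backend_kind` enum, `struct interop_sketch`, `wake_gpu` and `hwdec_init_sketch` are hypothetical names, and `wake_gpu` merely counts calls where the real code would call `cuInit`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Which rendering backend the VO provides; BACKEND_NONE models a VO such
 * as dmabuf-wayland that offers neither OpenGL nor Vulkan. */
enum backend_kind { BACKEND_NONE, BACKEND_OPENGL, BACKEND_VULKAN };

struct interop_sketch {
    const char *name;
    bool (*check)(enum backend_kind b); /* cheap, needs no CUDA state */
    bool (*init)(void);                 /* assumes CUDA is initialized */
};

static bool gl_check(enum backend_kind b) { return b == BACKEND_OPENGL; }
static bool vk_check(enum backend_kind b) { return b == BACKEND_VULKAN; }
static bool interop_init_ok(void) { return true; }

static const struct interop_sketch interops[] = {
    { "gl", gl_check, interop_init_ok },
    { "vk", vk_check, interop_init_ok },
};

/* Stand-in for the expensive initialization that wakes the dGPU. */
int wake_gpu_calls = 0;
static bool wake_gpu(void) { wake_gpu_calls++; return true; }

/* Returns the name of the interop that was set up, or NULL if none fits.
 * The expensive call is made only after some interop's check passed. */
const char *hwdec_init_sketch(enum backend_kind b)
{
    for (size_t i = 0; i < sizeof(interops) / sizeof(interops[0]); i++) {
        if (!interops[i].check(b))
            continue;
        if (!wake_gpu())
            return NULL;
        return interops[i].init() ? interops[i].name : NULL;
    }
    return NULL;
}
```

In this model, `hwdec_init_sketch(BACKEND_NONE)` returns NULL without ever calling `wake_gpu`, which corresponds to the behavior the commit describes for VOs without an OpenGL/Vulkan backend.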
kasper93 pushed a commit that referenced this issue May 6, 2024
`cuInit` wakes up the nvidia dgpu on nvidia laptops. This is bad news because the wake up process
is blocking and takes a few seconds. It also needlessly increases power consumption.

Sometimes, a VO loads several hwdecs (like `dmabuf_wayland`). When `cuda` is loaded, it calls
`cuInit` before running all interop inits. However, the first checks in the interops do not
require cuda initialization, so we only need to call `cuInit` after those checks.

This commit splits the interop `init` function into `check` and `init`. `check` can be called without
initializing the Cuda backend, so cuInit is only called *after* the first interop check.

With these changes, there's no cuda initialization if no OpenGL/Vulkan backend is available. This prevents
`dmabuf_wayland` and other VOs which automatically load cuda from waking up the nvidia dgpu unnecessarily,
making them start faster and decreasing power consumption on laptops.

Fixes: #13668