
Virtual DSM in a Docker container inside an unprivileged LXC container #382

Open
databreach opened this issue Nov 16, 2023 · 33 comments

@databreach
Contributor

I wanted to share insights into why "virtual-dsm" encounters challenges running within an unprivileged Proxmox LXC container by default and how the provided script addresses these issues.

Core Challenges:

  1. Device Access: The default setup assumes access to certain device nodes, a luxury not available (by default) in the constrained environment of an unprivileged LXC container.
  2. mknod Operations: "virtual-dsm" relies on mknod operations during its setup process, which is problematic within the restricted context of an unprivileged LXC.
  3. LXC Configuration: Certain LXC configurations need adjustment to accommodate the specific requirements of "virtual-dsm."

Script Solutions:

  1. Device Configuration: The script creates essential device nodes within the LXC container, overcoming device access limitations.
  2. mknod Operations Bypass: The script modifies the "virtual-dsm" code to circumvent mknod errors, ensuring a smoother setup within the unprivileged LXC environment.
  3. LXC Configuration Adjustments: The script fine-tunes the LXC container configuration to align with the expectations of "virtual-dsm."

Executing the script below on your Proxmox host (not within the LXC container) prepares an unprivileged LXC on Proxmox so that "virtual-dsm" runs smoothly inside it.

bash -c "$(wget -qLO - https://raw.githubusercontent.com/databreach/virtual-dsm-lxc/main/virtual-dsm-lxc.sh)"

Feel free to delve into the script for a detailed understanding, and don't hesitate to share your insights or report any observations.

Best,
databreach

@kroese
Collaborator

kroese commented Nov 16, 2023

Wow, thank you very much! This will be helpful for a lot of LXC users, and I will include a link in the readme file.

But it would be even better to also modify the container itself so that it produces as few errors as possible when running under LXC, even without special preparations like the ones your script performs.

You said that you received errors regarding mknod, but as far as I remember there is not a single mknod in my code that is really required. Their only purpose is for when users forget to include a device (like /dev/net/tun and others) in their compose file, so that it will be created automatically via mknod. They only act as a fail-safe.

If you let me know which errors you received when running under LXC, I will take a look at them and see if we can handle them differently. Because I have a suspicion that most of them can be solved in the compose-file without the need for a preparation script.
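For illustration, the kind of fail-safe described above might look like this (a sketch, not the repository's actual code; the major/minor numbers are the standard Linux ones):

# create a device node only when the user forgot to pass it through
[ -c /dev/net/tun ] || { mkdir -p /dev/net && mknod /dev/net/tun c 10 200; }
[ -c /dev/vhost-net ] || mknod /dev/vhost-net c 10 238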

@databreach
Contributor Author

It would be highly advantageous if you could devise a method to ensure seamless execution of "virtual-dsm" within an unprivileged LXC container, requiring no specific preconditions. The errors encountered when running under an unprivileged LXC container are as follows:

Reproducible Steps:

With a fresh installation of a Debian 12 Standard LXC container and default installation of Docker:
docker: Error response from daemon: error gathering device information while adding custom device "/dev/net/tun": no such file or directory.

After adding lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file 0 0 to the LXC configuration:
docker: Error response from daemon: error gathering device information while adding custom device "/dev/kvm": no such file or directory.

After adding lxc.mount.entry: /dev/kvm dev/kvm none bind,create=file 0 0 to the LXC configuration:
docker: Error response from daemon: error gathering device information while adding custom device "/dev/vhost-net": no such file or directory.

After adding lxc.mount.entry: /dev/vhost-net dev/vhost-net none bind,create=file 0 0 to the LXC configuration:

Install: Downloading installer...
cpio: dev/net/tun: Cannot mknod: Operation not permitted
36797 blocks
ERROR: Failed to cpio /storage/dsm.rd, reason 2

After changing the cpio -idm <"$TMP/rd" code in install.sh to ignore mknod errors:

Install: Extracting system partition...
tar: dev/net/tun: Cannot mknod: Operation not permitted

After changing the tar xpfJ "$HDA.txz" code in install.sh to ignore mknod errors:

Install: Installing system partition...
ERROR: KVM acceleration not detected (no write access), see the FAQ about this.

After running chown 100000:100000 on /dev/net/tun, /dev/kvm, and /dev/vhost-net on the Proxmox host:
Success!

Note: The script does not reuse the Proxmox host's own device nodes. Instead, new device nodes are created (using mknod) to protect the Proxmox host. The "virtual-dsm" device nodes can be found on the Proxmox host in the /dev-<CT ID> folder, and these device nodes are also referenced in the lxc.mount.entry lines of the LXC configuration.
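In outline, the preparation on the Proxmox host amounts to something like the following (a sketch assuming container ID 101; see the actual script for the authoritative version):

CTID=101                                # LXC container ID (example)
mkdir -p /dev-$CTID/net
# create dedicated device nodes instead of exposing the host's own
mknod /dev-$CTID/net/tun c 10 200
mknod /dev-$CTID/kvm c 10 232
mknod /dev-$CTID/vhost-net c 10 238
# hand ownership to the unprivileged container's mapped root (uid/gid 100000)
chown -R 100000:100000 /dev-$CTID
# bind the nodes into the container
cat >> /etc/pve/lxc/$CTID.conf <<EOF
lxc.mount.entry: /dev-$CTID/net/tun dev/net/tun none bind,create=file 0 0
lxc.mount.entry: /dev-$CTID/kvm dev/kvm none bind,create=file 0 0
lxc.mount.entry: /dev-$CTID/vhost-net dev/vhost-net none bind,create=file 0 0
EOF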

Docker Commands:

docker run -it --rm -p 5000:5000 --cap-add NET_ADMIN --device-cgroup-rule='c *:* rwm' --device /dev/net/tun --device /dev/kvm --device /dev/vhost-net --stop-timeout 60 -e ALLOCATE=N virtual-dsm:latest

version: "3" services: vdsm: container_name: vdsm image: virtual-dsm:latest environment: CPU_CORES: "2" DISK_SIZE: "6G" RAM_SIZE: "4096M" ALLOCATE: "N" devices: - /dev/kvm - /dev/vhost-net - /dev/net/tun device_cgroup_rules: - 'c *:* rwm' cap_add: - NET_ADMIN ports: - 5000:5000 volumes: - /storage:/storage restart: on-failure stop_grace_period: 1m tty: true

@kroese
Collaborator

kroese commented Nov 16, 2023

Thanks, I will look into these ASAP.

Right now you are patching the source code with fixes, but if those parts of the code in this repository change in the future, those patches will stop working.

So it would be much better if you submitted those patches as pull requests (provided they cause no problems for non-LXC users), so that they become available for everyone.

@kroese
Collaborator

kroese commented Nov 17, 2023

I tried to implement your patches, but I was getting all kinds of warnings from shellcheck. For example, that $errors is declared inside the subshell (), so its value cannot be read outside the subshell a few lines further down. Also, I cannot read the exit codes anymore when the commands fail, because the pipeline takes the exit code from grep, etc.

I can see from your script that you are a much better Bash coder than I am, as the quality is miles ahead. But it's difficult for me to accept the patches if I don't fully understand how they work with all the pipes to grep. For example, the code checks whether the output contains mknod and continues in that case. But I'm worried that if there are more errors than just mknod alone, it will also continue. Maybe your code takes care of that, but since I'm such a novice at Bash it's hard for me to tell.

So my next attempt was to solve it differently. For example, with the tar command it's possible to add a parameter, --exclude='/dev/net/tun', that also prevents the error without having to use grep at all. My hope was that DSM would re-create these missing devices automatically on first boot, but unfortunately it only did that for /dev/console and not for /dev/net/tun. And I'm not sure what the implications are of this device missing in DSM; it could break some Synology packages that require it.

So I was unable to find a way to implement your fixes while still being sure that they don't cause any negative side-effects for people that do not need them.
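For what it's worth, the pattern under discussion could be written so that only mknod failures are tolerated, and any other error still aborts (a sketch, not the actual patch):

# capture stderr while the extraction itself proceeds
errors=$(tar xpfJ "$HDA.txz" 2>&1 >/dev/null) || true
# keep only lines that are not mknod permission errors
leftover=$(printf '%s\n' "$errors" | grep -v 'Cannot mknod' | grep -v '^$' || true)
if [ -n "$leftover" ]; then
  echo "$leftover" >&2
  exit 1
fi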

kroese added a commit that referenced this issue Nov 17, 2023
feat: Control device nodes #382
@databreach
Contributor Author

databreach commented Nov 17, 2023

The initial script was a quick fix, altering the source code to enable "virtual-dsm" to operate within an unprivileged LXC container. However, a more refined solution has been developed, eliminating the need for such workarounds and ensuring compatibility with diverse use cases. A pull request for this improvement has been submitted.

Enhancements:

The pull request introduces a crucial enhancement: the ability to control the creation of device nodes. This feature empowers users to run "virtual-dsm" seamlessly within a Docker container inside an unprivileged LXC container, all without the necessity of modifying the source code.

Usage Note:

If users prefer or need to create the device nodes manually (e.g., in the case of an unprivileged LXC container), they can utilize the Docker environment variable DEV="N". This flexibility ensures a more versatile application.
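A minimal invocation using this flag might look like the following (a sketch, using the image name from earlier in this thread):

docker run -it --rm -p 5000:5000 --cap-add NET_ADMIN --device /dev/net/tun --device /dev/kvm --device /dev/vhost-net -e DEV=N virtual-dsm:latest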

Device Configuration:

Please note that for successful execution, the tun, kvm, and vhost-net devices must be established on the Proxmox host. These devices should be granted LXC ownership and subsequently added as mount entries to their respective LXC containers. The initial script has been updated to simplify and streamline this process.

bash -c "$(wget -qLO - https://raw.githubusercontent.com/databreach/virtual-dsm-lxc/main/virtual-dsm-lxc.sh)"

@kroese
Collaborator

kroese commented Nov 17, 2023

Thanks! I merged this improved version; it looks good! But it did not include any workaround for the /dev/tap devices in network.sh like it did previously. Without this, DHCP mode will not work?

@databreach
Contributor Author

Thank you for merging the improved version, and I appreciate your keen observation. You're correct; the workaround for /dev/tap devices in network.sh was not included in this pull request. After extensive testing, it became apparent that the DHCP mode is currently non-functional inside an unprivileged LXC container.

Even with the previously employed workaround, I encounter a persistent issue marked by the following error message:
char device redirected to /dev/pts/0 (label charserial0)
qemu-system-x86_64: -netdev tap,id=hostnet0,vhost=on,vhostfd=40,fd=30: Unable to query TUNGETIFF on FD 30: Inappropriate ioctl for device

Regrettably, this indicates that DHCP mode (macvtap) remains non-operational for the time being when the application runs within an unprivileged LXC container. I will try to identify and resolve the issue, but your expertise on the QEMU front would be invaluable in pinpointing the root cause. If you have any insights or suggestions, they would be greatly appreciated.

@kroese
Collaborator

kroese commented Nov 19, 2023

Yes, that is an annoying limitation of QEMU: it communicates with those devices via a file descriptor instead of by device name. So instead of specifying that you want to use /dev/tap2, you must open a handle to the file and pass that handle. If unprivileged LXC does not support the ioctl commands needed to use the handle, there is not much we can do about that, I'm afraid.

However, this macvtap device is only needed to pass through the network adapter to make DHCP possible. You can still use macvlan via the tuntap bridge device, so it's still possible to give the container its own separate IP address. It's just that the traffic will be tunneled via NAT instead of the more direct approach that macvtap offers.
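To illustrate the file-descriptor mechanism (a simplified sketch; the real code also issues a TUNGETIFF ioctl on the descriptor, which is exactly the step that fails in the error above):

# open a handle to the macvtap device and pass the numeric fd to QEMU
exec 30<>/dev/tap2
qemu-system-x86_64 -netdev tap,id=hostnet0,fd=30 -device virtio-net-pci,netdev=hostnet0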

@BliXem1

BliXem1 commented Dec 5, 2023

Hi,

Sorry for the noob question here, but what is the right process to make this work?

1: Create an LXC container with Debian 12 + Docker installed.
2: On the Proxmox node run this: bash -c "$(wget -qLO - https://raw.githubusercontent.com/databreach/virtual-dsm-lxc/main/virtual-dsm-lxc.sh)"
3: Install virtual-dsm via Docker, add DEV="N", and then run it within the LXC container?

Again, sorry for the noob question.

kroese added a commit that referenced this issue Dec 6, 2023
feat: Control device nodes #382
@databreach
Contributor Author

databreach commented Dec 6, 2023

Usage

  1. Create an LXC container in Proxmox with your preferred distribution (e.g. the Debian 12 Standard template).
  2. On the Proxmox node/host execute the virtual-dsm-lxc script (requires root privileges).
  3. Install Docker and run virtual-dsm within the LXC container.

Example

In this example we will set up virtual-dsm using multiple storage locations. The LXC disk will be used for installing the Debian OS and storing the virtual-dsm Docker image. The first mount point will be used for installing virtual-dsm and uses the fast storage pool named nvme. The second mount point will be used for storing data within virtual-dsm and uses the slow storage pool named sata. Change the storage pools and disk sizes based on your own environment and preferences.

1. Create LXC container using Proxmox UI

  • General: Ensure Unprivileged container and nesting are enabled.
  • Template: Select debian-12-standard_12.2-1_amd64.tar.zst.
  • Disks: Allocate 8 of Disk size (GiB) for rootfs.
  • Disks: Press Add and allocate at least 16 of Disk size (GiB). In Path set /vdsm/storage1.
  • Disks: Press Add again and allocate 2048 of Disk size (GiB). In Path set /vdsm/storage2.
  • CPU: Allocate 2 Cores.
  • Memory: Allocate 4096 of Memory (MiB) and 512 of Swap (MiB).
  • Network: Set a static IPv4 address, or use DHCP and let the router assign an IP address to the LXC container.
  • DNS: Leave as is or adjust to your requirements.
  • Confirm: Ensure Start after created is disabled. Press Finish.

2. Execute the virtual-dsm-lxc script

  • Access the shell of the Proxmox node/host via the Proxmox UI or SSH. Ensure you have root privileges.
  • Execute bash -c "$(wget -qLO - https://raw.githubusercontent.com/databreach/virtual-dsm-lxc/main/virtual-dsm-lxc.sh)"
  • Press Y to continue.
  • Enter the LXC Container ID of the container created above (e.g. 101).

3. Install Docker and run virtual-dsm

  • Start the LXC Container created above and login to its console via the Proxmox UI or SSH.
  • Install Docker (e.g. Install Docker Engine on Debian)
  • Run virtual-dsm in Docker using the mount points created: docker run -it --rm -p 5000:5000 --cap-add NET_ADMIN --device-cgroup-rule='c *:* rwm' --sysctl net.ipv4.ip_forward=1 --device /dev/net/tun --device /dev/kvm --device /dev/vhost-net --stop-timeout 60 -v /vdsm/storage1:/storage -v /vdsm/storage2:/storage2 -e CPU_CORES=2 -e RAM_SIZE=4096M -e DISK_SIZE=16G -e DISK2_SIZE=2T -e DISK_FMT=qcow2 -e ALLOCATE=N vdsm/virtual-dsm:latest

4. Verify installation of virtual-dsm

  • Access virtual-dsm via the Internet browser of your choice: <IP address of LXC container>:5000.
  • Complete the DSM on-screen configuration.
  • Open Storage Manager from the main menu.
  • There should be a volume 1 (reported as ~15.3 GB) and a volume 2 (reported as ~1.9 TB).

Edits

  • Added --sysctl net.ipv4.ip_forward=1 to support IP forwarding.
  • Replaced -e ALLOCATE=N with new disk feature -e DISK_FMT=qcow2.
  • Re-added -e ALLOCATE=N to be used in combination with qcow2.
  • Removed obsolete DEV=N parameter.

@BliXem1

BliXem1 commented Dec 6, 2023

Wow! Thanks! Great doc! :-) I enabled GPU passthrough and added the right entries to the conf within Proxmox to enable passthrough. But I'm running into an error: chmod 666 on card128 can't be executed because of some permissions. Any fix for that?

I think this will fix that issue: --device-cgroup-rule='c *:* rwm', but I haven't tried that yet.

@BliXem1

BliXem1 commented Dec 6, 2023

If I follow your step by step instructions, I get this message:

sysctl: permission denied on key "net.ipv4.ip_forward"
❯ ERROR: Please add the following docker setting to your container: --sysctl net.ipv4.ip_forward=1

The above is fixed, but I still get the other error:

root@Synology-Docker:~# docker run -it --rm -p 5000:5000 --cap-add=NET_ADMIN --device-cgroup-rule='c *:* rwm' --device /dev/net/tun --device /dev/kvm --device /dev/vhost-net --device /dev/dri --sysctl net.ipv4.ip_forward=1 --stop-timeout 60 -v /vdsm/storage1:/storage -v /vdsm/storage2:/storage2 -e CPU_CORES=3 -e RAM_SIZE=4096M -e DISK_SIZE=25G -e DISK2_SIZE=1125G -e ALLOCATE=N -e DEV=N -e GPU=Y vdsm/virtual-dsm:latest
❯ Starting Virtual DSM for Docker v4.40...
❯ For support visit https://github.com/vdsm/virtual-dsm/
chmod: changing permissions of '/dev/dri/card0': Operation not permitted
❯ ERROR: Status 1 while: chmod 666 /dev/dri/card0 (line 18/13)

Extra info on the Proxmox server:

features: nesting=1
hostname: Synology-Docker
memory: 8192
mp0: local-lvm:vm-105-disk-1,mp=/vdsm/storage1,backup=1,size=25G
mp1: local-lvm:vm-105-disk-2,mp=/vdsm/storage2,backup=1,size=1125G
net0: name=eth0,bridge=vmbr0,gw=192.168.0.1,hwaddr=BC:24:11:4B:11:42,ip=192.168.0.90/24,type=veth
ostype: debian
rootfs: local-lvm:vm-105-disk-0,size=8G
swap: 0
unprivileged: 1
lxc.mount.entry: /dev-105/net/tun dev/net/tun none bind,create=file 0 0
lxc.mount.entry: /dev-105/kvm dev/kvm none bind,create=file 0 0
lxc.mount.entry: /dev-105/vhost-net dev/vhost-net none bind,create=file 0 0
lxc.cgroup2.devices.allow: c 226:0 rwm
lxc.cgroup2.devices.allow: c 226:128 rwm
lxc.cgroup2.devices.allow: c 29:0 rwm
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.cgroup2.devices.allow: c 120:0 rwm

@databreach
Contributor Author

databreach commented Dec 6, 2023

If I follow your step by step instructions, I get this message:

sysctl: permission denied on key "net.ipv4.ip_forward"
❯ ERROR: Please add the following docker setting to your container: --sysctl net.ipv4.ip_forward=1

Thank you for reporting this. I missed that one, as I have IP forwarding enabled on my test Proxmox host. The guide has been updated to include the IP forwarding argument.

@BliXem1

BliXem1 commented Dec 6, 2023

If I follow your step by step instructions, I get this message:

sysctl: permission denied on key "net.ipv4.ip_forward"
❯ ERROR: Please add the following docker setting to your container: --sysctl net.ipv4.ip_forward=1

Thank you for reporting this. I missed that one, as I have IP forwarding enabled on my test Proxmox host.

No problem. I hope you can also help me with this?

root@Synology-Docker:~# docker run -it --rm -p 5000:5000 --cap-add=NET_ADMIN --device-cgroup-rule='c *:* rwm' --device /dev/net/tun --device /dev/kvm --device /dev/vhost-net --device /dev/dri --sysctl net.ipv4.ip_forward=1 --stop-timeout 60 -v /vdsm/storage1:/storage -v /vdsm/storage2:/storage2 -e CPU_CORES=3 -e RAM_SIZE=4096M -e DISK_SIZE=25G -e DISK2_SIZE=1125G -e ALLOCATE=N -e DEV=N -e GPU=Y vdsm/virtual-dsm:latest
❯ Starting Virtual DSM for Docker v4.40...
❯ For support visit https://github.com/vdsm/virtual-dsm/
chmod: changing permissions of '/dev/dri/card0': Operation not permitted
❯ ERROR: Status 1 while: chmod 666 /dev/dri/card0 (line 18/13)

Extra info:

features: nesting=1
hostname: Synology-Docker
memory: 8192
mp0: local-lvm:vm-105-disk-1,mp=/vdsm/storage1,backup=1,size=25G
mp1: local-lvm:vm-105-disk-2,mp=/vdsm/storage2,backup=1,size=1125G
net0: name=eth0,bridge=vmbr0,gw=192.168.0.1,hwaddr=BC:24:11:4B:11:42,ip=192.168.0.90/24,type=veth
ostype: debian
rootfs: local-lvm:vm-105-disk-0,size=8G
swap: 0
unprivileged: 1
lxc.mount.entry: /dev-105/net/tun dev/net/tun none bind,create=file 0 0
lxc.mount.entry: /dev-105/kvm dev/kvm none bind,create=file 0 0
lxc.mount.entry: /dev-105/vhost-net dev/vhost-net none bind,create=file 0 0
lxc.cgroup2.devices.allow: c 226:0 rwm
lxc.cgroup2.devices.allow: c 226:128 rwm
lxc.cgroup2.devices.allow: c 29:0 rwm
lxc.mount.entry: /dev/fb0 dev/fb0 none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir
lxc.mount.entry: /dev/dri/renderD128 dev/dri/renderD128 none bind,optional,create=file
lxc.cgroup2.devices.allow: c 120:0 rwm

vainfo gives me this:

libva info: VA-API version 1.17.0
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_17
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.17 (libva 2.12.0)
vainfo: Driver version: Intel iHD driver for Intel(R) Gen Graphics - 23.1.1 ()
vainfo: Supported profile and entrypoints

@databreach
Contributor Author

databreach commented Dec 6, 2023

Above fixed but still the other error:

root@Synology-Docker:~# docker run -it --rm -p 5000:5000 --cap-add=NET_ADMIN --device-cgroup-rule='c *:* rwm' --device /dev/net/tun --device /dev/kvm --device /dev/vhost-net --device /dev/dri --sysctl net.ipv4.ip_forward=1 --stop-timeout 60 -v /vdsm/storage1:/storage -v /vdsm/storage2:/storage2 -e CPU_CORES=3 -e RAM_SIZE=4096M -e DISK_SIZE=25G -e DISK2_SIZE=1125G -e ALLOCATE=N -e DEV=N -e GPU=Y vdsm/virtual-dsm:latest
❯ Starting Virtual DSM for Docker v4.40...
❯ For support visit https://github.com/vdsm/virtual-dsm/
chmod: changing permissions of '/dev/dri/card0': Operation not permitted
❯ ERROR: Status 1 while: chmod 666 /dev/dri/card0 (line 18/13)

I do not use GPU / vGPU passthrough on LXC containers. However, I have created a modified version of the script which should make the card and renderD accessible in the LXC container. This should fix your permission errors.

Important: GPU / vGPU passthrough on LXC containers requires configuration on both the Proxmox host and the LXC container. Success depends on settings, drivers, and GPU compatibility.

The following steps differ from the guide above:

2. Execute the virtual-dsm-lxc script

  • Access the shell of the Proxmox node/host via the Proxmox UI or SSH. Ensure you have root privileges.
  • Execute bash -c "$(wget -qLO - https://raw.githubusercontent.com/databreach/virtual-dsm-lxc/main/virtual-dsm-lxc-gpu.sh)"
  • Press Y to continue.
  • Enter the LXC Container ID of the container created above (e.g. 101).
  • Enter the GPU / vGPU card ID (e.g. for card1 you would enter 1).
  • Enter the GPU / vGPU renderD ID (e.g. for renderD129 you would enter 129).
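Presumably, the GPU variant of the script then adds bind-mount entries along these lines to the LXC configuration (shown here for container ID 105 with card0/renderD128, matching the config dump later in this thread):

lxc.mount.entry: /dev-105/dri/card0 dev/dri/card0 none bind,create=file 0 0
lxc.mount.entry: /dev-105/dri/renderD128 dev/dri/renderD128 none bind,create=file 0 0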

3. Install Docker and run virtual-dsm

  • Start the LXC Container created above and login to its console via the Proxmox UI or SSH.
  • Install Docker (e.g. Install Docker Engine on Debian)
  • Run virtual-dsm in docker using the mount points created: docker run -it --rm -p 5000:5000 --cap-add NET_ADMIN --device-cgroup-rule='c *:* rwm' --sysctl net.ipv4.ip_forward=1 --device /dev/net/tun --device /dev/kvm --device /dev/vhost-net --device /dev/dri --stop-timeout 60 -v /vdsm/storage1:/storage -v /vdsm/storage2:/storage2 -e CPU_CORES=2 -e RAM_SIZE=4096M -e DISK_SIZE=16G -e DISK2_SIZE=2T -e DISK_FMT=qcow2 -e ALLOCATE=N -e GPU=Y vdsm/virtual-dsm:latest

Edits

  • Replaced -e ALLOCATE=N with new disk feature -e DISK_FMT=qcow2.
  • Re-added -e ALLOCATE=N to be used in combination with qcow2.
  • Removed obsolete DEV=N parameter.

@BliXem1

BliXem1 commented Dec 6, 2023

Yes, that worked! Thanks for your great work!

@BliXem1

BliXem1 commented Dec 11, 2023

GPU passthrough doesn't work for me, because I can't find /dev/dri in the Synology container.

GPU=Y is set and the GPU is passed through via Proxmox. If you know a way to fix this, let me know.

arch: amd64
cores: 4
features: nesting=1
hostname: Synology-Docker
memory: 8192
mp0: local-lvm:vm-105-disk-1,mp=/vdsm/storage1,backup=1,size=25G
mp1: local-lvm:vm-105-disk-2,mp=/vdsm/storage2,backup=1,size=1125G
net0: name=eth0,bridge=vmbr0,gw=192.168.0.1,hwaddr=BC:24:11:4B:11:42,ip=192.168.0.90/24,type=veth
ostype: debian
rootfs: local-lvm:vm-105-disk-0,size=8G
swap: 0
unprivileged: 1
lxc.mount.entry: /dev-105/net/tun dev/net/tun none bind,create=file 0 0
lxc.mount.entry: /dev-105/kvm dev/kvm none bind,create=file 0 0
lxc.mount.entry: /dev-105/vhost-net dev/vhost-net none bind,create=file 0 0
lxc.mount.entry: /dev-105/dri/card0 dev/dri/card0 none bind,create=file 0 0
lxc.mount.entry: /dev-105/dri/renderD128 dev/dri/renderD128 none bind,create=file 0 0

@kroese
Collaborator

kroese commented Dec 11, 2023

@BliXem1 Do you mean you don't have /dev/dri inside DSM? That is normal. It doesn't mean that it doesn't work. If you list the devices with lspci or something, you will see that you have a GPU device available and it will be used by DSM. If not, see #234 or #334.

If instead you mean that you have no /dev/dri on the host, then you should ask on the Proxmox forums.
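For example, from a shell inside DSM (assuming lspci is available there):

lspci | grep -i vga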

@BliXem1

BliXem1 commented Dec 11, 2023

Alright, got the card within the container:

00:02.0 VGA compatible controller: Intel Corporation Alder Lake-P GT1 [UHD Graphics] (rev 0c)

Then what I had in mind isn't possible. I need /dev/dri for Jellyfin within Docker. But yeah, not possible :)

@kroese
Collaborator

kroese commented Dec 11, 2023

Maybe it's possible by installing some extra module or drivers inside DSM that create /dev/dri; I am not sure.

But GPU=Y only makes sure that DSM can see and access the card. It has no influence over what DSM does with it after that.

@databreach
Contributor Author

Alright, got the card within the container:

00:02.0 VGA compatible controller: Intel Corporation Alder Lake-P GT1 [UHD Graphics] (rev 0c)

Then it's not possible what I thought. I need /dev/dri for Jellyfin within docker. But yeah, not possible :)

To rule out a non-working GPU inside DSM, you could check if face recognition and/or video thumbnails are working.

@kroese
Collaborator

kroese commented Dec 23, 2023

@databreach In your edits you wrote:

Replaced -e ALLOCATE=N with new disk feature -e DISK_FMT=qcow2

and that worked a couple of versions ago, because the allocation flag did not apply to qcow2.

But in recent versions allocation support for qcow2 was added, so now you also need to set -e ALLOCATE=N or you will get a preallocated qcow2 image.

@databreach
Contributor Author

The guides have been updated. Thank you!

@Stan-Gobien

However, this macvtap device is only needed to pass through the network adapter to make DHCP possible. You can still use macvlan via the tuntap bridge device, so it's still possible to give the container its own separate IP address. It's just that the traffic will be tunneled via NAT instead of the more direct approach that macvtap offers.

If I want to use macvlan, how would I create the network using tuntap?

Currently I create the network on the LXC debian 12 with:
docker network create -d macvlan --subnet=192.168.1.0/24 --gateway=192.168.1.254 -o parent=eth0 macvlan1

In my compose I have

version: "3"
services:
    virtualdsm02:
        container_name: virtualdsm02
        image: vdsm/virtual-dsm:latest
        environment:
            DISK_SIZE: "16G"
            CPU_CORES: "2"
            RAM_SIZE: "1024M" 
            ALLOCATE: "N"
            DISK_FMT: "qcow2"   
            DHCP: "Y"
        device_cgroup_rules:
            - 'c *:* rwm'
        sysctls:
            - net.ipv4.ip_forward=1             
        devices:
            - /dev/kvm
            - /dev/vhost-net
            - /dev/net/tun

        networks:
          macvlan1:
        cap_add:
            - NET_ADMIN                       
        ports:
            - 5000:5000
            - 5001:5001
        volumes:
            -  /virtualdsm02/storage1:/storage
        restart: on-failure
        stop_grace_period: 2m
        labels:
           - "com.centurylinklabs.watchtower.enable=true"               
networks:
    macvlan1:
        external: true        

This currently fails with the mknod /dev/tap2 error.

@kroese
Collaborator

kroese commented Jan 11, 2024

@Stan-Gobien That is because you have set DHCP=Y. You should set it to DHCP=N if you want to use tuntap.
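In the compose file above, that is the line:

            DHCP: "Y"

which should become:

            DHCP: "N"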

@Stan-Gobien

Stan-Gobien commented Jan 11, 2024

Okay, so the only drawback is no DHCP-assigned address then?
And I have to specify an address in the compose like so?

     networks:
          macvlan1:
            ipv4_address: 192.168.1.180

Edit: I did a test without the ipv4_address line and it does assign a random address in the range.
But in DSM itself you see a NAT IP and not the real one.
So that's the drawback, I guess.

@kroese
Collaborator

kroese commented Jan 11, 2024

Yes, you need to specify an address like that.

The drawback is that you don't have DHCP and that the VM will have its traffic tunneled between an internal address (20.20.20.21) and the outside world. So DSM will not know its "real" address, but that should not be a problem for the average user.

Also you can remove this

        ports:
            - 5000:5000
            - 5001:5001

section from your compose file, because it's not needed when you use macvlan.

@Stan-Gobien

This works very well. Splendid project!

@frezeen

frezeen commented Feb 17, 2024

I followed the instructions and everything is OK, with one problem: Proxmox backups and Proxmox Backup Server don't work. I can't back up the CT while DSM is running.

Please, can someone confirm this or suggest a fix?

@toxic0berliner

For backups I sometimes had this with other LXC containers; my solution was to run backups in stop/start mode instead of snapshot mode. Recreating the LXC several months later somehow solved my issue...

I see people are getting it working inside LXC, so I was wondering if any of you were able to use mount points inside DSM. I was looking at virtual DSM to keep using Hyper Backup, but it seems to refuse mounted filesystems as the source of a backup. That leaves me only one option: pass the data from my PVE host to the LXC as a mount point and then hopefully pass it on to DSM, making it believe it is local storage. I fear that's going to be impossible, but I wanted to ask the LXC DSM specialists here...

@sergey-1976

I have a server without KVM support. I only use LXC. I get an error when starting the container:

root@DSM1:~# docker run -it --rm -p 5000:5000 --cap-add NET_ADMIN --device-cgroup-rule='c *:* rwm' --sysctl net.ipv4.ip_forward=1 --device /dev/net/tun --device /dev/vhost-net --stop-timeout 60 -v /vdsm/storage1:/storage -v /vdsm/storage2:/storage2 -e CPU_CORES=2 -e RAM_SIZE=4096M -e DISK_SIZE=16G -e DISK2_SIZE=100G -e DISK_FMT=qcow2 -e ALLOCATE=N vdsm/virtual-dsm:latest KVM=NO
❯ Starting Virtual DSM for Docker v7.01...
❯ For support visit https://github.com/vdsm/virtual-dsm

❯ ERROR: KVM acceleration not available (device file missing), this will cause a major loss of performance.
❯ ERROR: See the FAQ on how to enable it, or continue without KVM by setting KVM=N (not recommended).

@frezeen

frezeen commented Mar 29, 2024

try:

docker run -it --rm -p 5000:5000 --cap-add NET_ADMIN --device-cgroup-rule='c *:* rwm' --sysctl net.ipv4.ip_forward=1 --device /dev/net/tun --device /dev/vhost-net --stop-timeout 60 -v /vdsm/storage1:/storage -v /vdsm/storage2:/storage2 -e CPU_CORES=2 -e RAM_SIZE=4096M -e DISK_SIZE=16G -e DISK2_SIZE=100G -e DISK_FMT=qcow2 -e ALLOCATE=N -e KVM=NO vdsm/virtual-dsm:latest

@sergey-1976

Everything works inside the container. I can connect to [local address]:5000. I installed Plex and Emby; they install fine, but I can't connect to [local address]:8096/web/index.html (ERR_CONNECTION_REFUSED). I have enabled the Samba server, but on a Windows PC the address \\[local address] is not found. The Proxmox firewall is disabled. Is there anything else that needs to be done?

@levi2m levi2m mentioned this issue Jun 22, 2024