
Merge branch 'cgroupv2', and bump version to v0.8.0
Signed-off-by: Miao Wang <[email protected]>
shankerwangmiao committed Sep 2, 2021
2 parents 1804a31 + 98fcb62 commit c07aaff
Showing 18 changed files with 1,033 additions and 104 deletions.
179 changes: 173 additions & 6 deletions .github/workflows/tunasync.yml
@@ -9,10 +9,10 @@ jobs:
runs-on: ubuntu-latest
steps:

- name: Set up Go 1.13
- name: Set up Go 1.16
uses: actions/setup-go@v1
with:
go-version: 1.13
go-version: 1.16
id: go

- name: Check out code into the Go module directory
@@ -37,6 +37,11 @@ jobs:
test:
name: Test
runs-on: ubuntu-latest
services:
registry:
image: registry:2
ports:
- 5000:5000
steps:

- name: Setup test dependencies
@@ -48,22 +53,184 @@ jobs:
sudo cgcreate -a $USER -t $USER -g cpu:tunasync
sudo cgcreate -a $USER -t $USER -g memory:tunasync
- name: Set up Go 1.13
- name: Set up Go 1.16
uses: actions/setup-go@v1
with:
go-version: 1.13
go-version: 1.16
id: go

- name: Check out code into the Go module directory
uses: actions/checkout@v2

- name: Run Unit tests.
run: make test
run: |
go install github.com/wadey/gocovmerge@latest
TERM=xterm-256color make test
- name: Run Additional Unit tests.
run: |
make build-test-worker
sudo cgexec -g "*:/" bash -c "echo 0 > /sys/fs/cgroup/systemd/tasks; exec sudo -u $USER env USECURCGROUP=1 TERM=xterm-256color cgexec -g cpu,memory:tunasync ./worker.test -test.v=true -test.coverprofile profile2.cov -test.run TestCgroup"
touch /tmp/dummy_exec
chmod +x /tmp/dummy_exec
run_test_reexec (){
case="$1"
shift
argv0="$1"
shift
(TESTREEXEC="$case" TERM=xterm-256color exec -a "$argv0" ./worker.test -test.v=true -test.coverprofile "profile5_$case.cov" -test.run TestReexec -- "$@")
}
run_test_reexec 1 tunasync-exec __dummy__
run_test_reexec 2 tunasync-exec /tmp/dummy_exec
run_test_reexec 3 tunasync-exec /tmp/dummy_exec 3< <(echo -n "abrt")
run_test_reexec 4 tunasync-exec /tmp/dummy_exec 3< <(echo -n "cont")
run_test_reexec 5 tunasync-exec2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
with:
driver-opts: network=host
- name: Cache Docker layers
uses: actions/cache@v2
if: github.event_name == 'push'
with:
path: /tmp/.buildx-cache
key: ${{ runner.os }}-buildx-${{ github.sha }}
restore-keys: |
${{ runner.os }}-buildx-
- name: Cache Docker layers
uses: actions/cache@v2
if: github.event_name == 'pull_request'
with:
path: /tmp/.buildx-cache
key: ${{ runner.os }}-pr-${{ github.event.pull_request.head.user.login }}-buildx-${{ github.sha }}
restore-keys: |
${{ runner.os }}-pr-${{ github.event.pull_request.head.user.login }}-buildx-
${{ runner.os }}-buildx-
- name: Cache Docker layers
if: github.event_name != 'push' && github.event_name != 'pull_request'
run: |
echo "I do not know how to setup cache"
exit 1
- name: Prepare cache directory
run: |
mkdir -p /tmp/.buildx-cache
- name: Build Docker image for uml rootfs
uses: docker/build-push-action@v2
with:
context: .umlrootfs
file: .umlrootfs/Dockerfile
push: true
tags: localhost:5000/umlrootfs
cache-from: type=local,src=/tmp/.buildx-cache
cache-to: type=local,dest=/tmp/.buildx-cache

- name: Fetch and install uml package
run: |
sudo apt-get update
sudo apt-get install -y debian-archive-keyring
sudo ln -sf /usr/share/keyrings/debian-archive-keyring.gpg /etc/apt/trusted.gpg.d/
echo "deb http://deb.debian.org/debian buster main" | sudo tee /etc/apt/sources.list.d/buster.list
sudo apt-get update
apt-get download user-mode-linux/buster
sudo rm /etc/apt/sources.list.d/buster.list
sudo apt-get update
sudo mv user-mode-linux_*.deb /tmp/uml.deb
sudo apt-get install --no-install-recommends -y /tmp/uml.deb
sudo rm /tmp/uml.deb
sudo apt-get install --no-install-recommends -y rsh-redone-client
- name: Prepare uml environment
run: |
docker container create --name umlrootfs localhost:5000/umlrootfs
sudo mkdir -p umlrootfs
docker container export umlrootfs | sudo tar -xv -C umlrootfs
docker container rm umlrootfs
sudo cp -a --target-directory=umlrootfs/lib/ /usr/lib/uml/modules
/bin/echo -e "127.0.0.1 localhost\n254.255.255.1 host" | sudo tee umlrootfs/etc/hosts
sudo ip tuntap add dev umltap mode tap
sudo ip addr add 254.255.255.1/24 dev umltap
sudo ip link set umltap up
- name: Start Uml
run: |
start_uml () {
sudo bash -c 'linux root=/dev/root rootflags=/ rw rootfstype=hostfs mem=2G eth0=tuntap,umltap hostfs="$PWD/umlrootfs" con1=pts systemd.unified_cgroup_hierarchy=1 & pid=$!; echo "UMLINUX_PID=$pid" >> '"$GITHUB_ENV"
}
( start_uml )
started=0
for i in $(seq 1 60); do
if ping -c 1 -w 1 254.255.255.2; then
started=1
break
fi
done
if [ "$started" != "1" ]; then
echo "Failed to wait Umlinux online"
exit 1
fi
- name: Prepare Uml Environment
run: |
CUSER="$(id --user --name)"
CUID="$(id --user)"
CGID="$(id --group)"
sudo chroot umlrootfs bash --noprofile --norc -eo pipefail << EOF
groupadd --gid "${CGID?}" "${CUSER?}"
useradd --create-home --home-dir "/home/${CUSER}" --gid "${CGID?}" \
--uid "${CUID?}" --shell "\$(which bash)" "${CUSER?}"
EOF
ln ./worker.test "umlrootfs/home/${CUSER}/worker.test"
- name: Run Tests in Cgroupv2
run: |
CUSER="$(id --user --name)"
sudo rsh 254.255.255.2 bash --noprofile --norc -eo pipefail << EOF
cd "/home/${CUSER}"
mkdir -p /sys/fs/cgroup/tunasync
TERM=xterm-256color ./worker.test -test.v=true -test.coverprofile \
profile3.cov -test.run TestCgroup
rmdir /sys/fs/cgroup/tunasync
systemd-run --service-type=oneshot --uid="${CUSER}" --pipe --wait \
--property=Delegate=yes --setenv=USECURCGROUP=1 \
--setenv=TERM=xterm-256color --same-dir \
"\${PWD}/worker.test" -test.v=true -test.coverprofile \
profile4.cov -test.run TestCgroup
EOF
- name: Stop Uml
run: |
sudo rsh 254.255.255.2 systemctl poweroff
sleep 10
if [ -e "/proc/$UMLINUX_PID" ]; then
sleep 10
if [ -e "/proc/$UMLINUX_PID" ]; then
sudo kill -TERM "$UMLINUX_PID" || true
sleep 1
fi
fi
if [ -e "/proc/$UMLINUX_PID" ]; then
sleep 10
if [ -e "/proc/$UMLINUX_PID" ]; then
sudo kill -KILL "$UMLINUX_PID" || true
sleep 1
fi
fi
- name: Combine coverage files
run: |
CUSER="$(id --user --name)"
"${HOME}/go/bin/gocovmerge" profile.cov profile2.cov \
"umlrootfs/home/${CUSER}/profile3.cov" \
"umlrootfs/home/${CUSER}/profile4.cov" \
profile5_*.cov > profile-all.cov
- name: Convert coverage to lcov
uses: jandelgado/[email protected]
with:
infile: profile.cov
infile: profile-all.cov
outfile: coverage.lcov

- name: Coveralls
13 changes: 13 additions & 0 deletions .umlrootfs/Dockerfile
@@ -0,0 +1,13 @@
FROM debian:buster
RUN apt-get update && apt-get install -y systemd rsh-redone-server ifupdown sudo kmod
RUN echo "host" > /root/.rhosts && \
chmod 600 /root/.rhosts && \
/bin/echo -e "auto eth0\niface eth0 inet static\naddress 254.255.255.2/24" > /etc/network/interfaces.d/eth0 && \
sed -i '/pam_securetty/d' /etc/pam.d/rlogin && \
cp /usr/share/systemd/tmp.mount /etc/systemd/system && \
systemctl enable tmp.mount

RUN echo "deb http://deb.debian.org/debian experimental main" >> /etc/apt/sources.list && \
apt-get update && \
apt-get install -y make && \
apt-get install -y -t experimental cgroup-tools
5 changes: 4 additions & 1 deletion Makefile
@@ -19,4 +19,7 @@ $(BUILDBIN:%=build-$(ARCH)/%) : build-$(ARCH)/% : cmd/%
test:
go test -v -covermode=count -coverprofile=profile.cov ./...

.PHONY: all test $(BUILDBIN)
build-test-worker:
go test -c -covermode=count ./worker

.PHONY: all test $(BUILDBIN) build-test-worker
5 changes: 5 additions & 0 deletions cmd/tunasync/tunasync.go
@@ -12,6 +12,7 @@ import (
"github.com/pkg/profile"
"gopkg.in/op/go-logging.v1"
"github.com/urfave/cli"
"github.com/moby/moby/pkg/reexec"

tunasync "github.com/tuna/tunasync/internal"
"github.com/tuna/tunasync/manager"
@@ -109,6 +110,10 @@ func startWorker(c *cli.Context) error {

func main() {

if reexec.Init() {
return
}

cli.VersionPrinter = func(c *cli.Context) {
var builddate string
if buildstamp == "" {
141 changes: 141 additions & 0 deletions docs/cgroup.md
@@ -0,0 +1,141 @@
# About Tunasync and cgroup

Optionally, tunasync can be integrated with cgroups for better control over and tracking of the processes started by mirror jobs. Limiting the memory usage of a mirror job also requires cgroup support.

## How are cgroups utilized in tunasync?

If cgroup support is enabled globally, all mirror jobs, except those running in docker containers, are run in separate cgroups. If `mem_limit` is specified, it is applied to the job's cgroup. For jobs running in docker containers, `mem_limit` is applied via the `docker run` command.
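For the containerized case, the limit is simply handed to the container runtime; a minimal illustration (the image name and the 512 MiB value are made-up examples, not from this commit):

``` bash
# Illustration only: a mem_limit for a docker-based job maps onto docker's
# --memory flag; "example/mirror-job" is a placeholder image name.
docker run --memory 512m example/mirror-job
```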


## Tl;dr: What's the recommended configuration?

### If you are using v1 (legacy, hybrid) cgroup hierarchy:

`tunasync-worker.service`:

```
[Unit]
Description = TUNA mirrors sync worker
After=network.target
[Service]
Type=simple
User=tunasync
PermissionsStartOnly=true
ExecStartPre=/usr/bin/cgcreate -t tunasync -a tunasync -g memory:tunasync
ExecStart=/home/bin/tunasync worker -c /etc/tunasync/worker.conf --with-systemd
ExecReload=/bin/kill -SIGHUP $MAINPID
ExecStopPost=/usr/bin/cgdelete memory:tunasync
[Install]
WantedBy=multi-user.target
```

`worker.conf`:

``` toml
[cgroup]
enable = true
group = "tunasync"
```

### If you are using v2 (unified) cgroup hierarchy:

`tunasync-worker.service`:

```
[Unit]
Description = TUNA mirrors sync worker
After=network.target
[Service]
Type=simple
User=tunasync
ExecStart=/home/bin/tunasync worker -c /etc/tunasync/worker.conf --with-systemd
ExecReload=/bin/kill -SIGHUP $MAINPID
Delegate=yes
[Install]
WantedBy=multi-user.target
```

`worker.conf`:

``` toml
[cgroup]
enable = true
```


## Two versions of cgroups

For various reasons, the kernel provides two versions of cgroups, which are incompatible with each other. Most current Linux distributions adopt systemd as the init system, which relies on cgroups and is responsible for initializing them, so the cgroup version in use is mainly decided by systemd. Since systemd version 243, the "unified" cgroup hierarchy has been the default.

Tunasync can automatically detect which version of cgroup is in use and enable the corresponding operating interface, but because systemd behaves slightly differently in the two cases, different tunasync configurations are recommended.
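As a sketch of how such detection conventionally works (not necessarily tunasync's exact code): under the unified hierarchy, `/sys/fs/cgroup` itself is a `cgroup2` mount, whereas under v1 it is a tmpfs holding per-controller mounts.

``` bash
# Conventional v1/v2 detection, shown for illustration.
if [ "$(stat -fc %T /sys/fs/cgroup)" = "cgroup2fs" ]; then
    echo "cgroup v2 (unified)"
else
    echo "cgroup v1 (legacy or hybrid)"
fi
```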

## Two modes of group name discovery

Two modes of group name discovery are provided: implicit mode and manual mode.

### Manual Mode

In this mode, the administrator should:

1. manually create an empty cgroup (for the cgroup v2 unified hierarchy) or empty cgroups with the same name in certain controller subsystems (for the cgroup v1 hybrid hierarchy);
2. change the ownership of the cgroups to the user the tunasync worker runs as; and
3. specify the path in the configuration.

On start, tunasync will automatically detect which controllers are enabled (for v1) or enable the needed controllers (for v2).

Example 1:

``` bash
# suppose we have cgroup v1
sudo mkdir -p /sys/fs/cgroup/cpu/test/tunasync
sudo mkdir -p /sys/fs/cgroup/memory/test/tunasync
sudo chown -R tunasync:tunasync /sys/fs/cgroup/cpu/test/tunasync
sudo chown -R tunasync:tunasync /sys/fs/cgroup/memory/test/tunasync

# in worker.conf, we have group = "/test/tunasync" or "test/tunasync"
tunasync worker -c /path/to/worker.conf
```

In the above scenario, tunasync will detect that the enabled subsystem controllers are cpu and memory. When running a mirror job named `foo`, sub-cgroups will be created at both `/sys/fs/cgroup/cpu/test/tunasync/foo` and `/sys/fs/cgroup/memory/test/tunasync/foo`.
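Roughly, the per-job setup is equivalent to the following shell steps (a sketch under the Example 1 layout; the 512 MiB limit and `$JOB_PID` are illustrative, and tunasync performs this through its Go cgroup library rather than a shell):

``` bash
# Per-job cgroup setup for a job "foo" under cgroup v1 (illustrative).
mkdir /sys/fs/cgroup/cpu/test/tunasync/foo
mkdir /sys/fs/cgroup/memory/test/tunasync/foo
# apply the configured memory limit, e.g. 512 MiB
echo $((512 * 1024 * 1024)) > /sys/fs/cgroup/memory/test/tunasync/foo/memory.limit_in_bytes
# move the job process into both hierarchies
echo "$JOB_PID" > /sys/fs/cgroup/cpu/test/tunasync/foo/cgroup.procs
echo "$JOB_PID" > /sys/fs/cgroup/memory/test/tunasync/foo/cgroup.procs
```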

Example 2 (not recommended):

``` bash
# suppose we have cgroup v2
sudo mkdir -p /sys/fs/cgroup/test/tunasync
sudo chown -R tunasync:tunasync /sys/fs/cgroup/test/tunasync

# in worker.conf, we have group = "/test/tunasync" or "test/tunasync"
tunasync worker -c /path/to/worker.conf
```

In the above scenario, tunasync will directly use the cgroup `/sys/fs/cgroup/test/tunasync`. In most cases, however, this will not work: because tunasync is not running as root, it lacks the permission to move the processes it starts into the correct cgroup. When moving a process between groups, cgroup v2 requires the operating process to also have write permission on the common ancestor of the source and target groups. This example therefore only demonstrates the functionality, and you should avoid this setup.
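Concretely, it is the migration write itself that fails (a sketch; `$PID` is a placeholder for a job process started by the worker):

``` bash
# Run as the unprivileged tunasync user: this write fails with EACCES,
# because migrating $PID also requires write access on the common ancestor
# of the source and target cgroups (here effectively the cgroup root).
echo "$PID" > /sys/fs/cgroup/test/tunasync/cgroup.procs
```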

### Implicit mode

In this mode, tunasync uses the cgroup it is currently running in and creates sub-groups for jobs inside that group. Tunasync first creates a sub-group named `__worker` in that group and moves itself into `__worker`, to avoid leaving processes in non-leaf cgroups.
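On the v2 hierarchy, the idea is roughly the following (a shell sketch of the mechanism, not tunasync's actual code; the controller names and the job name are examples):

``` bash
# Implicit mode on cgroup v2, sketched in shell.
CUR="$(awk -F: '$1 == "0" {print $3}' /proc/self/cgroup)"  # e.g. /system.slice/tunasync-worker.service
mkdir "/sys/fs/cgroup${CUR}/__worker"
echo $$ > "/sys/fs/cgroup${CUR}/__worker/cgroup.procs"     # become a leaf
# with no processes left directly in $CUR, controllers can be enabled for children
echo "+cpu +memory" > "/sys/fs/cgroup${CUR}/cgroup.subtree_control"
mkdir "/sys/fs/cgroup${CUR}/foo"                           # per-job sub-group
```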

Usually, this mode is combined with the `Delegate=yes` option in the systemd service configuration of tunasync, which permits the service process to manage the cgroup the service is running in. For security reasons, systemd does not grant a non-root service write permission on its own cgroups under the v1 (legacy, hybrid) hierarchy, so this mode is mainly useful with the v2 hierarchy.
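To check that delegation is actually in effect (illustrative commands; the unit name follows the service file shown above, and the slice path assumes a standard system service):

``` bash
# With Delegate=yes and the v2 hierarchy, the service's cgroup subtree is
# owned by the service user, so the worker may manage it.
systemctl show tunasync-worker -p Delegate
ls -ld /sys/fs/cgroup/system.slice/tunasync-worker.service
```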


## Configuration

``` toml
[cgroup]
enable = true
base_path = "/sys/fs/cgroup"
group = "tunasync"
subsystem = "memory"
```

The definitions of the above options are:

* `enable`: `Bool`, specifies whether cgroup support is enabled. When it is disabled, `memory_limit` for non-docker jobs is ignored, and so are the following options.
* `group`: `String`, specifies the cgroup tunasync will use. When not provided, or provided as an empty string, cgroup discovery works in "implicit mode", i.e. sub-cgroups are created in the cgroup tunasync is currently running in. Otherwise, discovery works in "manual mode", and tunasync creates sub-cgroups in the specified cgroup.
* `base_path`: `String`, ignored. It originally specified the mount path of the cgroup filesystem; to make everything work, the cgroup filesystem is now required to be mounted at its default path (`/sys/fs/cgroup`).
* `subsystem`: `String`, ignored. It originally specified which cgroup v1 controller was enabled and is now meaningless since discovery is automatic.

## References:

* <https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html>
* <https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/index.html>
* <https://systemd.io/CGROUP_DELEGATION/>
* <https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html#Delegate=>
